Re: [rust-dev] Optimization removes checks
Hello Kamlesh,

This mailing list is more or less dead; please consider asking your questions on https://users.rust-lang.org/.

Regards

On Thu, Oct 3, 2019 at 5:37 AM kamlesh kumar wrote:

> Why does optimization remove overflow checks? Consider the test case below:
>
> $ cat test.rs
> fn fibonacci(n: u32) -> u32 {
>     let mut f: u32 = 0;
>     let mut s: u32 = 1;
>     let mut next: u32 = f + s;
>     for _ in 1..n {
>         f = s;
>         s = next;
>         next = f + s;
>     }
>     next
> }
>
> fn main() {
>     println!("{}", fibonacci(100));
> }
>
> $ rustc test.rs -C opt-level=1
> $ ./test
> 2425370821
>
> $ rustc test.rs -C opt-level=0
> $ ./test
> thread 'main' panicked at 'attempt to add with overflow', p11.rs:11:7
> note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
>
> ./Kamlesh

___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Bare-metal Rust linking with C static library
Hello Eric,

Please note that the rust-dev list is (for better or worse) abandoned. You may ask questions on IRC (https://chat.mibbit.com/?server=irc.mozilla.org&channel=%23rust), the users forum (https://users.rust-lang.org/) or StackOverflow. You may also ask on Reddit (https://reddit.com/r/rust), although it is used more for announcements than for questions. Note that all of these community links are accessible directly from http://www.rust-lang.org/

Good luck with your project!

On Sat, May 30, 2015 at 8:49 PM, Eric Stutzenberger dynamicstabil...@gmail.com wrote:

I'm working on building out a Rust interface to the nRF51x series parts. I have a bare-metal system working quite well. The nRF51x has a Bluetooth stack (called SoftDevice). This stack requires the use of supervisor calls to request the stack to perform certain functions. My plan is to write a C library wrapper around these service calls, compile it with arm-none-eabi-gcc and then link it into my bare-metal Rust system. A large chunk of the work I have done thus far is based off of the STM32 example work done by Jorge Aparicio (https://github.com/japaric). Since I have the basics up and running, I am now trying to get a C static library, built with arm-none-eabi-gcc and archived with arm-none-eabi-ar, to link properly with my Rust code.
I have the following (very basic) .c file:

    uint32_t sum(uint32_t a, uint32_t b) {
        return a + b;
    }

I am compiling and archiving with the following commands:

    arm-none-eabi-gcc -Wall -mcpu=cortex-m0 -mthumb -fPIC --specs=nosys.specs -shared test.c -o test.o
    arm-none-eabi-ar -rs libtest.a test.o

In my Rust file:

    #[link(name = "test", kind = "static")]
    extern {
        pub fn sum(a: u32, b: u32) -> u32;
    }

I then invoke it as a test:

    pub fn main() {
        let test_sum = unsafe { sum(2, 3) };
    }

I am using a Makefile to drive the Rust compiler with some specific arguments, such as my target:

    # rustc target
    TARGET = thumbv6m-none-eabi
    # toolchain prefix
    TRIPLE = arm-none-eabi

    APP_DIR = src/app
    OUT_DIR = target/$(TARGET)/release
    DEPS_DIR = $(OUT_DIR)/deps

    BINS = $(OUT_DIR)/%.hex
    HEXS = $(OUT_DIR)/%.hex
    ELFS = $(OUT_DIR)/%.elf
    OBJECTS = $(OUT_DIR)/intermediate/%.o
    SOURCES = $(APP_DIR)/%.rs
    APPS = $(patsubst $(SOURCES),$(BINS),$(wildcard $(APP_DIR)/*.rs))

    RUSTC_FLAGS := -C lto -g $(RUSTC_FLAGS)

    # don't delete my elf files!
    .SECONDARY:

    all: rlibs $(APPS)

    clean:
            cargo clean

    # TODO $(APPS) should get recompiled when the `rlibs` change
    $(OBJECTS): $(SOURCES)
            mkdir -p $(dir $@)
            rustc \
                    $(RUSTC_FLAGS) \
                    --crate-type staticlib \
                    --emit obj \
                    --target $(TARGET) \
                    -L $(DEPS_DIR) \
                    -L ../sd110_lib \
                    --verbose \
                    -o $@ \
                    -ltest \
                    $<

    $(ELFS): $(OBJECTS)
            $(TRIPLE)-ld \
                    --gc-sections \
                    -T layout.ld \
                    -o $@ \
                    $<
            #size $@

    $(BINS): $(ELFS)
            $(TRIPLE)-objcopy \
                    -O ihex \
                    $< \
                    $@

    rlibs:
            cargo build --target $(TARGET) --verbose --release

The Cargo.toml is as follows:

    [package]
    name = "bmd200eval"
    version = "0.1.0"
    authors = ["Eric Stutzenberger eric.stutzenber...@rigado.com"]

    [dependencies.nrf51822]
    path = "../nrf51822.rs"

When I run make, I get the following output:

    .
    .
    .
    mkdir -p target/thumbv6m-none-eabi/release/intermediate/
    rustc \
        -C lto -g \
        --crate-type staticlib \
        --emit obj \
        --target thumbv6m-none-eabi \
        -L target/thumbv6m-none-eabi/release/deps \
        -L ../sd110_lib \
        --verbose \
        -o target/thumbv6m-none-eabi/release/intermediate/blink.o \
        -ltest \
        src/app/blink.rs
    src/app/blink.rs:42:9: 42:17 warning: unused variable: `test_sum`, #[warn(unused_variables)] on by default
    src/app/blink.rs:42     let test_sum = unsafe { sum(2, 3) };
                                ^~~~
    arm-none-eabi-ld \
        --gc-sections \
        -T layout.ld \
        -o target/thumbv6m-none-eabi/release/blink.elf \
        target/thumbv6m-none-eabi/release/intermediate/blink.o
    target/thumbv6m-none-eabi/release/intermediate/blink.o: In function `blink::main':
    git/rust-nrf/bmd200eval.rs/src/app/blink.rs:42: undefined reference to `sum'

I have found numerous references to linking Rust with C and calling C from Rust, but I haven't found a specific answer as to why this will not link. As you can see in the Makefile, I have tried to force rustc's hand in finding and linking against the library, but this doesn't seem to make a difference. Is there an issue with how I am building the library? Since rustc is generating a staticlib in this case, is there some different method that needs to be used?

Note that I am avoiding the Clang compiler for the moment due to the following: https://devzone.nordicsemi.com/question/29628/using-clang-and-the-s110-issues-with-supervisor-calls-to-the-softdevice/

Essentially, the gist of the above is that Clang is not quite producing the correct supervisor assembly code for calling into the Bluetooth stack,
Re: [rust-dev] is rust an 'oopl'?
into the situation where you want to have an instance of a type stored in more than one place? Well, you have two options. If the type supports Clone, you can call the clone method and produce a duplicate. The exact way clone works is very specific to the type: it might create a completely separate value, or the two might still be linked. Do not worry about this at the moment, as it will become evident as you learn Rust.

Just keep in mind that for non-copyable types, or types you do not want to copy, you can create a smart pointer to manage them:

    let pointer = Rc::new(myothervar);
    let secondhome = pointer.clone();
    myfunction(secondhome);

Also note that you will find the smart pointer clunky at first, and you may be confused about how to write libraries or design a good API for your application with it. So I would like to leave you with one more concept. You will find passing Rc<MyType> around cumbersome. To remedy this, you can learn the pattern of making MyType wrap the Rc, so the Rc is internal to it; your API then passes around MyType instead.

Okay, sorry for such a long mail. I just hope these little tips can help you instead of making you quit, leaving a bitter taste for Rust!

On Sun, Jan 11, 2015 at 7:17 AM, Mayuresh Kathe mayur...@kathe.in wrote:

hello matthieu, thanks for responding. you mentioned that rust supports some object-oriented concepts. may i know which? also, deviating a bit off-topic, would a decent grasp of functional programming be a pre-requisite to learning rust? thanks, ~mayuresh

On 2015-01-11 17:21, Matthieu Monrocq wrote:

Hello Mayuresh, The problem with your question is dual: - OO itself is a fairly overloaded term, and it is unclear what definition you use for it: Alan Kay's original? The presence of inheritance? ...
- Just because a language supports OO concepts does not mean that it ONLY supports OO concepts; many languages are multi-paradigm and can be used for procedural programming, object-oriented programming (in a loose sense, given the loose definition in practice), generic programming, functional programming, ...

Rust happens to be a multi-paradigm language. It supports some, but not all, object-oriented concepts, but it also thrives with free functions and generic functions, and supports functional programming expressiveness (though not purity concepts). I would also note that I have seen C code striving to achieve some OO concepts (opaque pointers for encapsulation, virtual dispatch through manually written virtual tables, ...), so even in C you cannot necessarily avoid the OO paradigm, depending on the libraries you use.

Is Rust a good language for you? Maybe! The only way for you to know is to give it a spin.

Have a nice day.

-- Matthieu

On Sun, Jan 11, 2015 at 2:59 AM, Mayuresh Kathe mayur...@kathe.in wrote:

hello, i am an absolute newbie to rust. is rust an object-oriented programming language? i ask because i detest 'oo', and am looking for something better than c. thanks, ~mayuresh
Re: [rust-dev] A question about implementation of str
str is simply a pair (length, pointer). The reason the length is passed as an argument even for a literal is that str does not ONLY work for literals (whose complete type is &'static str) but for any slice of characters, such as those produced by String::as_slice(), in which case the lifetime is different (the slice only lives as long as the particular String instance) and the length is not necessarily known at compile time.

On Wed, Dec 3, 2014 at 6:34 PM, C K Kashyap ckkash...@gmail.com wrote:

Hi, I am stuck in my kernel development where I find that I am not able to iterate over a str. The code is here - https://github.com/ckkashyap/unix/blob/master/kernel/uart.rs - in the function uart_putc. I find that the for-loop loops the right number of times, but it does not print the right character. To me it appears to be a linking problem with my kernel. However, to debug this issue I wanted to get a better understanding of what happens when we iterate over a str. I was surprised to see that the length of the string literal, which is determined at compile time, is being sent as an argument. I'd appreciate any insights into how I can debug this. Regards, Kashyap
Re: [rust-dev] Overflow when benchmarking
Hello,

To be clear: there is no such thing as stack/heap in C and C++; there are automatic variables and dynamically allocated variables, the former having their lifetime known statically and the latter not. Whether a particular compiler chooses to use the stack or the heap for either is its free choice, as long as it maintains the as-if rule. In this case, I have never heard of a compiler automatically moving an automatic variable to the heap; however, LLVM routinely uses the stack for dynamically allocated variables if it can prove their lifetime (probably restricted to fixed-size variables below a certain threshold).

Regarding Variable Length Arrays (C99): they are not valid in C++, and yes, they are traditionally implemented using alloca, for better or worse.

-- Matthieu

On Fri, Nov 28, 2014 at 4:40 AM, Manish Goregaokar manishsm...@gmail.com wrote:

C/C++ has a lot of features which seem tantalizing at first but end up being against the point of a systems language. Putting large arrays on the heap (not sure if C++ does this, but it sounds like something C++ would do) is one -- there are plenty of cases where you explicitly want stack-based arrays in systems programming. Another is the alloca-like behavior of dynamically sized stack-based arrays (I just learned about this recently). You always want to be clear about what the compiler is doing. Such optimizations can easily be implemented as a library :)

-Manish Goregaokar

On Thu, Nov 27, 2014 at 10:20 PM, Diggory Hardy li...@dhardy.name wrote:

Shouldn't the compiler automatically put large arrays on the heap? I thought this was a common thing to do beyond a certain memory size.

On Thursday 27 November 2014 04:28:03 Steven Fackler wrote:

The `nums` array is allocated on the stack and is 8 MB (assuming you're on a 64-bit platform).

On Wed Nov 26 2014 at 8:23:08 PM Ben Wilson benwilson...@gmail.com wrote:

Hey folks, I've started writing some Rust code lately and have run into weird behavior when benchmarking.
When running https://gist.github.com/benwilson512/56f84d4625f11feb

    #[bench]
    fn test_overflow(b: &mut Bencher) {
        let nums = [0i, ..100];
        b.iter(|| {
            let mut x = 0i;
            for i in range(0, nums.len()) {
                x = nums[i];
            }
        });
    }

I get "task 'main' has overflowed its stack" pretty much immediately when running cargo bench. Ordinarily I'd expect to see that error when doing recursion, but I can't quite figure out why it's showing up here. What am I missing? Thanks!

- Ben
Re: [rust-dev] Why there's this asymmetry in defining a generic type/function/method and calling it?
On Wed, Nov 19, 2014 at 2:42 PM, Daniel Trstenjak daniel.trsten...@gmail.com wrote:

Hi Paul,

On Tue, Nov 18, 2014 at 03:31:17PM -0500, Paul Stansifer wrote:

It's not so much the speed of the parser that is the matter, but the fragility of the grammar. The less lookahead that's required, the more likely it is that parser error messages will make sense, and the less likely that a future change to Rust's syntax will introduce an ambiguity.

Ok, that's absolutely reasonable. I'm wondering if the two could be made distinct by enforcing some properties which are already compile warnings: that types should always start with an upper-case letter and functions/methods with a lower-case one.

Note, the syntax also applies to functions; i.e. if you have `fn pow<T: Num>(n: T, e: uint) -> T`, then to qualify `T` you can use `pow::<int>(123, 4)`. Therefore using case would not solve the issue (not completely, at least).

    let foo = HashMap<Foo, Bar>::new();

But then 'HashMap' could still e.g. be an enum value instead of a type; currently you certainly also need some kind of context to distinguish cases like e.g. 'some(x)' and 'Some(x)'. Somehow I think it's a very good idea to enforce these properties, regardless of the issue here. If you've read code where everything starts with a lower case or upper case (even variables!), then you can really see the value of using case to distinguish types/functions/methods.

Greetings, Daniel
Re: [rust-dev] On the use of unsafe
It's completely unnecessary, actually. If a method requires an XSS-safe string, then it should take an XssSafeString parameter, which would implement Deref<String> and would be built from a String by a method performing the necessary escaping. If a method requires a SQL-safe string... ah no, don't do that; use bind parameters and you are guaranteed to be safe from SQL injection.

In each case, the attributes so defined can be perfectly replaced with appropriate types... so why not use types?

On Mon, Sep 22, 2014 at 4:50 AM, Manish Goregaokar manishsm...@gmail.com wrote:

That's not how Rust defines `unsafe`. It's open to misuse, and the compiler will happily point out that it's not being used correctly via the unnecessary-unsafe lint. If that's the case, do you think there's some worth in allowing the programmer to define arbitrary generic safety types? E.g. have an `#[unsafe(strings)]` attribute that can be placed on methods that break String guarantees (and placed on blocks where we wish to allow such calls), or `#[unsafe(sql)]` for SQL methods that are injection-prone. If something like this slide https://www.youtube.com/watch?feature=player_detailpage&v=jVoFws7rp88#t=1664 was ever implemented, methods that allow unsafe (XSS-prone) vulnerabilities could have `#[unsafe(xss)]`.

Rust does a bunch of compile-time checking to achieve memory safety. It also provides a syntax extension/lint system that allows programmers to define further compile-time checks, which opens up the gate for many more possible safety guarantees (instead of relying on separate static analysis tools), and not just memory safety. Perhaps we should start recognizing and leveraging that ability more :)

-Manish Goregaokar
Re: [rust-dev] Rust BigInt
On Fri, Sep 19, 2014 at 6:13 AM, Daniel Micay danielmi...@gmail.com wrote:

On 19/09/14 12:09 AM, Lee Wei Yen wrote:

Hi all! I've just started learning to use Rust now, and so far it's been everything I wanted in a language. I saw from the docs that the num::bigint::BigInt type has been deprecated - does it have a replacement? -- Lee Wei Yen

It was moved to https://github.com/rust-lang/num

There's also https://github.com/thestinger/rust-gmp which binds to GMP. GMP has better time complexity for the operations, significantly faster constant factors (10-20x for some operations) and more functionality. It also doesn't have lots of showstopper bugs, since it's a mature library.

A disclaimer for the unwary: GMP is a GPL library, so using it implies complying with the GPL license.
Re: [rust-dev] Dynamic format template
While not possible today, there is actually nothing preventing you to create a safe alternative (or even improving format so it works in this way). In a sense, a formatting function has two set of inputs: - the format itself, from which you extract a set of constraints (expected type-signature) - the arguments to format, which can be seen as a single tuple (provided type-signature) And as long as you can ensure at compile time that you never attempt to apply an expected type-signature to an incompatible provided type-signature, then you are safe. I would suppose that as far as having runtime formats go, you would need to introduce an intermediary step: the expected type-signature. You could have a Format object, generic over the expected type-signature, and a new constructor method taking a str and returning an OptionFormat Now, you have two phases: - the new constructor checks, at runtime, that the specified format matches the expected type-signature - the compiler checks, at compile-time, that the provided type-signature (arguments) match the expected type-signature (or it can be coerced to) It might require variadic generics and subtle massaging of the type system, however I do think it would be possible. It might not be the best way to attack the issue though. On Mon, Aug 25, 2014 at 1:33 AM, Kevin Ballard ke...@sb.org wrote: It’s technically possible, but horribly unsafe. The only thing that makes it safe to do normally is the syntax extension that implements `format!()` ensures all the types match. If you really think you need this, you can look at the implementation of core::fmt. But it’s certainly not appropriate for localization, or template engines. -Kevin Ballard On Aug 24, 2014, at 2:48 PM, Vadim Chugunov vadi...@gmail.com wrote: Hi, Is there any way to make Rust's fmt module to consume format template specified at runtime? This might be useful for localization of format!'ed strings, or, if one wants to use format! as a rudimentary template engine. 
Re: [rust-dev] Integer overflow, round -2147483648
I am not a fan of having wrap-around and non-wrap-around types, because whether you use wrap-around arithmetic or not is, in the end, an implementation detail, and having to switch types left and right whenever going from one mode to the other is going to be a lot of boilerplate.

Instead, why not take the same road as Swift: map +, -, * and / to non-wrap-around operators, and declare new (more verbose) operators for the rare cases where performance matters or wrap-around is the right semantics?

Even though Rust is a performance-conscious language (since it aims at displacing C and C++), the 80/20 rule still applies, and most Rust code should not require absolute speed; so let's make it convenient to write safe code and prevent newcomers from shooting themselves in the foot by providing safety by default, and for those who have profiled their applications or are writing hashing algorithms *also* provide the necessary escape hatches. This way we can have our cake and eat it too... or am I missing something?

-- Matthieu

On Sun, Jun 22, 2014 at 5:45 AM, comex com...@gmail.com wrote:

On Sat, Jun 21, 2014 at 7:10 PM, Daniel Micay danielmi...@gmail.com wrote:

Er... since when? Many single-byte opcodes in x86-64 corresponding to deprecated x86 instructions are currently undefined. http://ref.x86asm.net/coder64.html

I don't see enough gaps here for the necessary instructions.

You can see a significant number of invalid one-byte entries: 06, 07, 0e, 1e, 1f, etc. The simplest addition would just be to resurrect INTO and make it efficient - assuming signed 64- and 32-bit integers are good enough for most use cases. Alternatively, it could be two one-byte instructions, to add an unsigned version (perhaps a waste of precious slots), or a two-byte instruction which could perhaps allow trapping on any condition. Am I missing something?
Re: [rust-dev] Rust's documentation is about to drastically improve
On Wed, Jun 18, 2014 at 6:22 PM, Steve Klabnik st...@steveklabnik.com wrote:

"In case of trivial entities" -- The problem with this is that what's trivial to you isn't trivial to someone else.

"think about the amount of update this may make necessary in case Rust language syntax changes." -- Literally my job. ;) Luckily, the syntax has been pretty stable lately, and most changes have just been mechanical.

If you could, it would be awesome to invest in a check that the provided examples compile with the current release of the compiler (possibly as part of the documentation generation). This not only guarantees that the examples are up to date, but also helps in locating outdated examples. On the other hand, this may require more boilerplate to get self-contained examples (that can actually be compiled), so YMMV.

-- Matthieu
Re: [rust-dev] 7 high priority Rust libraries that need to be written
Could there be a risk in using JSR 310 as a basis, given the recent judgement of the Federal Circuit Court that APIs are copyrightable (in the Google vs Oracle fight over the Java API)?

-- Matthieu

On Sat, Jun 7, 2014 at 6:01 PM, Bardur Arantsson s...@scientician.net wrote:

On 2014-06-05 01:01, Brian Anderson wrote:

# Date/Time (https://github.com/mozilla/rust/issues/14657) Our time crate is very minimal, and the API looks dated. This is a hard problem and JodaTime seems to be well regarded, so let's just copy it.

JSR 310 has already been mentioned in the thread, but I didn't see anyone mention that it was accepted into the (relatively) recently finalized JDK 8: http://docs.oracle.com/javase/8/docs/api/java/time/package-summary.html The important thing to note is basically that it was simplified quite a lot relative to JodaTime, in particular by removing non-Gregorian chronologies.

Regards,
Re: [rust-dev] Patterns that'll never match
On Sun, Jun 1, 2014 at 1:04 PM, Tommi rusty.ga...@icloud.com wrote:

On 2014-06-01, at 13:48, Gábor Lehel glaebho...@gmail.com wrote:

It would be possible in theory to teach the compiler about e.g. the comparison operators on built-in integral types, which don't involve any user code. It would only be appropriate as a warning rather than an error, due to the inherent incompleteness of the analysis and the arbitrariness of what to include in it. No opinion about whether it would be worth doing.

Perhaps this kind of thing would be better suited to a separate tool that could (contrary to a compiler) run this and other kinds of heuristics without having to worry about blowing up compilation times.

This is typically the domain of either static analysis or runtime instrumentation (branch coverage tools) in the arbitrary case, indeed.

-- Matthieu
Re: [rust-dev] A better type system
FYI: I wrote an RFC for separating `mut` and `only` some time ago: https://github.com/rust-lang/rfcs/pull/78#

I invite interested readers to check it out and read the comments (notably those by thestinger, aka Daniel Micay on this list). For now, my understanding is that proposals on this topic are suspended until the dev team manages to clear its plate of several big projects (such as DST), especially as thestinger had a proposal to change the way lambda captures are modeled so they no longer require `uniq` (only accessible to the compiler).

-- Matthieu

On Sun, Jun 1, 2014 at 2:32 AM, Patrick Walton pwal...@mozilla.com wrote:

Yes, you could eliminate (c) by prohibiting taking references to the inside of sum types (really, any existential type). This is what Cyclone did. For (e) I'm thinking of sum types in which the two variants have different sizes (although maybe that doesn't work). We'd basically have to bring back the old &mut as a separate type of pointer to make it work. Note that Niko was considering a system like this in older blog posts, pre-INHTWAMA. (Search for "restrict pointers" on his blog.)

Patrick

On May 31, 2014 5:26:39 PM PDT, Cameron Zwarich zwar...@mozilla.com wrote:

FWIW, I think you could eliminate (c) by prohibiting mutation of sum types. What case are you thinking of for (e)? For (d), this would probably have to be distinguished from the current &mut somehow, to allow for truly unique access paths to sum types or shared data, so you could preserve any aliasing optimizations for the current &mut. Of course, more functions might take the less restrictive version, eliminating the optimization that way. Not that I think that this is a great idea; I'm just wondering whether there are any caveats that have escaped my mental model of the borrow checker.

Cameron

On May 31, 2014, at 5:01 PM, Patrick Walton pwal...@mozilla.com wrote:

I assume what you're trying to say is that we should allow multiple mutable references to pointer-free data.
(Note that, as Huon pointed out, this is not the same thing as the Copy bound.) That is potentially plausible, but (a) it adds more complexity to the borrow checker; (b) it's a fairly narrow use case, since it'd only be safe for pointer-free data; (c) it admits casts like 3u8 -> bool, casts to out-of-range enum values, denormal floats, and the like, all of which would have various annoying consequences; (d) it complicates or defeats optimizations based on pointer aliasing of &mut; (e) it allows uninitialized data to be read, introducing undefined behavior into the language. I don't think it's worth it.

Patrick

On May 31, 2014 4:42:10 PM PDT, Tommi rusty.ga...@icloud.com wrote:

On 2014-06-01, at 1:02, Patrick Walton pcwal...@mozilla.com wrote:

    fn my_transmute<T: Clone, U>(value: T, other: U) -> U {
        let mut x = Left(other);
        let y = match x {
            Left(ref mut y) => y,
            Right(_) => fail!()
        };
        *x = Right(value);
        (*y).clone()
    }

If `U` implements `Copy`, then I don't see a (memory-safety) issue here. And if `U` doesn't implement `Copy`, then it's the same situation as in the earlier example given by Matthieu, where there was an assignment to an `Option<Box<str>>` variable while a different reference pointing to that variable existed. The compiler shouldn't allow that assignment, just as in your example the compiler shouldn't allow the assignment `x = Right(value);` (after a separate reference pointing to the contents of `x` has been created) if `U` is not a `Copy` type. But, like I said in an earlier post, even though I don't see this (transmuting a `Copy` type in safe code) as a memory-safety issue, it is a code-correctness issue. So it's a compromise between preventing logic bugs (in safe code) and the convenience of more liberal mutation.

-- Sent from my Android phone with K-9 Mail. Please excuse my brevity.
Re: [rust-dev] A better type system
Iterator invalidation is a sweet example, one which strikes at the heart of every C++ developer (those who never ran into it, please raise your hands). However, it is just an example: any time you have aliasing + mutability, you may have either memory issues or logical bugs.

Another example of a memory issue:

    fn foo(left: &Option<Box<str>>, right: &mut Option<Box<str>>) {
        let ptr: &str = &*left.unwrap();
        *right = None;
        match ptr.len() {
            // Watch out! If left and right alias, then ptr is now a dangling reference!
            // ...
        }
    }

The issue can actually occur in other ways: replace Box<str> by

    enum Point { Integral(int, int), Floating(f64, f64) }

and you could manage to write integrals into floats or vice versa, which is memory corruption, not a segmentation fault.

The Rust type system allows, at the moment, ensuring that you never have both aliasing and mutability: mostly at compile time, and at run time through a couple of unsafe hatches (Cell, RefCell, Mutex, ...). I admit it is jarring, and constraining. However, the guarantee you get in exchange (memory safety + thread safety) is extremely important.

"I'm writing this from a phone and I haven't thought of this issue very thoroughly." Well, think a bit more. If you manage to produce a more refined type system, I'd love to hear about it. In the mean time, though, I advise caution in criticizing the existing one: it has the incredible advantage of working.

On Sat, May 31, 2014 at 7:54 PM, Alex Crichton acrich...@mozilla.com wrote:

"Sorry for the brevity, I'm writing this from a phone and I haven't thought of this issue very thoroughly."

You appear to dislike one of the most fundamental features of Rust, so I would encourage you to think through ideas such as this before hastily posting to the mailing list. The current iteration of Rust has had a great deal of thought and design poured into it, as well as having at least thousands of man-hours of effort put behind it.
Casually stating, with little prior thought, that large chunks of this effort are flatly wrong is disrespectful to those who have put so much time and effort into the project. We always welcome and encourage thoughtful reconsideration of the design decisions of Rust, but it must be performed in a constructive and well-thought-out manner. There have been many times in the past where the design decisions of Rust were reversed or redone, but these were always accompanied by a large amount of research to fuel the changes. If you have concrete suggestions, we have an RFC process in place for proposing changes to the language while gathering feedback at the same time.
Re: [rust-dev] A few random questions
On Fri, May 30, 2014 at 2:01 AM, Oleg Eterevsky o...@eterevsky.com wrote: Since browsers were brought up, here is the Google C++ style guide on exceptions: http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Exceptions As someone who works for Google, I can attest that exceptions are encouraged in the Google style guides for Python and Java, and the main reason they are forbidden in C++ is memory safety. Google has a large amount of pre-exceptions C++ code, and it would break in unexpected places if exceptions were allowed. Yes, which is a common issue. Exception usage requires exception-safe code. But then, exception-safe code is also code resilient in the face of introducing other return paths, so it's just overall better, whether in the presence of exceptions or not... Go is a different story. It deliberately refuses to support exceptions even though it has GC and hence has no problems with exception memory safety whatsoever. The lack of exceptions might be one of the main reasons (if not the main reason) why Go is not so popular even within Google. Personally, I've found exceptions too unwieldy. As I mentioned, the issue after catching an exception is "now, how do I recover?". Note that Rust and Go do have exceptions (and unwinding), it's just that you have to create a dedicated task instead of a try/catch block. Indeed, it's more verbose (which is mostly a matter of libraries/macros) and it's also less efficient (which could be addressed, though at compiler level); however it's just plain safer: now that shared state/updates to the external world are explicit, you can much more easily evaluate what it takes to recover. On Thu, May 29, 2014 at 4:39 PM, comex com...@gmail.com wrote: On Thu, May 29, 2014 at 7:10 PM, Oleg Eterevsky o...@eterevsky.com wrote: The projects in C++ that forbid exceptions are doing so not because of some prejudice, but because exceptions in C++ are unsafe. In the Java standard library, exceptions are ubiquitous.
If you mean checked exceptions, I hear that they're quite unpopular, although I don't use Java. Since browsers were brought up, here is the Google C++ style guide on exceptions: http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Exceptions It bans them due to a variety of downsides which would only be partially addressed by checked-exception-like safety systems. I think Google Java code does use exceptions, but that's language culture for you. As a related data point, Go eschews exceptions entirely due to prejudice: http://golang.org/doc/faq#exceptions Not that I agree with most of Go's design decisions... still, I think these examples are enough to demonstrate that there are legitimate reasons to prefer a language designed without exceptions. I think it may be good for you to get more experience with Rust, although as I mentioned, I also lack experience.
Re: [rust-dev] How to find Unicode string length in rustlang
Except that in C++ std::basic_string::size and std::basic_string::length are synonymous (both return the number of CharT units, which in std::string is also the number of bytes). Thus I am unsure whether this would end up helping C++ developers. Might help others though. On Fri, May 30, 2014 at 2:12 PM, Nathan Myers n...@cantrip.org wrote: A good name would be size(). That would avoid any confusion over various length definitions, and just indicate how much address space it occupies. Nathan Myers On May 29, 2014 8:11:47 PM Palmer Cox palmer...@gmail.com wrote: Thinking about it more, units() is a bad name. I think a renaming could make sense, but only if something better than len() can be found. -Palmer Cox On Thu, May 29, 2014 at 10:55 PM, Palmer Cox palmer...@gmail.com wrote: What about renaming len() to units()? I don't see len() as a problem, but maybe as a potential source of confusion. I also strongly believe that no one reads documentation if they *think* they understand what the code is doing. Different people will see len(), assume that it does whatever they want to do at the moment, and for a significant portion of the strings they encounter it will seem like their interpretation, whatever it is, is correct. So, why not rename len() to something like units()? It's more explicit about the value it's actually producing than len(), and it's not all that much longer to type. As stated, exactly what a string is varies greatly between languages, so I don't think that lacking a function named len() is bad. Granted, I would expect that many people expect a string to have a method named len() (or length()), and when they don't find one, they will go to the documentation and find units(). I think this is a good thing, since the documentation can then explain exactly what it does. I much prefer len() to byte_len(), though.
byte_len() seems like a bit much to type, and it seems like all the other methods on strings would then have to be renamed with the byte_ prefix, which seems unpleasant. -Palmer Cox On Thu, May 29, 2014 at 3:39 AM, Masklinn maskl...@masklinn.net wrote: On 2014-05-29, at 08:37, Aravinda VK hallimanearav...@gmail.com wrote: I think returning the length of the string in bytes is just fine. Since I didn't know about the availability of char_len in Rust, this caused my confusion. Python 2.7 returns the length of a string in bytes, Python 3 returns the number of codepoints. Nope, it depends on the string type *and* on compilation options.

* Python 2's `str` and Python 3's `bytes` are byte sequences; their len() returns their byte counts.
* Python 2's `unicode` and Python 3's `str` before 3.3 return a code unit count, which may be UCS2 or UCS4 (depending on whether the interpreter was compiled with `--enable-unicode=ucs2`, the default, or `--enable-unicode=ucs4`). Only the latter case is a true code point count.
* Python 3.3's `str` switched to the Flexible String Representation; the build-time option disappeared and len() always returns the number of codepoints.

Note that in no case do len() operations take normalisation or visual composition into account. JS returns number of codepoints. JS returns the number of UCS2 code units, which is twice the number of code points for those in astral planes.
Re: [rust-dev] EnumSet, CLike and enums
I advise you to check the tests accompanying EnumSet (in the source code): http://static.rust-lang.org/doc/master/src/collections/home/rustbuild/src/rust-buildbot/slave/nightly-linux/build/src/libcollections/enum_set.rs.html#144-158 They show a simple implementation:

    impl CLike for Foo {
        fn to_uint(&self) -> uint { *self as uint }
        fn from_uint(v: uint) -> Foo { unsafe { mem::transmute(v) } }
    }

which uses transmute to avoid that manual maintenance. Note though that in general, if you wanted to add new enum values while keeping them sorted alphabetically and still be backward-compatible, you would need to handle the values manually. On Fri, May 30, 2014 at 8:41 PM, Igor Bukanov i...@mir2.org wrote: Is it possible to somehow automatically derive collections::enum_set::CLike for an enum? The idea of writing

    impl CLike for MyEnum {
        fn to_uint(&self) -> uint { *self as uint }
        fn from_uint(n: uint) -> MyEnum {
            match n {
                0 => EnumConst1,
                ...
                _ => fail!("{} does not match any enum case", n)
            }
        }
    }

just to get a type-safe bit set EnumSet<MyEnum> is rather discouraging. On a related note, I see that EnumSet never checks that the CLike::to_uint result stays below the word size. Is it a bug?
Re: [rust-dev] cannot borrow `st` as mutable more than once at a time
Does this mean that the desugaring of the for loop is incorrect? Or at least, could be improved. On Thu, May 29, 2014 at 8:22 PM, Vladimir Matveev dpx.infin...@gmail.com wrote: Hi, Christophe, Won't wrapping the first `for` loop into curly braces help? I suspect this happens because of `for` loop desugaring, which kind of leaves the iterator created by `execute_query()` in scope (not really, but only for the borrow checker). 2014-05-29 19:38 GMT+04:00 Christophe Pedretti christophe.pedre...@gmail.com: Hello all, I know that this issue is already covered by issues #6393 and #9113, but actually I have no solution. My code is a library for accessing databases. In my example, the database represented by db contains a table t with columns i:integer, f:float, t:text, b:blob. Everything works fine except the following code used to test my library:

    match db.prepare_statement("SELECT i,f,t,b FROM t where t like ?;") {
        Ok(mut st) => {
            st.set_string(1, "%o%");
            for i in st.execute_query() {
                match i {
                    Ok(s) => println!("{}:{}:{}:{}", s.get_long(0), s.get_double(1), s.get_string(2), s.get_blob(3)),
                    Err(e) => match e.detail {
                        Some(s) => println!("{}", s),
                        None => ()
                    }
                }
            }
            st.set_string(1, "%e%");       // <- PROBLEM HERE
            for i in st.execute_query() {  // <- PROBLEM HERE
                // ...
            }
        },
        Err(e) => match e.detail {
            None => (),
            Some(s) => println!("{}", s)
        }
    }

The compilation error says:

    test-db.rs:71:8: 71:10 error: cannot borrow `st` as mutable more than once at a time
    test-db.rs:71   st.set_string(1, "%e%");
                    ^~
    test-db.rs:61:17: 61:19 note: previous borrow of `st` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `st` until the borrow ends
    test-db.rs:61   for i in st.execute_query() {
                             ^~
    test-db.rs:88:7: 88:7 note: previous borrow ends here
    test-db.rs:58   match db.prepare_statement("SELECT i,f,t,b FROM t where t like ?;") {
    ...
    test-db.rs:88   }
                    ^
    error: aborting due to previous error

Do we have a solution for #6393?
Thanks -- Christophe http://chris-pe.github.io/Rustic/
Re: [rust-dev] Something like generics, but with ints
It's been discussed, but there is still discussion on the best way to achieve this. At the moment, you should be able to get around it using Peano numbers [1]:

    struct Zero;
    struct Succ<T>;

    struct Matrix<T, M, N> {
        data: Vec<T>,
    }

    fn cofactor<T, M, N>(m: Matrix<T, Succ<M>, Succ<N>>, row: int, col: int) -> Matrix<T, M, N> {
        Matrix::<T, M, N> { data: vec!() }
    }

Of course, I would dread seeing the error message should you need more than a couple rows/columns... [1] http://www.haskell.org/haskellwiki/Peano_numbers On Sun, May 25, 2014 at 7:25 PM, Isak Andersson cont...@bitpuffin.com wrote: Hello! I was asking in IRC if something like this:

    fn cofactor<T, R, C>(m: Matrix<T, R, C>, row: int, col: int) -> Matrix<T, R-1, C-1> { ... }

was possible. I quickly got the response that generics don't work with integers. So my question is, is there any way to achieve something similar? Or would it be possible in the future to do generic instantiation based on more than just types. Thanks!
Re: [rust-dev] Qt5 Rust bindings and general C++ to Rust bindings feedback
On Sat, May 24, 2014 at 9:06 AM, Zoltán Tóth zo1...@gmail.com wrote: Alexander, your option 2 could be done automatically, by appending suffixes to the overloaded name depending on the parameter types, increasing the number of letters used until the ambiguity is fully resolved. What do you think?

    fillRect_RF_B       ( const QRectF &rectangle, const QBrush &brush )
    fillRect_I_I_I_I_BS ( int x, int y, int width, int height, Qt::BrushStyle style )
    fillRect_R_BS       ( const QRect &rectangle, Qt::BrushStyle style )
    fillRect_RF_BS      ( const QRectF &rectangle, Qt::BrushStyle style )
    fillRect_R_B        ( const QRect &rectangle, const QBrush &brush )
    fillRect_R_C        ( const QRect &rectangle, const QColor &color )
    fillRect_RF_C       ( const QRectF &rectangle, const QColor &color )
    fillRect_I_I_I_I_B  ( int x, int y, int width, int height, const QBrush &brush )
    fillRect_I_I_I_I_C  ( int x, int y, int width, int height, const QColor &color )
    fillRect_I_I_I_I_GC ( int x, int y, int width, int height, Qt::GlobalColor color )
    fillRect_R_GC       ( const QRect &rectangle, Qt::GlobalColor color )
    fillRect_RF_GC      ( const QRectF &rectangle, Qt::GlobalColor color )

I believe this alternative was considered in the original blog post Alexander wrote: this is, in essence, mangling. It makes for ugly function names, although the suffix helps in locating them, I guess. Before we talk about generation though, I would start by investigating where those overloads come from. First, there are two different objects being manipulated here:

+ QRect is a rectangle with integral coordinates
+ QRectF is a rectangle with floating point coordinates

Second, a QRect may already be built from (int x, int y, int width, int height); thus all overloads taking 4 ints instead of a QRect are pretty useless in a sense. Third, in a similar vein, QBrush can be built from (Qt::BrushStyle), (Qt::GlobalColor) or (const QColor &). So once again those overloads are pretty useless.
This leaves us with:

+ fillRect(const QRect &, const QBrush &)
+ fillRect(const QRectF &, const QBrush &)

Yep, that's it. Of all those inconsistent overloads (missing one taking 4 floats, by the way...) only 2 are ever useful. The other 10 can be safely discarded without impacting the expressiveness. Now, of course, the real question is how well a tool could perform this reduction step. I would note here that the positions and names of the coordinate arguments of fillRect are exactly those of the arguments to QRect's constructor; maybe a simple exhaustive search would thus suffice (though it does require semantic understanding of what a constructor and default arguments are). It would be interesting to check how many overloads remain *after* this reduction step. Here we got a factor of 6 already (it would have been 8 if the interface had been complete). It would also be interesting to check whether the int/float distinction often surfaces; there might be an opportunity there. -- Matthieu Alexander Tsvyashchenko wrote: So far I can imagine several possible answers: 1. We don't care, your legacy C++ libraries are bad and you should feel bad! - I think this stance would be bad for Rust and would hinder its adoption, but if that's the ultimate answer - I'd personally prefer it said loud and clear, so that at least nobody has any illusions. 2. Define and maintain the mapping between C++ and Rust function names (I assume this is what you're alluding to with "define meaningful unique function names" above?) While this might be possible for smaller libraries, this is out of the question for large libraries like Qt5 - at least I won't create and maintain this mapping for sure, and I doubt others will: just looking at the stats from 3 Qt5 libraries (QtCore, QtGui and QtWidgets) out of ~30 Qt libraries in total, of the 50745 wrapped methods 9601 were overloads and required renaming.
Besides that, this has the disadvantage of throwing away the majority of the experience people have with a particular library and forcing them to re-learn its API. On top of that, not for every overload is it easy to come up with short, meaningful, memorable and distinctive names - you can try that exercise for http://qt-project.org/doc/qt-4.8/qpainter.html#fillRect ;-) 3. Come up with some way to allow overloading / default parameters - possibly with a reduced feature set, i.e. if type inference is difficult in the presence of overloads, as suggested in some overloads discussions (although not unsolvable, as proven by other languages that allow both type inference and overloading?), possibly exclude overloads from type inference by annotating overloaded methods with special attributes? 4. Possibly some other options I'm missing? -- Good luck! Alexander
Re: [rust-dev] New on Rust/Servo
And let's not forget the ever-useful https://github.com/bvssvni/rust-empty to get a pre-made Makefile for Rust. On Sat, May 17, 2014 at 1:11 PM, Artella Coding artella.cod...@googlemail.com wrote: http://tomlee.co/2014/04/03/a-more-detailed-tour-of-the-rust-compiler/ On Sat, May 17, 2014 at 12:57 AM, Ricardo Brandão rbrandao...@gmail.com wrote: Hi all, I'd like to introduce myself. I'm a computer engineer; I worked with embedded computers for 12 years before working in IT management for 10 years. Now I became a Mozillian, studying Firefox OS, Gonk and Gecko, and I'm very excited to come back to the technical world. Last week I attended a lecture at FISL (a free software forum) in Brazil about Rust and Servo, from Bruno Abinader. I'm very interested in these projects and I'd like to join them. I have good experience with C and assembly, but not exactly with Unix-like platforms; I was used to programming directly on the board. I've used ZWorld boards (nowadays ZWorld became Digi). I tried to look at some easy bugs (on the rust and servo repos) to at least understand them, but I'm confused. Could you give me some step-by-step guidance on how to begin studying the project, which documents to read, etc.? Remember I'm not an expert on Makefiles and C for Unix-like platforms - well, I have worked with them, but in small projects. Thanks in advance! -- Ricardo Brandão http://www.programonauta.com.br
Re: [rust-dev] How to implement a singleton ?
Hello, My first instinct would be: don't... but in the name of science... Have you tried looking at Stack Overflow? Just googling around I found http://stackoverflow.com/questions/19605132/is-it-possible-to-use-global-variables-in-rust which allows you to have a global variable, and from there a singleton seems easy. I guess you will need something like Mutex<Option<Type>> if you want lazy initialization. -- Matthieu On Thu, May 15, 2014 at 9:59 AM, Christophe Pedretti christophe.pedre...@gmail.com wrote: I am trying to implement a singleton (an object which is instantiated only once, successive instantiations returning the object itself). Any tutorial for this? Any idea? Example? Best practice? Thanks
Re: [rust-dev] UTF-8 strings versus encoded ropes
On Wed, May 14, 2014 at 2:25 PM, Armin Ronacher armin.ronac...@active-4.com wrote: Hi, On 02/05/2014 00:03, John Downey wrote: I have actually always been a fan of how .NET did this. The System.String type is opinionated in how it is stored internally and does not allow anyone to change that (unlike Ruby). The conversion from String to byte[] is done using explicit conversion methods like: Unfortunately the .NET string type does not support UCS4 and as such is a nightmare to deal with. Also, because the internal encoding is not UTF-8, *any* interaction with the outside world (ignoring the win32 API) goes through an encode/decode step which can be unnecessary. For instance, if you did that on Linux you would decode from UTF-8 to your internal UCS4 encoding, then encode back to UTF-8 on the way back to the terminal. (Aside from that, 32 bits for a code point is too large, as Unicode does not go beyond 21 bits or so. Wasteful.) Even keeping whole bytes, 3 bytes (24 bits) is effectively sufficient for the whole of Unicode. If you don't mind some arithmetic, you could thus use a backing array of bytes and just recompose the value on output. Regards, Armin
Re: [rust-dev] Ideas to build Rust projects
I agree that a portable terminal would be sweet; however the terminal and shell are only half the story: you then need a uniform set of tools behind the scenes, or else all your scripts fail. I would like to take the opportunity to point out Mosh [1] as an existing (and recent) remote shell; it might make for a great starting point. Regarding project ideas, I myself would be very interested in:

- concurrent collections (lists, hash-sets, hash-maps, ...): while I know this is not in the spirit of CSP, sometimes forcing a single queue to access a collection creates a bottleneck.
- an MPMC queue: at the moment Rust stops at MPSC with its channels, and once again when the load is too important you really need to be able to have multiple consumers. It could potentially be tied into a WorkerPool implementation where you can freely administer the pool size and just post jobs to the pool, but maybe it could be implemented free-standing.

[1]: http://mosh.mit.edu/ On Sat, Apr 19, 2014 at 4:45 PM, Mahmut Bulut mahmutbul...@gmail.com wrote: A portable terminal is a good choice for all. But I want to say that I started to write util-linux in Rust. OK, there is coreutils, but we should extend it with perfect system integration. I don't have time to complete all of util-linux, but if contributions come it can be merged into coreutils. You can take a look at Trafo (a rewrite of util-linux): https://github.com/vertexclique/trafo Mahmut Bulut On 19 Apr 2014, at 12:36, John Mija jon...@proinbox.com wrote: Sometimes, developers need ideas or cool projects to be inspired by. Here are some; please share more. + Implementation of the Raft distributed consensus protocol. It would allow building distributed systems. Implementations in Go: https://github.com/goraft/raft https://github.com/hashicorp/raft + Key-value embedded database. LMDB was built as a backend for OpenLDAP, but it is being used in many projects.
The benchmarks (LevelDB, Kyoto TreeDB, LMDB, BerkeleyDB, SQLite3) show that it is faster for read operations, although it's somewhat slower than LevelDB for writing. http://symas.com/mdb/ There is a pure Go key/value store inspired by the LMDB project: https://github.com/boltdb/bolt + Portable terminal. Today, to access a terminal on Unix or Windows, you need to provide an interface. The great issue is that the Unix terminal and the Windows console have different APIs, so it's very hard to get a portable API for each system. Instead, a terminal could be created from scratch, handling everything at a low level (without using the Windows API).
Re: [rust-dev] Do I need to watch out for memory fragmentation?
On Mon, Apr 14, 2014 at 10:32 PM, Daniel Micay danielmi...@gmail.com wrote: On 14/04/14 12:41 PM, Matthieu Monrocq wrote: Memory fragmentation is a potential issue in all languages that do not use a compacting GC, so yes. It's much less of an issue than people make it out to be on 32-bit, and it's a non-issue on 64-bit with a good allocator (jemalloc, tcmalloc). Small dynamic memory allocations are tightly packed in arenas, with a very low upper bound on fragmentation and metadata overhead. At a certain cutoff point, allocations begin to fall through directly to mmap instead of using the arenas. On 64-bit, the address space is enormous, so fragmenting it is only a problem when it comes to causing TLB misses. By the way, do you have any idea how this is going to pan out on processors like the Mill CPU, where the address space is shared among processes? There are some attenuating circumstances in Rust, notably the fact that unless you use a ~ pointer the memory is allocated in a task-private heap which is entirely recycled at the death of the task, but memory fragmentation is always a potential issue. All dynamic memory allocations are currently done with the malloc family of functions, whether you use sendable types like `Vec<T>`, `Arc<T>` and `~T` or task-local types like `Rc<T>`. Using a task-local heap for types like `Rc<T>` would only serve to *increase* the level of fragmentation by splitting it up more. For example, jemalloc implements thread-local caching, and then distributes the remaining workload across a fixed number of arenas. Increasing the level of thread-local caching has a performance benefit but by definition increases the level of fragmentation due to more unused capacity assigned to specific threads.
Re: [rust-dev] Do I need to watch out for memory fragmentation?
Memory fragmentation is a potential issue in all languages that do not use a compacting GC, so yes. There are some attenuating circumstances in Rust, notably the fact that unless you use a ~ pointer the memory is allocated in a task-private heap which is entirely recycled at the death of the task, but memory fragmentation is always a potential issue. On Mon, Apr 14, 2014 at 6:19 PM, Zach Moazeni zach.li...@gmail.com wrote: Hello, I'm starting to explore Rust, and as someone who has primarily worked in GC'd languages I'm curious if I need to watch out for anything related to memory fragmentation. Or if Rust or LLVM is doing something under the covers where this is less of an issue. Kind regards, Zach
Re: [rust-dev] [discussion] preemptive scheduling
Hello, As far as I know, in Rust a thread (green or not) that enters an infinite loop without I/O is stuck forever. The only available option to stop it is to have the OS kill the process (Ctrl+C). In my day job, all our server services are time-bounded, and any that exceeds its time bound is killed. Doing so requires one process per service for the exact same reason as in Rust, which has the unfortunate effect of requiring a large memory footprint, because utility threads (such as timers, and notably the watch-dog timer) are replicated in each and every process. The most common sources of time-slips are disk accesses and database accesses, which are covered by Rust under I/O; however, I've already seen infinite loops (or very long ones) and there seems to be no way to protect against those. Of course one could recommend that such loops check a flag or something, but if we knew those loops were going to diverge we would fix them, not instrument them. I was hoping that with Rust (which already rids us of dangling pointers and data races) we could move toward a single process with a lot of concurrent (green) tasks for better efficiency and ease of development; however the latter seems unattainable because of infinite loops or otherwise diverging code right now. I would thus also appreciate it if anybody had an idea how to preempt a misbehaving task, even if the only option is to trigger that task's failure; the goal at this point is to salvage the system without losing the current workload. -- Matthieu On Sat, Apr 12, 2014 at 11:04 AM, Jeremy Ong jeremyc...@gmail.com wrote: I am considering authoring a webserver (think nginx, apache, cowboy, etc.) in Rust. From a user point of view, mapping (green) tasks to web requests makes the most sense, as the tasks could be long running, perform their own I/O, handle sessions, or what have you. It would also allow the user to do per-request in-memory caching. My main concern is obviously the cooperative scheduler.
Given that the mantra of Rust seems to be safety, I'm curious how feasible it would be to provide the option for task safety as well. Preemptive scheduling provides two things: 1. If preemption is used aggressively, the user can opt for a lower latency system (a la Erlang-style round-robin preemptive scheduling). 2. Preemption of any sort can be used as a safety net to isolate bugs or blocks in tasks for long-running systems, or at least mitigate damage until the developer intervenes. I noticed in issue 5731 [1] on the repo, it was pointed out that this was possible, albeit difficult. The issue was closed with a comment that the user should use OS threads instead. I really think this misses the point, as it no longer allows preemption at a finer granularity. Could any devs chime in on the scope and difficulty of this project? Could any users/devs chime in on any of the points above? tl;dr I think preemptive scheduling is a must for safe concurrency in long running executables at the bottom of the stack. Opinions? [1] https://github.com/mozilla/rust/issues/5731
Re: [rust-dev] Everything private by default
On Thu, Mar 27, 2014 at 8:12 PM, Tommi rusty.ga...@icloud.com wrote: [The following post has nothing to do with this thread. I'm posting it here because my new posts to this mailing list don't go through (this happens to me a lot). Replies to existing posts tend to go through, thus I'm hijacking my own thread.] Title: Compiling with no bounds checking for vectors? Why isn't there a compiler flag like 'noboundscheck' which would disable all bounds checking for vectors? It would make it easier to have those language performance benchmarks (which people are bound to make, with no bounds checking in C++ at least) be more apples-to-apples comparisons. Also, knowing there's a flag in case you need one would put performance-critical people's minds at ease. Because you can already have that functionality by using `unsafe`, so why should one add the same functionality twice in different ways? I believe optimizers should be good enough to remove most bounds checks (especially in loops), and if there are cases where they don't, it might be worth checking what's preventing the optimization.
Re: [rust-dev] Lightweight failure handling
On Thu, Mar 27, 2014 at 3:43 PM, Clark Gaebel cg.wowus...@gmail.com wrote: aside: Your last message didn't get CC'd to rust-dev. I've re-added them, and hope dearly I haven't committed a social faux pas. That's interesting. You're kinda looking for exception handling in Rust! Unfortunately the language seems pretty principled in its opinion that failure should be handled at the task boundary exclusively, and this is a pretty heavyweight opinion. This wouldn't be so bad if people would stop fail!ing everywhere! I'm personally very against the seemingly growing trend of people doing things like calling unwrap() on options instead of propagating errors up. This makes accidental failure far, far more common than it should be. I hope when higher-kinded types and unboxed closures land, people will start using a monadic interface to results and options, as this will hopefully make error propagation less painful. We'll see. As for your specific case, I don't really have an answer. Is "just don't call fail!" an option? Maybe an automatically-inferred #[will_not_fail] annotation has a place in the world... - Clark Actually, there is nothing in the task model that prevents tasks from being run immediately in the same OS thread, and on the same stack. It is just an implementation detail. Behavior-wise, the main difference between try/catch in Java and a task in Rust is that a task does not leave a half-corrupted environment when it exits (because everything it interacted with dies with it). Implementation-wise, there may be some hurdles to getting a contiguous task as cheap as a try/catch: unwind boundary, detecting that the task is viable for that optimization at the spawn point, etc... but I can think of nothing that is absolutely incompatible. I would be happy for a more knowledgeable person to chime in on this point. -- Matthieu On Thu, Mar 27, 2014 at 3:51 AM, Phil Dawes rustp...@phildawes.net wrote: Hi Clark, Thanks for the clarification.
To follow your example, there are multiple 'process_msg()' steps, and if one fails I don't want it to take down the whole loop. Cheers, Phil

On Wed, Mar 26, 2014 at 10:25 PM, Clark Gaebel cg.wowus...@gmail.com wrote: Sorry, was on my phone. Hopefully some sample code will better illustrate what I'm thinking:

    loop {
        let result: Result<Foo, ()> = task::try(proc() {
            loop {
                recv_msg();
                // begin latency sensitive part
                process_msg();
                send_msg();
                // end latency sensitive part
            }
        });
        if result.is_ok() {
            return result;
        } else {
            continue;
        }
    }

This way, you only pay for the try if you have a failure (which should hopefully be infrequent), and you get nice task isolation!

On Wed, Mar 26, 2014 at 6:05 PM, Clark Gaebel cg.wowus...@gmail.com wrote: The main loop of your latency sensitive application.

On Mar 26, 2014 5:56 PM, Phil Dawes rustp...@phildawes.net wrote: On Wed, Mar 26, 2014 at 9:44 PM, Clark Gaebel cg.wowus...@gmail.com wrote: Can't you put that outside your inner loop? Sorry Clark, you've lost me. Which inner loop?

-- Clark. Key ID : 0x78099922 Fingerprint: B292 493C 51AE F3AB D016 DD04 E5E3 C36F 5534 F907

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
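The `task::try` API discussed in this thread is long gone; in modern Rust, the closest analogue of catching a failure at a boundary inside a loop is `std::panic::catch_unwind`. A minimal sketch under that assumption (`process_msg` is an invented stand-in for the latency-sensitive step):

```rust
use std::panic::{self, AssertUnwindSafe};

// Hypothetical stand-in for the message-processing step above.
fn process_msg(n: u32) -> u32 {
    if n == 3 {
        panic!("bad message {}", n);
    }
    n * 2
}

fn main() {
    let mut results = Vec::new();
    for n in 0..5 {
        // Catch the panic at this boundary instead of letting it take
        // down the whole loop, much like task::try once did.
        match panic::catch_unwind(AssertUnwindSafe(|| process_msg(n))) {
            Ok(v) => results.push(v),
            Err(_) => continue, // skip the failed message, keep looping
        }
    }
    assert_eq!(results, vec![0, 2, 4, 8]);
}
```

As in the email, the cost is only paid on the failing iteration; the happy path runs without any isolation overhead.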
Re: [rust-dev] Bounds on type variables in structs, enums, types
On Tue, Mar 25, 2014 at 6:00 PM, Patrick Walton pcwal...@mozilla.com wrote:

On 3/24/14 11:46 PM, Nick Cameron wrote: Currently we forbid bounds on type parameters in structs, enums, and types. So the following is illegal:

    struct S<X: B> {
        f: ~T<X>,
    }

IIRC Haskell allows bounds on type parameters (and we did once too), but I heard that it is considered deprecated and not preferred. I don't recall the exact reasons, but that's why we removed the feature (and also just for language simplicity). Patrick

If I remember the reason cited in the Haskell design, it was that some functions require more bounds than others. For example a HashMap generally requires that the key be hashable somehow, but the isEmpty or size functions on a HashMap have no such requirement. Therefore, you would end up with minimal bounds specified at the type level, and then each function could add some more bounds depending on its needs: that's 2 places to specify bounds. In the name of simplicity (and maximum reusability of types) Haskell therefore advises only using bounds on functions. However, I seem to remember that in Haskell the bounds are only traits (type classes); whereas in Rust some bounds may actually be required to be able to instantiate the type (Sized?). -- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
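Today's Rust settled on the same convention described for Haskell: the struct declaration carries no bounds, and each `impl` block asks only for what it actually needs. A sketch with an invented `Pairs` container (`len` needs nothing from `K`; only the key-comparing operations require `PartialEq`):

```rust
// The struct itself carries no bounds on K.
struct Pairs<K, V> {
    items: Vec<(K, V)>,
}

// len() and new() work for any K at all.
impl<K, V> Pairs<K, V> {
    fn new() -> Self {
        Pairs { items: Vec::new() }
    }
    fn len(&self) -> usize {
        self.items.len()
    }
}

// Only operations that compare keys ask for PartialEq.
impl<K: PartialEq, V> Pairs<K, V> {
    fn insert(&mut self, k: K, v: V) {
        if let Some(pos) = self.items.iter().position(|(ek, _)| *ek == k) {
            self.items[pos].1 = v; // overwrite existing key
        } else {
            self.items.push((k, v));
        }
    }
    fn get(&self, k: &K) -> Option<&V> {
        self.items.iter().find(|(ek, _)| ek == k).map(|(_, v)| v)
    }
}

fn main() {
    let mut p = Pairs::new();
    p.insert("a", 1);
    p.insert("a", 2);
    p.insert("b", 3);
    assert_eq!(p.len(), 2);
    assert_eq!(p.get(&"a"), Some(&2));
}
```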
Re: [rust-dev] Structural Typing
I would note that Rust macros actually work with structural typing: the expanded macro cannot compile unless the expressions/statements it results in can be compiled.

Regarding Scala here, it seems a weird idea to ask that each and every method should copy+paste the interface. We all know the woes of duplication. Instead, you can define a trait (even if for a single function) and it'll just work; and when you add a second function you will be able to re-use the same trait.

On Sun, Mar 23, 2014 at 11:37 AM, Liigo Zhuang com.li...@gmail.com wrote: IMO, this is bad. On Mar 23, 2014, 6:34 PM, Ziad Hatahet hata...@gmail.com wrote: Hi all, Are there any plans to implement structural typing in Rust? Something like this Scala code: http://en.wikipedia.org/wiki/Duck_typing#In_Scala

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
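The trait-instead-of-structural-typing approach suggested above looks like this in today's Rust (`Duck` and `Robot` are invented illustration types, standing in for Scala's structural `{ def quack(): String }`):

```rust
// A one-method trait replaces the structural type.
trait Quacks {
    fn quack(&self) -> String;
}

struct Duck;
struct Robot;

impl Quacks for Duck {
    fn quack(&self) -> String {
        "quack".to_string()
    }
}

impl Quacks for Robot {
    fn quack(&self) -> String {
        "beep-quack".to_string()
    }
}

// Accepts anything implementing the trait, dispatched dynamically.
fn make_noise(q: &dyn Quacks) -> String {
    q.quack()
}

fn main() {
    assert_eq!(make_noise(&Duck), "quack");
    assert_eq!(make_noise(&Robot), "beep-quack");
}
```

Adding a second method later only means extending the trait, not every call site.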
Re: [rust-dev] Virtual fn is a bad idea
And of course I forgot to reply to the list at large... sorry :x -- Matthieu

On Wed, Mar 12, 2014 at 8:48 PM, Matthieu Monrocq matthieu.monr...@gmail.com wrote:

On Tue, Mar 11, 2014 at 10:18 PM, Patrick Walton pcwal...@mozilla.com wrote:

On 3/11/14 2:15 PM, Maciej Piechotka wrote: Could you elaborate on DOM? I saw it referred to a few times but I haven't seen any details. I wrote simple bindings to the libxml2 DOM (https://github.com/uzytkownik/xml-rs - warning - I wrote it while I was learning Rust) and I don't think there was a problem of OO - the main problem was mapping libxml's memory management and Rust's [I gave up on namespaces, but with a native Rust DOM implementation it would be possible to solve in a nicer way]. Of course - I might've been at too early a stage.

You need:
1. One-word pointers to each DOM node, not two. Every DOM node has 5 pointers inside (parent, first child, last child, next sibling, previous sibling). Using trait objects would use 10 words, not 5 words, and would constitute a large memory regression over current browser engines.
2. Access to fields common to every instance of a trait without virtual dispatch. Otherwise the browser will be at a significant performance disadvantage relative to other engines.
3. Downcasting and upcasting.
4. Inheritance with the prefix property, to allow for (2).

If anyone has alternative proposals that handle these constraints that are more orthogonal and are pleasant to use, then I'm happy to hear them. I'm just saying that dismissing the feature out of hand is not productive.
Patrick

Please excuse me, I need some kind of visualization here, so I concocted a simple tree:

    // So, in pseudo C++, let's imagine a DOM tree
    struct Element {
        Element *parent, *prevSib, *nextSib, *firstChild, *lastChild;
        uint leftPos, topPos, height, width;
        bool hidden;
    };
    struct Block : Element { BlockProperties blockP; };
    struct Div : Block {};
    struct Inline : Element { InlineProperties inlineP; };
    struct Span : Inline {};

Now, I'll be basically mimicking the way LLVM structures its AST, since the LLVM AST achieves dynamic casting without RTTI. Note that this has a very specific downside: the hierarchy is NOT extensible.

    // And now in Rust (excuse my poor syntax/errors)
    enum ElementChild<'r> { ChildBlock(&'r Block), ChildInline(&'r Inline) }
    struct Element {
        child: Option<&'self ElementChild<'self>>,
        parent: &'self Element,
        prevSib, nextSib, firstChild, lastChild: Option<&'self Element>,
        leftPos, topPos, height, width: uint,
        hidden: bool,
    }
    enum BlockChild<'r> { ChildDiv(&'r Div) }
    struct Block {
        elementBase: Element,
        child: Option<&'self BlockChild<'self>>,
        blockP: BlockProperties,
    }
    struct Div { blockBase: Block }
    enum InlineChild<'r> { ChildSpan(&'r Span) }
    struct Inline {
        elementBase: Element,
        child: Option<&'self InlineChild<'self>>,
        inlineP: InlineProperties,
    }
    struct Span { inlineBase: Inline }

Let us review our objectives:

(1) One word to each DOM element: check = Option<&'r Element>

(2) Direct access to a field, without indirection: check = span.inlineBase.elementBase.hidden

(3) Downcasting and upcasting: check = downcast is done by matching:

    match element.child {
        ChildBlock(&'r block) => /* act on block */,
        ChildInline(&'r inline) => /* act on inline */,
    }

upcast is just accessing the base field.

(4) Inheritance with the prefix property = not necessary, (2) is already satisfied.

Note on (3): multiple bases are allowed easily, it's one field per base. In order to reduce the footprint, avoiding having a child field at each level of the hierarchy might be beneficial.
In this case, only the final classes are considered in ElementChild:

    enum ElementChild<'r> { ChildDiv(&'r Div), ChildSpan(&'r Span) }

And then downcasting to &'r Block is achieved by:

    match element.final {
        ChildDiv(&'r div) => Some(&'r div.blockBase),
        _ => None,
    }

I would note that this does not make use of traits at all; the analysis is only based on Patrick's list of objectives, which I guess is incomplete, and I was lacking a realistic example so it might not address the full scope of the problem... ... still, for CLOSED hierarchies, the use of traits should not be necessary, although it might be very convenient. -- Matthieu.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
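For reference, the closed-hierarchy idea sketched above compiles in today's Rust. This is a simplified sketch, not the email's exact design: it uses owned children instead of borrowed references and trims the field sets, but it keeps the essential shape of a tag enum playing the role of the downcast:

```rust
// Root of the closed hierarchy: common fields live here, directly
// accessible without virtual dispatch (objective 2).
struct Element {
    hidden: bool,
    kind: Kind, // the tag: one word of discriminant, no vtable
}

// The closed set of "derived classes" (the hierarchy is NOT extensible).
enum Kind {
    Block(Block),
    Inline(Inline),
}

struct Block {
    width: u32,
}

struct Inline {
    font_size: u32,
}

impl Element {
    // Objective 3: downcasting is a match on the tag.
    fn as_block(&self) -> Option<&Block> {
        match &self.kind {
            Kind::Block(b) => Some(b),
            _ => None,
        }
    }
}

fn main() {
    let e = Element {
        hidden: false,
        kind: Kind::Block(Block { width: 42 }),
    };
    assert!(!e.hidden); // direct field access on the "base"
    assert_eq!(e.as_block().map(|b| b.width), Some(42));
}
```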
Re: [rust-dev] Virtual fn is a bad idea
Hi Eric, Coming back to memory: I presented two designs:

- in the first one, you have a tag at each level of the hierarchy, which indeed uses more memory for deep hierarchies but means that a type only knows about its immediate children
- in the second one, you have a tag only at the root of the hierarchy, which should use exactly as much memory as a v-table pointer (the fact there is no v-table does not matter)

Regarding the boilerplate methods, my experience with LLVM is that with virtual dispatch the root describes the interface and each descendant implements it, whereas in this system the root implements the interface for each and every descendant... This can be alleviated by only dispatching to the immediate descendants (and letting them dispatch further), which is more compatible with the memory-heavy design but also means multiple jumps at each call; not nice. However, once the interface is defined, user code should rarely have to go and inspect the hierarchy by itself; this kind of down-casting should be limited, as it is with regular inheritance in other languages. -- Matthieu

On Thu, Mar 13, 2014 at 7:49 PM, Eric Summers eric.summ...@me.com wrote: Thinking about this a bit more, maybe the memory cost could go away with tagged pointers. That is easier to do on a 64-bit platform though. Eric

On Mar 13, 2014, at 1:37 PM, Eric Summers eric.summ...@me.com wrote: Yes, but with tags you pay the cost even if the Option is None. Eric

On Mar 13, 2014, at 1:33 PM, Daniel Micay danielmi...@gmail.com wrote: On 13/03/14 02:25 PM, Eric Summers wrote: Also this approach uses more memory. At least a byte per pointer and maybe more with padding. In most cases like this you would prefer to use a vtable instead of tags to reduce the memory footprint. Eric A vtable uses memory too. Either it uses a fat pointer or adds at least one pointer to the object.
___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Opt-in builtin traits
I must admit I really like the *regularity* this brings to Rust. There is nothing more difficult to reason about than an irregular (even if reasonable) interface, simply because one must keep all the rules in mind at any time (oh, and sorry, there is a special condition described at page 364 that applies to this precise use case even though the spec sounds like it's a universal rule). Certainly, the annotation could be a burden, but #[deriving(Data)] is extremely terse and brings in almost anything a user could need for their type in one shot. Finally, I believe the public API stability this brings is very necessary. Too often incidental properties are relied upon and broken during updates without the author realizing it; when it's explicit, at least the library author makes a conscious choice. Maybe one way of preventing completely un-annotated pieces of data would be a lint that just checks that at least one property (Send, Freeze, ...) or a special annotation denoting their absence has been selected for each public-facing type. By making #[deriving(...)] mandatory, it becomes easier for the lint pass to flag un-marked types without even having to reason about whether or not the type would qualify. -- Matthieu

On Fri, Feb 28, 2014 at 4:51 PM, Niko Matsakis n...@alum.mit.edu wrote: From http://smallcultfollowing.com/babysteps/blog/2014/02/28/rust-rfc-opt-in-builtin-traits/ :

## Rust RFC: opt-in builtin traits

In today's Rust, there are a number of builtin traits (sometimes called kinds): `Send`, `Freeze`, `Share`, and `Pod` (in the future, perhaps `Sized`). These are expressed as traits, but they are quite unlike other traits in certain ways. One way is that they do not have any methods; instead, implementing a trait like `Freeze` indicates that the type has certain properties (defined below). The biggest difference, though, is that these traits are not implemented manually by users.
Instead, the compiler decides automatically whether or not a type implements them based on the contents of the type. In this proposal, I argue to change this system and instead have users manually implement the builtin traits for new types that they define. Naturally there would be `#[deriving]` options as well for convenience. The compiler's rules (e.g., that a sendable value cannot reach a non-sendable value) would still be enforced, but at the point where a builtin trait is explicitly implemented, rather than being automatically deduced. There are a couple of reasons to make this change:

1. **Consistency.** All other traits are opt-in, including very common traits like `Eq` and `Clone`. It is somewhat surprising that the builtin traits act differently.

2. **API Stability.** The builtin traits that are implemented by a type are really part of its public API, but unlike other similar things they are not declared. This means that seemingly innocent changes to the definition of a type can easily break downstream users. For example, imagine a type that changes from POD to non-POD -- suddenly, all references to instances of that type go from copies to moves. Similarly, a type that goes from sendable to non-sendable can no longer be used as a message. By opting in to being POD (or sendable, etc.), library authors make explicit what properties they expect to maintain, and which they do not.

3. **Pedagogy.** Many users find the distinction between pod types (which copy) and linear types (which move) to be surprising. Making pod-ness opt-in would help to ease this confusion.

4. **Safety and correctness.** In the presence of unsafe code, compiler inference is unsound, and it is unfortunate that users must remember to opt out from inapplicable kinds. There are also concerns about future compatibility. Even in safe code, it can also be useful to impose additional usage constraints beyond those strictly required for type soundness.
I will first cover the existing builtin traits and define what they are used for. I will then explain each of the above reasons in more detail. Finally, I'll give some syntax examples.

The builtin traits

We currently define the following builtin traits:

- `Send` -- a type that deeply owns all its contents. (Examples: `int`, `~int`, not `&int`)
- `Freeze` -- a type which is deeply immutable when accessed via an `&T` reference. (Examples: `int`, `~int`, `&int`, `&mut int`, not `Cell<int>` or `Atomic<int>`)
- `Pod` -- plain old data which can be safely copied via memcpy. (Examples: `int`, `&int`, not `~int` or `&mut int`)

We are in the process of adding an additional trait:

- `Share` -- a type which is threadsafe when accessed via an `&T` reference. (Examples: `int`, `~int`, `&int`, `&mut int`, `Atomic<int>`, not `Cell<int>`)

Proposed syntax

Under this proposal, for a struct or enum to be considered send, freeze, pod, etc., those traits must be explicitly
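This proposal is roughly what shipped: in today's Rust, `Send` and `Sync` (the descendant of `Share`) are auto traits that a type loses when it contains, for example, a raw pointer; opting back in is an explicit `unsafe impl`, and `Pod` became the opt-in `Copy`. A minimal sketch with an invented `Handle` type:

```rust
// A raw-pointer wrapper: raw pointers opt a type out of the auto
// traits, so Handle is neither Send nor Sync by default.
struct Handle {
    ptr: *mut u8,
}

// Opting back in is explicit and unsafe: the author asserts that
// moving a Handle to another thread is actually sound.
unsafe impl Send for Handle {}

// Compile-time check: this call only builds if T implements Send.
fn assert_send<T: Send>() {}

fn main() {
    assert_send::<Handle>(); // OK only because of the unsafe impl above
    // assert_send::<*mut u8>(); // would NOT compile: raw pointers aren't Send
    let h = Handle { ptr: std::ptr::null_mut() };
    assert!(h.ptr.is_null());
}
```

The `unsafe` keyword is exactly the "conscious choice" argued for earlier in the thread: the compiler still enforces the rules, but the author declares the property.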
Re: [rust-dev] Fwd: user input
On Sun, Feb 9, 2014 at 12:15 PM, Renato Lenzi rex...@gmail.com wrote: Always talking about read/write, i noticed another interesting thing:

    use std::io::buffered::BufferedReader;
    use std::io::stdin;

    fn main() {
        print!("Insert your name: ");
        let mut stdin = BufferedReader::new(stdin());
        let s1 = stdin.read_line().unwrap_or(~"nothing");
        print!("Welcome, {}", s1);
    }

when i run this simple code the output "Insert your name" doesn't appear on the screen... only after typing and entering a string does the whole output jump out... am i missing some flush (a la Fantom) or similar? I am using Rust 0.9 on W7.

Ah, that's interesting. In most languages, whenever you ask for user input (read on stdin) it automatically triggers a flush on stdout and stderr to avoid this uncomfortable situation. I suppose it would not be too difficult to incorporate this in Rust. -- Matthieu.

On Sun, Feb 9, 2014 at 2:40 AM, Patrick Walton pcwal...@mozilla.com wrote: On 2/8/14 3:35 PM, Alex Crichton wrote: We do indeed want to make common tasks like this fairly lightweight, but we also strive to require that the program handle possible error cases. Currently, the code you have shows well what one would expect when reading a line of input. On today's master, you might be able to shorten it slightly to:

    use std::io::{stdin, BufferedReader};

    fn main() {
        let mut stdin = BufferedReader::new(stdin());
        for line in stdin.lines() {
            println!("{}", line);
        }
    }

I'm curious though what you think are the heavy/verbose aspects of this? I like common patterns having shortcuts here and there! Is there any way we can get rid of the need to create a buffered reader? It feels too enterprisey. Patrick

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
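Modern Rust kept the behaviour observed above: `print!` does not flush stdout, and the fix is an explicit `flush()` before blocking on stdin. A sketch of the same program today (the `prompt` helper is invented here so the flushing step is testable in isolation):

```rust
use std::io::{self, BufRead, Write};

// Write the prompt and flush immediately, so it appears on screen
// before we block on stdin; print! alone does not flush.
fn prompt(out: &mut impl Write, msg: &str) -> io::Result<()> {
    out.write_all(msg.as_bytes())?;
    out.flush()
}

fn main() -> io::Result<()> {
    let mut stdout = io::stdout();
    prompt(&mut stdout, "Insert your name: ")?;

    let mut name = String::new();
    io::stdin().lock().read_line(&mut name)?;
    println!("Welcome, {}", name.trim());
    Ok(())
}
```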
Re: [rust-dev] Using Default Type Parameters
On Mon, Feb 3, 2014 at 8:41 AM, Gábor Lehel glaebho...@gmail.com wrote: On Mon, Feb 3, 2014 at 7:55 AM, Corey Richardson co...@octayn.net wrote: Default typarams are awesome, but they're gated, and there's some concern that they'll interact unpleasantly with extensions to the type system (most specifically, I've seen concern raised around HKT, where there is conflicting tension about whether to put the defaults at the start or end of the typaram list). Just for reference, this was discussed here: https://github.com/mozilla/rust/pull/11217 (The tension is essentially that with default type args you want to put the least important types at the end, so they can be defaulted, while with HKT you want to put them at the front, so they don't get in the way of abstracting over the important ones.)

Thinking out loud: could parameters be keyed, like named function arguments? If they were, then their position would matter little. -- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
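Named type parameters never happened, but default type parameters did ship, positional with the defaulted parameters trailing, for example `HashMap<K, V, S = RandomState>` in the standard library. A small sketch with an invented `Tagged` type:

```rust
// The last parameter carries a default; callers that don't care
// about it never have to mention it.
#[derive(Debug, PartialEq)]
struct Tagged<T, Tag = ()> {
    value: T,
    tag: Tag,
}

fn main() {
    // Tagged<i32> means Tagged<i32, ()> thanks to the default:
    let plain: Tagged<i32> = Tagged { value: 1, tag: () };
    // ...while other callers override it explicitly:
    let named: Tagged<i32, &str> = Tagged { value: 2, tag: "id" };
    assert_eq!(plain.value, 1);
    assert_eq!(named.tag, "id");
}
```

This is exactly the "least important types at the end" convention discussed in the thread.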
Re: [rust-dev] Proposal: Change Parametric Polymorphism Declaration Syntax
On Sun, Feb 2, 2014 at 6:08 PM, Benjamin Striegel ben.strie...@gmail.com wrote: After sleeping on it I'm not convinced that this would be a net improvement over our current situation. With a few caveats I'm really rather happy with the syntax as it is.

On Sun, Feb 2, 2014 at 8:55 AM, Jason Fager jfa...@gmail.com wrote: I'm not a huge fan of this proposal. It makes declarations longer, and it removes the visual consistency of Foo<T,U> everywhere, which I think introduces its own pedagogical issue. The recent addition of default type parameters, though, makes me think there's a reasonable change that increases consistency and shortens declarations in a few common cases. From what I understand, the reason we can't just have `impl Trait<T> for Foo<T,U>` is because it's ambiguous whether T and U are intended to be concrete or generic type names; i.e., `impl<T> Trait<T> for Foo<T,U>` tells the compiler that we expect U to be a concrete type name. Our new default type parameter declarations look like:

    struct Foo<T, U = Bar>

So what if, to actually make generic types concrete, we always used the '='?

    struct Foo<T, U = Bar>
    impl Trait<T> for Foo<T, U = Derp>

This saves a character over `impl<T> Trait<T> for Foo<T, Derp>`, solves the greppability problem, and makes intuitive sense given how defaults are declared. It also has a nice parallel with how ':' is used - ':' adds restrictions, '=' fully locks in place. So what is today something like

    impl<T: Ord> Trait<T> for Foo<T, Derp>

would become

    impl Trait<T: Ord> for Foo<T, U = Derp>

The rule would be that the first use of a type variable T would introduce its bounds, so for instance:

    impl Trait<T: Ord> for Foo<Z: Clone, U = Derp>

would be fine, and

    impl Trait<T> for Foo<T: Clone, U = Derp>

would be an error.
More nice fallout:

    struct Foo<A, B>
    impl Foo<A, B = Bar> {
        fn one(a: A) -> B
        fn two(a: A) -> B
        fn three(a: A) -> B
    }

means that if I ever want to go back and change the name of Bar, I only have to do it in one place, or if Bar is actually some complicated type, I only had to write it once, like a little local typedef. I'm sure this has some glaring obvious flaw I'm not thinking of. It would be nice to have less syntax for these declarations, but honestly I'm ok with how it is now.

On Sat, Feb 1, 2014 at 5:39 PM, Corey Richardson co...@octayn.net wrote: Hey all, bjz and I have worked out a nice proposal[0] for a slight syntax change, reproduced here. It is a breaking change to the syntax, but it is one that I think brings many benefits.

Summary
===

Change the following syntax:

```
struct Foo<T, U> { ... }
impl<T, U> Trait<T> for Foo<T, U> { ... }
fn foo<T, U>(...) { ... }
```

to:

```
forall<T, U> struct Foo { ... }
forall<T, U> impl Trait<T> for Foo<T, U> { ... }
forall<T, U> fn foo(...) { ... }
```

From a readability point of view, I am afraid this might be awkward though. Coming from C++, I have welcomed the switch from `typedef` to `using` (aliases) because of alignment issues; consider:

    typedef std::map<int, std::string> MapType;
    typedef std::vector<std::pair<int, std::string>> VectorType;

vs

    using MapType = std::map<int, std::string>;
    using VectorType = std::vector<std::pair<int, std::string>>;

In the latter, the entities being declared are at a constant offset from the left-hand margin, and close to it too; whereas in the former, the eyes are strained as they keep looking for what is declared. And now, let's look at your proposal:

    fn foo(a: int, b: int) -> int { }
    fn foo<T, U>(a: T, b: U) -> T { }
    forall<T, U> fn foo(a: T, b: U) -> T { }

See how forall causes a bump that forces you to start looking for where that name is? It was so smooth until then! So, it might be a net win in terms of grep-ability, but to be honest it seems LESS readable to me.
-- Matthieu

The Problem
===

The immediate, and most pragmatic, problem is that in today's Rust one cannot easily search for implementations of a trait. Why? `grep 'impl Clone'` is itself not sufficient, since many types have parametric polymorphism. Now I need to come up with some sort of regex that can handle this. An easy first attempt is `grep 'impl(<.*?>)? Clone'` but that is quite inconvenient to type and remember. (Here I ignore the issue of tooling, as I do not find the argument of "But a tool can do it!" valid in language design.) A deeper, more pedagogical problem, is the mismatch between how `struct Foo<...> { ... }` is read and how it is actually treated. The straightforward, left-to-right reading says "There is a struct Foo which, given the types ... has the members ...". This might lead one to believe that `Foo` is a single type, but it is not. `Foo<int>` (that is, type `Foo` instantiated with type `int`) is not the same type as `Foo<uint>` (that is, type `Foo` instantiated with type `uint`). Of course, with a small amount of experience or a very simple explanation, that becomes obvious. Something less obvious is the treatment of functions. What
Re: [rust-dev] What of semi-automated segmented stacks ?
On Thu, Jan 30, 2014 at 6:33 PM, Daniel Micay danielmi...@gmail.com wrote: On Thu, Jan 30, 2014 at 12:27 PM, Matthieu Monrocq matthieu.monr...@gmail.com wrote: Hello, Segmented stacks were ditched because of performance issues that were never fully resolved, especially when every opaque call (C, ...) required allocating a large stack up-front. Still, there are platforms (FreeBSD) with small stacks where the idea of segmented tasks could ease development... so what if we let the developer chip in?

Rust can and does choose the stack size itself. This can be exposed as an API feature too. I think it would be a good idea, to avoid platform defaults causing unexpected crashes. I know Clang regularly suffers on a number of tests because of this.

Still, this seems complementary. Whilst a large stack to begin with is an obvious option, there are always unfavorable cases. Today, to avoid stack issues, I have to move from a natural recursive style to a self-managed stack of actions and an endless loop, so my stack is actually on the heap. It's feasible, certainly, but it's a technical limitation getting in the way of my intent. And unfortunately, whilst I could allocate a 1GB stack to start with (the 64-bit world sure is fortunate), I have no way to foresee when I will need such a stack and when I will not. Dynamic adaptation makes things much easier. The idea of semi-automated segmented stacks would be:

- to expose to the user how many bytes worth of stack are remaining
- to let the user trigger a stack switch

This system should keep the penalty close to null for those who do not care, and be relatively orthogonal to the rest of the implementation:

If Rust isn't going to be using the segmented stack prelude (1-5% performance hit), it needs guard pages. This means the smallest stack segment size you can have with a free solution is 8K. It will consume less virtual memory than a fixed-size stack, but not more physical memory.
- the "how many bytes remaining" query carries little to no penalty: just a pointer subtraction between the current stack pointer and the end-of-stack pointer (which can be set once and for all at thread start-up)
- the stack switch is voluntary, and can include a prelude on the new stack that automatically comes back to its parent, so most code should not care; no penalty in regular execution (without it)
- I foresee some potential implementation difficulty for the unwinder; did it ever work on segmented stacks? Was it difficult/slow? Does performance of unwind matter that much?

Unwind performance doesn't matter, and is already really slow by design.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
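The recursion-to-heap workaround mentioned in this thread (a self-managed stack of actions plus a loop, so the "stack" lives on the heap) can be sketched concretely; the `Tree` type here is an invented example:

```rust
enum Tree {
    Leaf(u64),
    Node(Box<Tree>, Box<Tree>),
}

// Sum all leaves iteratively: the explicit Vec replaces the call
// stack, so arbitrarily deep trees cannot overflow the real stack.
fn sum(tree: &Tree) -> u64 {
    let mut stack = vec![tree]; // heap-allocated "stack of actions"
    let mut total = 0;
    while let Some(t) = stack.pop() {
        match t {
            Tree::Leaf(v) => total += *v,
            Tree::Node(l, r) => {
                stack.push(l.as_ref());
                stack.push(r.as_ref());
            }
        }
    }
    total
}

fn main() {
    let t = Tree::Node(
        Box::new(Tree::Leaf(1)),
        Box::new(Tree::Node(Box::new(Tree::Leaf(2)), Box::new(Tree::Leaf(3)))),
    );
    assert_eq!(sum(&t), 6);
}
```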
Re: [rust-dev] Today's Rust contribution ideas
On Mon, Jan 27, 2014 at 11:41 PM, Sebastian Sylvan sebastian.syl...@gmail.com wrote: On Mon, Jan 27, 2014 at 9:33 AM, Matthieu Monrocq matthieu.monr...@gmail.com wrote: On Mon, Jan 27, 2014 at 3:39 AM, Brian Anderson bander...@mozilla.com wrote: Consensus is that the `do` keyword is no longer pulling its weight. Remove all uses of it, then remove support from the compiler. This is a 1.0 issue.

# Experiment with faster hash maps (#11783)

Rust's HashMap uses a cryptographically secure hash, and at least partly as a result of that it is quite slow. HashMap continues to show up very, very high in performance profiles of a variety of code. It's not clear what the solution to this is, but it is clear that - at least sometimes - we need a much faster hash map solution. Figure out how to create faster hash maps in Rust, potentially sacrificing some amount of DoS-resistance by using weaker hash functions. This is fairly open-ended and researchy, but a solution to this could have a big impact on the performance of rustc and other projects.

You might be interested in a series of articles by Joaquín M López Muñoz, who maintains the Boost.MultiIndex library. He did a relatively comprehensive overview of the hash-map implementations of Dinkumware (MSVC), libstdc++ and libc++ on top of Boost.MultiIndex, and a lot of benchmarks showing the performance of insertion/removal/search in a variety of setups. One of the last articles: http://bannalia.blogspot.fr/2014/01/a-better-hash-table-clang.html

Let me also plug this blog post from a while back: http://sebastiansylvan.com/2013/05/08/robin-hood-hashing-should-be-your-default-hash-table-implementation/. There's also a followup on improving deletions*, which makes the final form the fastest hash map I know of.
It's also compact (95% load factor, 32 bits of overhead per element, but you can reduce that to 2 bits per element if you sacrifice some perf), and doesn't allocate (other than doubling the size of the table when you hit the load factor). For a benchmark with lots of std::strings it was 23%, 66% and 25% faster for insertions, deletions and lookups (compared to MSVC unordered_map); it also uses 30% less memory in that case. Seb

* the basic form has an issue where repeated deletes gradually increase the probe count. In pathological cases this can reduce performance by a lot. The fix is to incrementally fix up the table on each delete (you could also do it in batch every now and then). It's still faster in all cases, and the probe length as well as the probe-length variance remain low even in the most pathological circumstances.

Thanks for the link. I should have mentioned that the C++ Standard version is constrained by a memory-stability requirement which may or may not apply to Rust (thanks to borrow checks, it should be possible to know statically whether an element is borrowed or not). This memory-stability requirement, as well as some other requirements such as relative stability of items within the same equivalence class during insert/erase, severely constrain the design; and indeed if the requirements can be lifted, the designs proposed on bannalia will be suboptimal. -- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
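The Robin Hood scheme from the linked post can be sketched minimally; everything here is invented for illustration (fixed capacity, toy hash, no resize or deletion). The core rule: on a collision, whichever entry is further from its ideal slot keeps the slot ("rob the rich"), which keeps probe lengths short and uniform.

```rust
const CAP: usize = 16; // hypothetical fixed capacity, power of two

struct Table {
    slots: Vec<Option<(u64, u64)>>, // (key, value) pairs
}

// Toy hash, just for the sketch.
fn hash(k: u64) -> usize {
    (k.wrapping_mul(0x9E37_79B9_7F4A_7C15) >> 32) as usize % CAP
}

// How far `key` sitting in `slot` is from its ideal slot.
fn probe_dist(key: u64, slot: usize) -> usize {
    (slot + CAP - hash(key)) % CAP
}

impl Table {
    fn new() -> Self {
        Table { slots: vec![None; CAP] }
    }

    fn insert(&mut self, mut key: u64, mut val: u64) {
        let mut idx = hash(key);
        let mut dist = 0;
        loop {
            match self.slots[idx] {
                None => {
                    self.slots[idx] = Some((key, val));
                    return;
                }
                Some((k, _)) if k == key => {
                    self.slots[idx] = Some((key, val)); // overwrite
                    return;
                }
                Some((k, v)) => {
                    let existing = probe_dist(k, idx);
                    if existing < dist {
                        // The resident is "richer": take its slot and
                        // keep probing with the displaced entry.
                        self.slots[idx] = Some((key, val));
                        key = k;
                        val = v;
                        dist = existing;
                    }
                }
            }
            idx = (idx + 1) % CAP;
            dist += 1;
        }
    }

    fn get(&self, key: u64) -> Option<u64> {
        let mut idx = hash(key);
        let mut dist = 0;
        loop {
            match self.slots[idx] {
                None => return None,
                Some((k, v)) if k == key => return Some(v),
                Some((k, _)) => {
                    // An entry closer to home than our probe means
                    // the key cannot be further along: early exit.
                    if probe_dist(k, idx) < dist {
                        return None;
                    }
                }
            }
            idx = (idx + 1) % CAP;
            dist += 1;
        }
    }
}

fn main() {
    let mut t = Table::new();
    for k in 0..10 {
        t.insert(k, k * k);
    }
    assert_eq!(t.get(7), Some(49));
    assert_eq!(t.get(42), None);
}
```

The early-exit in `get` is the same probe-distance invariant that makes the deletion fix-up in the follow-up post work.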
Re: [rust-dev] Today's Rust contribution ideas
On Mon, Jan 27, 2014 at 3:39 AM, Brian Anderson bander...@mozilla.com wrote: People interested in Rust are often looking for ways to have a greater impact on its development, and while the issue tracker lists lots of stuff that one *could* work on, it's not always clear what one *should* work on. There is consistently an overwhelming number of very important tasks to do which nobody is tackling, so this is an effort to update folks on what high-impact, yet accessible, contribution opportunities are available. These are of varying difficulty, but progress on any of them is worthy of *extreme kudos*.

# Break up libextra (#8784)

Getting our library ecosystem in shape is critical for Rust 1.0. We want Rust to be a "batteries included" language, distributed with many crates for common uses, but the way our libraries are organized - everything divided between std and extra - has long been very unsatisfactory. libextra needs to be split up into a number of subject-specific crates, setting the precedent for future expansion of the standard libraries, and with the impending merging of #11787 the floodgates can be opened. This is simply a matter of identifying which modules in extra logically belong in their own libraries, extracting them to a directory in src/, and adding a minimal amount of boilerplate to the makefiles. Multiple people can work on this, coordinating on the issue tracker.

# Improve the official cheatsheet

We have the beginnings of a 'cheatsheet', documenting various common patterns in Rust code (http://static.rust-lang.org/doc/master/complement-cheatsheet.html), but there is so much more that could be here. This style of documentation is hugely useful for newcomers. There are a few ways to approach this: simply review the current document, editing and augmenting the existing examples; think of the questions you had about Rust when you started and add them; solicit questions (and answers!)
from the broader community and add them; finally, organize a doc sprint with several people to make some quick improvements over a few hours.

# Implement the `Share` kind (#11781)

Future concurrency code is going to need to reason about types that can be shared across threads. The canonical example is fork/join concurrency using a shared closure, where the closure environment is bounded by `Share`. We have the `Freeze` kind which covers a limited version of this use case, but it's not sufficient, and may end up completely supplanted by `Share`. This is quite important to have sorted out for 1.0 but the design is not done yet. Work with other developers to figure out the design; then once that's done the implementation - while involving a fair bit of compiler hacking and library modifications - should be relatively easy.

# Remove `do` (#10815)

Consensus is that the `do` keyword is no longer pulling its weight. Remove all uses of it, then remove support from the compiler. This is a 1.0 issue.

# Experiment with faster hash maps (#11783)

Rust's HashMap uses a cryptographically secure hash, and at least partly as a result of that it is quite slow. HashMap continues to show up very, very high in performance profiles of a variety of code. It's not clear what the solution to this is, but it is clear that - at least sometimes - we need a much faster hash map solution. Figure out how to create faster hash maps in Rust, potentially sacrificing some amount of DoS-resistance by using weaker hash functions. This is fairly open-ended and researchy, but a solution to this could have a big impact on the performance of rustc and other projects.

You might be interested in a series of articles by Joaquín M López Muñoz, who maintains the Boost.MultiIndex library.
He did a relatively comprehensive overview of the hash-map implementations of Dinkumware (MSVC), libstdc++ and libc++ on top of Boost.MultiIndex, and a lot of benchmarks showing the performance for insertion/removal/search in a variety of setups. One of the last articles: http://bannalia.blogspot.fr/2014/01/a-better-hash-table-clang.html # Replace 'extern mod' with 'extern crate' (#9880) Using 'extern mod' as the syntax for linking to another crate has long been a bit cringeworthy. The consensus here is to simply rename it to `extern crate`. This is a fairly easy change that involves adding `crate` as a keyword, modifying the parser to parse the new syntax, then changing all uses, either after a snapshot or using conditional compilation. This is a 1.0 issue. # Introduce a design FAQ to the official docs (#4047) Many questions about a language's design are asked repeatedly, so languages tend to have documents simply explaining the rationale for various decisions. Particularly as we approach 1.0 we'll want a place to point newcomers to when these questions are asked. The issue on the bug tracker already contains quite a lot of questions, and some answers as well. Add a new Markdown file to the doc/
Re: [rust-dev] Appeal for CORRECT, capable, future-proof math, pre-1.0
On Tue, Jan 14, 2014 at 5:56 AM, comex com...@gmail.com wrote: On Mon, Jan 13, 2014 at 4:06 PM, Tobias Müller trop...@bluewin.ch wrote: int<l1,u1> + int<l2,u2> = int<l1+l2,u1+u2> ... If the result does not fit into an int the compiler throws an error. To resolve an error, you can: - annotate the operands with appropriate bounds - use a bigger type for the operation and check the result. I remember wondering whether this type of solution would be feasible or too much of a hassle in practice. As I see it, many values which might be arithmetic operands are sizes or counts, and really ought to be size_t sized, and any mutable variable which is operated on in a loop can't be bounded without a lot more complexity, so it might lean toward the latter. It's indeed a risk that such an annotation might be too annoying (especially since addition is actually quite easy, the bounds grow faster on multiplication)... but on the other hand, you do need dynamic checks anyway to verify that a value of type u32<0, 4_294_967_295> won't overflow if you multiply it by 3. So as I see it, you can do either of: let result = to<u32<0, 1_431_655_765>>(size) * 3; OR let result = to<u32>(to<u64>(size) * 3);. Of course, compared to let result = size * 3; it seems the annotation tax is high; however, the latter may overflow (and wrap, certainly, but that is still a bogus answer in most languages). So, maybe one could just use a couple of primitives: - wrapping integers (for hashes) - saturating integers (useful for colors) - fail-on-overflow integers - compile-time range-checked integers u32w, u32s, u32o and u32c? Note: as far as I know Rust *plans* on having non-type template parameters but does not have them yet, so the compile-time range-checked integers are out of the question for now.
Note 2: having all those in the core language would be unnecessary if the syntax 3u32c (number + type) were sugar coating for u32c::new(3), like C++ suffix literals; with new using some default integer type (I vote for the fail-on-overflow, it catches the bugs) and the compiler verifying that the raw number can be expressed in that default integer type perfectly. Then libraries could add the other modes. -- Matthieu ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Unbounded channels: Good idea/bad idea?
On Tue, Dec 31, 2013 at 6:16 AM, Patrick Walton pcwal...@mozilla.com wrote: Can someone address Simon Marlow's point here? https://plus.google.com/10955911385859313/posts/FAmNTExSLtz unbuffered channels are synchronous in the sense that both reader and writer must be ready at the same time. It's easy to deadlock if you're not careful. Buffered channels allow asynchronous writes, but only up to the buffer size, so that doesn't actually make things easier. Fully asynchronous channels, like you get in Erlang and Haskell, don't have this problem, but they are unbounded so you have to be careful about filling them up (Erlang uses a clever scheduling trick to mitigate that problem, though). I am concerned that we are only hearing one side of the argument here, and Haskell folks seem to have come down fairly strongly in favor of unbounded channels. It also seems to me that the argument is partial, and only considers blocking sends. It would be interesting to know whether they envisaged non-blocking sends and, if so, why those seem to have been discarded. To reiterate: at this point I believe we should have both as first-class citizens, like `java.util.concurrent`. Choosing one or the other seems to be neglecting too many use cases. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Unbounded channels: Good idea/bad idea?
On Tue, Dec 31, 2013 at 6:46 PM, Patrick Walton pcwal...@mozilla.com wrote: On 12/30/13 8:46 PM, Christian Ohler wrote: To address the last sentence – bounded channels with default size 0 _do_ minimize the fallout of this design: The program would reliably deadlock every time it is tested with a nonzero number of images, since A will try to write to Images while B is blocked receiving from Done, not listening on Images yet. I don't see this deadlock as a nasty hazard – the code wouldn't work at all, and the programmer would immediately notice. If the programmer uses a non-zero buffer size for the channel, it's a magic number that they came up with, so they should know to test inputs around that magnitude. I suspect a lot of programmers in systems with bounded channels just come up with some round number (like 10) and forget about it. Similar to the argument to listen(2)... Patrick Anecdotal evidence: I work with distributed systems, and most of our limits are in fact completely winged and rarely if ever touched... except after an issue where we realize we could do better. This is the kind of thing where you don't have enough experience with the system as you first write it, so you put some reasonable limits, and then just forget that you needed to come back to it and check whether it really worked... but then, on the other hand, if it passes testing, doesn't that mean it works well enough? ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Unbounded channels: Good idea/bad idea?
On Thu, Dec 19, 2013 at 7:23 PM, Kevin Ballard ke...@sb.org wrote: Here's an example of where I use an infinite queue. I have an IRC bot, written in Go. The incoming network traffic of this bot is handled in one goroutine, which parses each line into its components, and enqueues the result on a channel. The channel is very deliberately made infinite (via a separate goroutine that stores the infinite buffer in a local slice). The reason it's infinite is because the bot needs to be resilient against the case where either the consumer unexpectedly blocks, or the network traffic spikes. The general assumption is that, under normal conditions, the consumer will always be able to keep up with the producer (as the producer is based on network traffic and not e.g. a tight CPU loop generating messages as fast as possible). Backpressure makes no sense here, as you cannot put backpressure on the network short of letting the socket buffer fill up, and letting the socket buffer fill up will cause the IRC network to disconnect you. So the overriding goal here is to prevent network disconnects, while assuming that the consumer will be able to catch up if it ever gets behind. This particular use case very explicitly wants a dynamically-sized infinite channel. I suppose an absurdly large channel would be acceptable, because if the consumer ever gets e.g. 100,000 lines behind then it's in trouble already, but I'd rather not have the memory overhead of a statically-allocated gigantic channel buffer. I feel the need to point out that the producer could locally queue the messages before sending over the channel if it were bounded. -Kevin On Dec 19, 2013, at 10:04 AM, Jason Fager jfa...@gmail.com wrote: Okay, parallelism, of course, and I'm sure others. Bad use of the word 'only'. The point is that if your consumers aren't keeping up with your producers, you're screwed anyways, and growing the queue indefinitely isn't a way to get around that.
Growing queues should only serve specific purposes and make it easy to apply back pressure when the assumptions behind those purposes go awry. On Thursday, December 19, 2013, Patrick Walton wrote: On 12/19/13 6:31 AM, Jason Fager wrote: I work on a system that handles 10s of billions of events per day, and we do a lot of queueing. Big +1 on having bounded queues. Unbounded in-memory queues aren't, they just have a bound you have no direct control over and that blows up the world when it's hit. The only reason to have a queue size greater than 1 is to handle spikes in the producer, short outages in the consumer, or a bit of out-of-phaseness between producers and consumers. Well, also parallelism. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Unbounded channels: Good idea/bad idea?
Also working in a distributed system, I cannot emphasize enough how essential back pressure is. With back pressure, you offer the producer a chance to react: it can decide to drop the message, send it over another channel, keep it around for later, etc. Furthermore, it is relatively easy to build an unbounded channel over a bounded one: just have the producer queue things. Depending on whether sequencing from multiple producers is important or not, this queue can be either shared or producer-local, with relative ease. Regarding the various behaviors that may be implemented, most behaviors can actually be implemented outside of the channel implementation: + dropping the message can be implemented on the producer side: if it cannot queue, it just goes on + crashing is similar: if it cannot queue, crash + blocking is generally a good idea, but if a timed-wait primitive exists then I imagine an infinite (or close enough) duration would be sufficient So it might be more interesting to reason in terms of primitives, and those might be more methods than types (hopefully): (1) immediate queueing (returning an error), a special case of time-bound queueing which may be slightly more efficient (2) time-bound queueing (returning an error after the timeout) (3) immediate + exchange with head (in which case the producer also locally acts as a consumer; this might be tricky to pull off efficiently on single-consumer queues) (4) immediate + atomic subscription to a "place has been freed" event in case of a full queue (Note: (4) somehow implies a dual channel; if you have an MPSC channel, a back-channel SPMC is created to dispatch the space-available notifications... which can be a simple counter, obviously; this back-channel must be select-able so that producers that usually block on other stuff can use a space-available event to unblock) I cannot see another interesting primitive at the moment.
-- Matthieu On Thu, Dec 19, 2013 at 7:25 PM, Matthieu Monrocq matthieu.monr...@gmail.com wrote: On Thu, Dec 19, 2013 at 7:23 PM, Kevin Ballard ke...@sb.org wrote: [quoted text snipped] ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Idea for versioned language specifications with automatic conversions
Hi Manuel, I must say that from a conceptual point of view I like the approach; keeping one's libraries up to date is the only way to go. However, I am afraid that you are glossing over certain details here: - you assume that the source code is available; this is a problem if I am using a 3rd-party library for which I only get the binary and THEY have not migrated yet. How can I use library X (released in 0.9 and 0.10) and library Y (released in 0.11 and 0.12) in the same project? Smaller milestones make it a smoother process to upgrade at the individual level, but larger milestones help multiple people/corporations coordinate. - you assume that I can actually upgrade; I work at a large software company, with over 5,000 employees now, and this applies to a *large* source code base. A migration entails an extensive test phase of the target software/version followed by a careful migration of a few pilot products, simply because migrating costs a lot and migrating to a flawed version just to roll back the migration is a cost sink. As a result, though, this creates inertia. Internally we are *always* in the middle of several migrations (compiler, 3rd-party libraries, in-house middleware, ...) and the larger ones take years. Because of this, once again we need some coordination: we just cannot afford to migrate every 6 months (not enough testing time). This means that while it would not prevent Rust from migrating every 6 months, we would still be expecting fixes to previous releases for a year or two. The former means that 6 months might be a little *too* fast a pace for industrial projects; the latter means that on top of defining a release schedule the Rust team will also have to provide a clear plan for support of older versions (how long, what kinds of bugs, ...) and the number of branches impacted may grow quickly: 6-month releases + 2 years of support means at least 4 branches, maybe 5 if we count the one being developed (and 2 years is nothing fancy, as support goes).
-- Matthieu On Sun, Nov 24, 2013 at 11:49 AM, Manuel ma.adam...@gmail.com wrote: I had the following idea to approach language evolution: Problem: Languages try to be backward compatible by stabilizing, and only slowly deprecating old features. This results in a language which does not evolve. Some different takes on this: C++: adds new features but does not fix problems, and often does not remove obsolete features, resulting in, well, C++. Python: minor versions which add new features, big version jump from 2 to 3 to make backward-incompatible changes. The resulting incompatibility was a big problem; almost 5 years after the release of 3.0 (December 3rd, 2008) people are still using 2.x. Rust seems to follow a similar approach; devs are already deferring features to 2.0 to stabilize. Other languages simply do not evolve at all and are replaced. My idea to improve this situation would be to add a version tag in every main crate, something like #ver 0.10. For each version jump the compiler would fix the code automatically, and convert it to the current language specification. When the library/code is multiple versions behind, the conversions could be applied successively. This can be done in a lot of cases; see Python's 2to3 script, and even Google did this for Go with the tool gofix during development. With this change, not-yet-updated libraries would still be usable in Rust. To simplify updating libraries the compiler could, on demand, print out a report of problematic parts and propose fixes. Some things cannot be fixed with an automatic approach; for these cases a classic deprecation mechanism or something else could still be used. Advantages: A kind of backward compatibility with old code bases. Rust can evolve and stay streamlined at the same time. The compiler does not have to deal with a deprecation mechanism, because you can remove and change things instantly.
Once this was in place, I think it would be best to release incompatible updates often, but with only a few changes each - every six months, for example. What do you think about this? Manuel ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Faster communication between tasks
On Sat, Nov 9, 2013 at 8:13 PM, Simon Ruggier simo...@gmail.com wrote: Hi all, I've tentatively come up with a design that would allow the sender to reallocate the buffer as necessary, with very little added performance cost. The sending side would bear the cost of reallocation, and there would be an extra test that receivers would have to make every time they process an item (no extra atomic operations needed). However, it may be a few weeks or more before I have a working implementation to demonstrate, so I figured it might be worthwhile to mention now that I'll be working on this. Also, I think it would be interesting to investigate doing something like the Linux kernel's deadlock detection[1], but generalized to apply to bounded queues, and implemented as a static check. I know little about this, but even so, I can see how it would be an enormous amount of work. On the other hand, I would have thought the same thing about the memory safety rules that Rust enforces. I'm hopeful that this will eventually be possible as well. [1] https://www.kernel.org/doc/Documentation/lockdep-design.txt A static proof seems extremely difficult; it would be a significant addition to the type system, affecting the closure types (did they, or did they not, embed a channel/port at creation?). In addition, I am unsure of how transfer of closures through channels would pan out. On the other hand, dynamic detection (such as is done on @ pointers for mutability) seems possible. -- Matthieu On Wed, Oct 30, 2013 at 12:55 AM, Simon Ruggier simo...@gmail.com wrote: On Tue, Oct 29, 2013 at 3:30 PM, Brian Anderson bander...@mozilla.com wrote: On 10/28/2013 10:02 PM, Simon Ruggier wrote: Greetings fellow Rustians! First of all, thanks for working on such a great language. I really like the clean syntax, increased safety, separation of data from function definitions, and freedom from having to declare duplicate method prototypes in header files.
I've been working on an alternate way to communicate between tasks in Rust, following the same approach as the LMAX Disruptor.[1] I'm hoping to eventually offer a superset of the functionality in the pipes API, and replace them as the default communication mechanism between tasks. Just as with concurrency in general, my main motivation in implementing this is to improve performance. For more information about the disruptor approach, there's a lot of information linked from their home page, in a variety of formats. This is really exciting work. Thanks for pursuing it. I've been interested in exploring something like Disruptor in Rust. The current channel types in Rust are indeed slow, and fixing them is the topic of https://github.com/mozilla/rust/issues/8568. I'll start paying attention to that. The Morrison Afek 2013 paper looks like something I should read. This is my first major contribution of new functionality to an open-source project, so I didn't want to discuss it in advance until I had a working system to demonstrate. I currently have a very basic proof of concept that achieves almost two orders of magnitude better performance than the pipes API. On my hardware[2], I currently see throughput of about 27 million items per second when synchronizing with a double-checked wait condition protocol between sender and receivers, 80+ million items with no blocking (i.e. busy waiting), and anywhere from 240,000 to 600,000 when using pipes. The LMAX Disruptor library gets up to 110 million items per second on the same hardware (using busy waiting and yielding), so there's definitely still room for significant improvement. Those are awesome results! Thanks! When I first brought it up, it was getting about 14 million with the busy waiting. Minimizing the number of atomic operations (even with relaxed memory ordering) makes a big difference in performance. 
The 2/3 drop in performance with the blocking wait strategy comes from merely doing a read-modify-write operation on every send (it currently uses atomic swap, I haven't experimented with others yet). To be fair, the only result I can take credit for is the blocking algorithm. The other ideas are straight from the original disruptor. I've put the code up on GitHub (I'm using rustc from master).[3] Currently, single and multi-stage pipelines of receivers are supported, while many features are missing, like multiple concurrent senders, multiple concurrent receivers, or mutation of the items as they pass through the pipeline. However, given what I have so far, now is probably the right time to start soliciting feedback and advice. I'm looking for review, suggestions/constructive criticism, and guidance about contributing this to the Rust codebase. I'm not deeply familiar with Disruptor, but I believe that it uses bounded queues. My general feeling thus far is that, as the general 'go-to' channel type, people should not be using bounded queues that block the
Re: [rust-dev] Stack management in SpiderMonkey or aborting on stack overflow could be OK.
I really like the idea of a task being a sandbox (if pure/no-unsafe Rust). It seems (relatively) easy for a task to keep count of the number of bytes it has allocated (or the number of blocks); both heap-allocated and stack-allocated blocks could be meshed together there (after all, both consume memory), and this single count would address both (1) and (3) at once. Regarding the intrinsic to extend the stack, it seems nice, and in fact generalizable. It looks to me like a coroutine in the same memory space, compared to a task being a coroutine in a different memory space. Maybe some unification is possible here? -- Matthieu On Wed, Oct 30, 2013 at 3:17 AM, Niko Matsakis n...@alum.mit.edu wrote: I certainly like the idea of exposing a low-stack check to the user so that they can do better recovery. I also like the idea of `call_with_new_stack`. I am not sure if this means that the default recovery should be *abort* vs *task failure* (which is already fairly drastic). But I guess it is a legitimate question: to what extent should we permit safe Rust code to bring a system to its knees? We can't truly execute untrusted code, since it could invoke native things or include unsafe blocks, but it'd be nice if we could give some guarantees as to the limits of what safe code can do. Put differently, it'd be nice if tasks could serve as an effective sandbox for *safe code*. It seems to me that the main ways that safe code can cause problems for a larger system are (1) allocating too much heap; (2) looping infinitely; and (3) over-recursing. But no doubt there are more. Maybe it doesn't make sense to address only one problem and not the others; on the other hand, we should not let the perfect be the enemy of the good, and perhaps we can find ways to address the others as well (e.g., hard limits on total memory a task can ever allocate; leveraging different OS threads for pre-emption and killing, etc).
Niko On Tue, Oct 29, 2013 at 11:51:10PM +0100, Igor Bukanov wrote: SpiderMonkey uses recursive algorithms in quite a few places. As the level of recursion is at the mercy of JS code, checking for stack exhaustion is a must. For that the code explicitly compares the address of a local variable with a limit set as a part of thread initialization. If the limit is breached, the code either reports failure to the caller (parser, interpreter, JITed code) or tries to recover using a different algorithm (marking phase of GC). This explicit strategy allowed us to achieve stack safety with relatively infrequent stack checks compared with the total number of function calls in the code. Granted, without static analysis this is fragile, as a missing stack check on a code path that is under control of JS could be potentially exploitable (this is C++ code after all), but it has been working. So I think aborting on stack overflow in Rust should be OK, as it removes the security implications of stack overflow bugs. However, it is then a must to provide facilities to check for a low stack. It would also be very useful to have an option to call code with a newly allocated stack of the given size without creating any extra thread etc. This would allow for a pattern like: fn part_of_recursive_parser ... { if stack_low() { call_with_new_stack(10*1024*1024, part_of_recursive_parser) } } Then a missing stack_low() becomes just a bug without security implications. ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Fwd: Faster communication between tasks
If I may suggest: rather than blocking the sender in case the channel is full, simply return an error (or raise a condition) immediately. This is both extremely simple (for the channel implementer) and heavily customizable (for the user). It certainly seems much easier than providing an extremely wide array of different channels as part of the core Rust distribution... and it actually makes it possible to build libraries for common cases (such as local queuing). -- Matthieu. On Wed, Oct 30, 2013 at 6:37 AM, Ben Kloosterman bkloo...@gmail.com wrote: Simon, one thing you may want to test is 10-20 senders to 1 receiver. Multiple senders have completely different behaviour and can create a lot of contention around locks / interlocked calls. Also check what happens to the CPU when the receiver blocks for 100ms disk accesses every 100ms. The Disruptor as used by LMAX normally uses very few senders/receivers and the main/busy threads do no IO. Ben On Wed, Oct 30, 2013 at 1:03 PM, Simon Ruggier simo...@gmail.com wrote: See my first message, I tested the throughput of the pipes API, it is far slower. Synchronization between sender and receiver depends on which wait strategy is used. There is a strategy that blocks indefinitely if no new items are sent. To see how it works, look at this comment: https://github.com/sruggier/rust-disruptor/blob/7cbc2fababa087d0bc116a8a739cbb759354388b/disruptor.rs#L762 Multiple senders are also on my roadmap. Some things just aren't testable, because the memory ordering guarantees depend on the hardware you're running on. For it to be truly correct and portable, the source code has to be simple enough for a reviewer to be able to verify correctness at compile time. The comment I link to above is a good example; I could never test that code thoroughly enough to be satisfied, so a proof of correctness is the only way.
___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] On Stack Safety
On Thu, Oct 24, 2013 at 4:18 PM, Benjamin Striegel ben.strie...@gmail.com wrote: you do compete with Go (4 kB initial stack segment) and Erlang (2.4 kB on 64 bit). Actually, goroutines have a default stack size of 8kb since 1.2. Also, applicable to this discussion, in 1.3 Go will be moving away from segmented stacks to contiguous growable stacks: https://docs.google.com/document/d/1wAaf1rYoM4S4gtnPh0zOlGzWtrZFQ5suE8qr2sD8uWQ/pub This is an interesting move; however, the pointer-into-the-stack problem looks really hard to solve. In Rust, for example, I can store a reference to a stack element in a vec, and it is indistinguishable (in the type system) from a pointer to an element not on the stack. Also, I was surprised at "When that call returns, the new stack chunk is freed."; it looks like they were not keeping the next chunk around. Indeed this could generate a lot of allocation traffic. -- Matthieu On Tue, Oct 22, 2013 at 12:52 AM, Patrick Walton pwal...@mozilla.com wrote: On 10/21/13 8:48 PM, Daniel Micay wrote: Segmented stacks result in extra code being added to every function, loss of memory locality, high overhead for calls into C and unpredictable performance hits due to segment thrashing. They do seem important for making the paradigm of one task per connection viable for servers, but it's hard to balance that with other needs. I'm not sure they're that important even for that use case. Is 4 kB (page size) per connection that bad? You won't compete with nginx's memory usage (2.5 MB for 10,000 connections, compared to 40 MB for the same with 4 kB stacks), but you do compete with Go (4 kB initial stack segment) and Erlang (2.4 kB on 64 bit). Besides, if we really wanted to go head-to-head with nginx we could introduce microthreads with very small stack limits (256 bytes or whatever) that just fail if you run off the end.
Such a routine would be utterly miserable to program correctly but would be necessary if you want to compete with nginx in the task model anyhow :) Realistically, though, if you are writing an nginx killer you will want to use async I/O and avoid the task model, as even the overhead of context switching via userspace register save-and-restore is going to put you at a disadvantage. Given what I've seen of the nginx code you aren't going to beat it without counting every cycle. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Unified function/method call syntax and further simplification
It seems to me that maybe there are several concepts/changes being discussed at once, and it would be possible to nitpick. Personally, when I think of unifying calls, I only think of having foo.bar(baz) be strictly equivalent to bar(foo, baz); nothing more than a syntax trick, in a way. And thus:

+ I do not see any reason not to keep a special associated method look-up, though instead it would be tied to the first parameter of the function rather than limited to method-like calls
+ I do not see any reason not to keep automatically exporting/importing all methods whose first parameter is of an exported/imported type or trait
+ I do not see any reason to move from explicit trait implementation to structural and automatic trait implementation (and I would consider it harmful)

Thus I am wondering:

- if I am missing something fundamental in the proposal by Gabor Lehel (I am not completely accustomed to the Rust terminology/idioms)
- if such a simple syntax sugar could make its way into the language

-- Matthieu

On Sun, Oct 20, 2013 at 7:22 PM, Gábor Lehel illiss...@gmail.com wrote:

On Sun, Oct 20, 2013 at 4:56 PM, Patrick Walton pwal...@mozilla.com wrote:

> I don't see the things you mention as warts. They're just consequences of, well, having methods in the OO sense. Nearly all of these warts show up in other object-oriented languages too. Maybe they're warts of object-oriented programming in general and illustrate that OO is a bad idea, but as I mentioned before Rust is designed to support OO.

OO for me was always more tied in with virtual methods than with how methods are scoped. But either way, I think this is basically my view. :) The only part of it I like is dot syntax.

-- Your ship was destroyed in a monadic eruption.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] c structs with bitfields
Actually, for bitfields the types into which the bits are packed are not left to the compiler. If you wrote int c : 4, then it will use an int. If you have:

int a : 24;
int b : 24;
int c : 16;

and int is 32 bits on your platform, then a will occupy 24 bits somewhere within one 32-bit unit, same thing for b, and c will occupy 16 bits somewhere within another 32-bit unit, for a single bitfield cannot be split across several underlying integers. Exactly where the bits lie within the type, though, is part of the ABI.

-- Matthieu

On Sun, Sep 8, 2013 at 4:31 PM, Corey Richardson co...@octayn.net wrote:

On Sun, Sep 8, 2013 at 3:00 AM, Martin DeMello martindeme...@gmail.com wrote:

> I was looking at the bindgen bug for incorrect bitfield handling https://github.com/crabtw/rust-bindgen/issues/8 but from a quick pass through the rust manual I can't figure out what the correct behaviour would be. What, for example, would the correct bindgen output for the following be:
>
> struct bit {
>     int alpha : 12;
>     int beta : 6;
>     int gamma : 2;
> };

You'll have to check what the various C compilers do with bitfields. I imagine they pack the bitfields into the smallest integer type that will contain them all. But almost everything about bitfields is entirely implementation-defined, so it's probably going to be difficult to come up with what to do correctly in any portable way. Once you actually figure out what to generate, though, methods for getting/setting the bitfields would probably be best.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Mozilla using Go
In practical terms, I would say that Go is production-ready whilst Rust still has some way to go (!). Rust 1.0 is approaching, but is not there yet; there are still syntax/semantics questions being examined and lots of work on the runtime... not to mention the lack of libraries (compared to Go), largely due to the language still not being finalized. I believe Rust could supplant Go (I see nothing in Go that Rust cannot do) and cast a much wider net, but first it has to mature.

-- Matthieu

On Sun, Sep 1, 2013 at 10:48 AM, John Mija jon...@proinbox.com wrote:

Hi! I've seen that Mozilla has used Go to build Heka (https://github.com/mozilla-services/heka). And although Go was meant to build servers while Rust was meant to build concurrent applications, Rust is better engineered than Go (much safer, more modular, optional GC). So what is the intended use case of Rust with respect to Go? I expect Rust to be the next language for desktop applications if it gains as much maturity as Go, but I'm unsure about the server side.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Slow rustc startup on Windows
Intriguing... I googled a bit to check what this was about and found:

- pseudo-reloc.c, the part of mingw handling pseudo-relocations: http://www.oschina.net/code/explore/mingw-runtime-3.18-1/pseudo-reloc.c
- the patch for pseudo-reloc v2 support: http://permalink.gmane.org/gmane.comp.gnu.mingw.announce/1953

At a glance I would say the problem is more on the mingw side; however, there might be something that can be done on the Rust side to mitigate or work around the issue.

-- Matthieu

On Thu, Aug 29, 2013 at 9:35 PM, Vadim vadi...@gmail.com wrote:

... is apparently caused by pseudo-relocations (#8859, https://github.com/mozilla/rust/issues/8859). Does anybody here know anything about that?

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] cycle time, compile/test performance
Most C/C++ projects require parallel make because they lack modules. I work on medium-large projects in C++, for which we use Boost as well as about a hundred custom middleware components. A simple source file of ~1000 lines ends up generating a preprocessed file on the order of 100,000 to 1,000,000 lines. Each and every TU. This is what makes them so amenable to parallelization.

On the other hand, for languages with modules, a ~1000-line file is a ~1000-line file; it may depend on ~50 other modules, but those need not be reparsed each time (a serialized version of the produced AST/ABT can be generated once and for all) and they can also be cached by the compiler (which, unlike typical C compilers, processes several modules in one pass). As such, there is much less to gain here.

That said, I do agree with your comment: it could possibly be better (temporarily) to run LLVM to clean up each module before combining them into a single crate for optimization. However, I feel that in the long term this will be unnecessary once codegen itself is reviewed so that, first and foremost, a leaner IR is emitted that does not require so much cleanup to start with.

-- Matthieu

On Fri, Aug 23, 2013 at 10:16 PM, Bill Myers bill_my...@outlook.com wrote:

> - We essentially always do whole-program / link-time optimization in C++ terminology. That is, we run a whole crate through LLVM at once. Which _would_ be ok (I think!) if we weren't generating quite so much code. It is an AOT/runtime trade but one we consciously designed-in to the language.
>
> time: 33.939 s LLVM passes

Maybe this should be changed to optionally do codegen and LLVM passes in parallel, producing an LLVM or native module for each Rust file, and then linking the modules together into the compiled crate. Alternatively, there seems to be some work on running LLVM FunctionPasses in parallel at http://llvm.1065342.n5.nabble.com/LLVM-Dev-Discussion-Function-based-parallel-LLVM-backend-code-generation-td59384.html but it doesn't seem production-ready. Most large C/C++ projects rely on parallel make and distcc to have reasonable build times, and it seems something that Rust needs to support (either via make/distcc or internally) to be a viable replacement for large projects.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Dynamic in Rust
One question: Do you only want to retrieve the exact type that was passed in, or would you want to be able to extract an impl that matches the type actually contained? The latter is more difficult to implement (dynamic_cast goes through hoops to check those things), but it is doable if sufficient information is encoded in the v-table.

On Fri, Aug 23, 2013 at 5:04 PM, Oren Ben-Kiki o...@ben-kiki.org wrote:

Yes, this would be similar to the `Typeable` type class in Haskell. It queries the vtable-equivalent, which contains stuff like the name of the type, and allows doing `typeof(x)`, dynamic casts, etc. This is heavily magical (that is, depends on the hidden internal representation) and properly belongs in the standard platform and not in a user-level library.

On Fri, Aug 23, 2013 at 4:40 PM, Niko Matsakis n...@alum.mit.edu wrote:

Currently, this is not directly supported, though downcasting in general is something we have contemplated as a feature. It might be possible to create some kind of horrible hack based on objects. A trait like:

trait Dynamic { }
impl<T> Dynamic for T { }

would allow any value to be cast to an object. The type descriptor can then be extracted from the vtable of the object using some rather fragile unsafe code that will doubtless break when we change the vtable format. The real question is what you can do with the type descriptor; they are not canonicalized, after all. Still, it's ... very close. This is basically how dynamic downcasting would work, in any case.

Niko

On Fri, Aug 23, 2013 at 07:49:57AM +0300, Oren Ben-Kiki wrote:

> Is it possible to implement something like Haskell's Dynamic value holder in Rust? (This would be similar to supporting C++'s dynamic_cast). Basically, something like this:
>
> pub struct Dynamic { ... }
> impl Dynamic {
>     pub fn put<T>(value: ~T) { ... }
>     pub fn get<T>() -> Option<T> { ... }
> }
>
> I guess this would require unsafe code... even so, it seems to me that Rust pointers don't carry sufficient meta-data for the above to work. A possible workaround would be something like:
>
> pub struct Dynamic { type_name: ~str, ... }
> impl Dynamic {
>     pub fn put<T>(type_name: &str, value: ~T) { Dynamic { type_name: type_name, ... } }
>     pub fn get<'a, T>(&'a self, type_name: &str) -> Option<&'a T> {
>         assert_eq!(type_name, self.type_name);
>         ...
>     }
> }
>
> And placing the burden on the caller to always use the type name "int" when putting or getting `int` values, etc. This would still require some sort of unsafe code to cast the `~T` pointer into something and back, while ensuring that the storage for the `T` (whatever its size is) is not released until the `Dynamic` itself is.
>
> (Why do I need such a monstrosity? Well, I need it to define a `Configuration` container, which holds key/value pairs where whoever sets a value knows its type, whoever gets the value should ask for the same type, and the configuration can hold values of any type, not from a predefined list of types).
>
> Is such a thing possible, and if so, how? Thanks, Oren Ben-Kiki

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Iterator blocks (yield)
Hello, I cannot comment on the difficulty of implementation; however, I can only join Armin in wishing that, if this ever takes off, the declaration be made explicit rather than having to parse the definition of the function to suddenly realize that it is not a simple function but a full-blown generator. Furthermore, in keeping with the ongoing iterator work, I would obviously push toward unifying the two systems by having the generator implement the Iterator trait (or whatever its name).

-- Matthieu

On Sun, Aug 11, 2013 at 12:01 PM, Armin Ronacher armin.ronac...@active-4.com wrote:

Hi,

On 10/08/2013 14:23, Michael Woerister wrote:
> Hi everyone, I'm writing a series of blog posts about a possible *yield statement* for Rust. I just published the article that warrants some discussion and I'd really like to hear what you all think about the things therein: http://michaelwoerister.github.io/2013/08/10/iterator-blocks-features.html

I have been toying around with the idea of yield for a bit, but I think there are quite a few big problems that need figuring out. The way yield return works in C# is that it rewrites the code into a state machine behind the scenes. It essentially generates a helper class that encapsulates all the state. In Rust that's much harder to do due to the type system. Imagine you are doing a yield from a generic hash map. The code that does the rewriting would have to place the hash map itself on the helper struct that holds the state, which means that the person writing the generator would have to put that into the return value. I currently have a really hard time thinking about how the C# trick would work :-(

Aside from this, some random notes from Python:

- generators go in both directions in Python, which caused problems until Python 3.3, where "yield from" (your "yield ..") was introduced; it expands into a monstrosity that forwards generators in both directions.
- instead of reusing fn, like def in Python, I would prefer an explicit "yield fn" that indicates that the function generates an iterator. The fact that Python reuses def is a source of lots of bugs and confusion.

Regards, Armin

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Runtimeless libstd
Hi Corey,

It's great to see that people are thinking more and more about integrating Rust into existing environments! I wonder, however, whether the other alternative has been envisioned: if Rust requires a runtime to work properly (specifically: TLS, task failure), would it be possible to give an external caller the ability to set up the runtime before calling Rust methods? I have absolutely no idea whether this is sensible or possible, but maybe rather than either extreme (a full runtime setup vs. a no-runtime mode) there is a way to meet in the middle, with a core runtime that can be set up from a C interface (TLS? ...) and then a set of cfgs for various additional pieces (such as garbage collection? ...).

-- Matthieu

On Sun, Aug 11, 2013 at 7:42 PM, Corey Richardson co...@octayn.net wrote:

I've opened a pull request for basic runtimeless support in libstd: https://github.com/mozilla/rust/pull/8454

I think it needs a wider discussion. I think it's very desirable to have a libstd that can be used without a runtime, especially once we have static linking and link-time DCE. As it stands, this patch is more of a hack. It removes swaths of libstd that currently can't work without a runtime, but adds some simple stub implementations of the free/malloc lang items that call into libc, so really it requires a C runtime.

What I think we should end up with is various levels of runtime. Some environments can provide unwinding, while others can't, for example. You can mix and match various cfgs for specific pieces of the runtime to get a libstd that can run on your platform. Other things require explicit language items (think zero.rs). Thankfully the compiler now errors when you use something that requires a language item you don't implement, so it's easy to see what you need and where. I envision a sort of platform file that implements language items for a specific platform, and you'd include this in the libstd build for that platform.

But libstd, as it stands, is insanely dependent on a full, robust runtime, especially task failure and TLS. A runtimeless libstd can't depend on either of those. You can see the hack in str.rs to not use conditions when no_rt is given. While I don't think my PR should be merged as-is, I think the discussion of the best way to achieve what it accomplishes is important.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] read_byte and sentinel values
Given that all values of u8 are meaningful, there is no space for an extra bit, so it is no surprise that it cannot be packed. For pointers, for example, it is typical to exploit the fact that the null pointer is a meaningless value and thus rely on this sentinel value to encode the absence of a value, but in general this is only possible if such a sentinel value exists to begin with.

-- Matthieu

On Wed, Jul 24, 2013 at 6:33 PM, Brendan Zabarauskas bjz...@yahoo.com.au wrote:

On 25/07/2013, at 2:15 AM, Evan Martin mart...@danga.com wrote:
> Is an Option<u8> implemented as a pair of (type, value) or is it packed into a single word?

A quick test shows:

rusti> std::sys::size_of::<Option<u8>>()
16

~Brendan

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] read_byte and sentinel values
It could be. If it is not, it may be that Option needs some love at the CodeGen level to make it so :)

-- Matthieu.

On Wed, Jul 24, 2013 at 6:46 PM, Corey Richardson co...@octayn.net wrote:

On Wed, Jul 24, 2013 at 12:42 PM, Matthieu Monrocq matthieu.monr...@gmail.com wrote:
> Given that all values of u8 are meaningful, there is no space for an extra bit, so it is no surprise that it cannot be packed. For pointers, for example, it is typical to exploit the fact that the null pointer is a meaningless value and thus rely on this sentinel value to encode the absence of a value, but in general this is only possible if such a sentinel value exists to begin with.

I would expect it to be packed into a u16.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Borrow lifetime assignment changed?
On Sat, Jul 6, 2013 at 5:26 PM, Tommy M. McGuire mcgu...@crsr.net wrote:

On 07/03/2013 09:53 PM, Ashish Myles wrote:
> hello.rs:4:8: 4:33 error: borrowed value does not live long enough

I was just about to write asking about this. I discovered it with the following code:

for sorted_keys(dict).iter().advance |key| { ... }

The result of sorted_keys is a temporary vector, which doesn't seem to live long enough for the iterator. If I give the temporary a name, everything works as expected.

-- Tommy M. McGuire mcgu...@crsr.net

Interesting. There is a specific rule in the C++ specification to address temporaries: they should live until the end of the full expression they are part of. I suppose that to support this case Rust might need the same rule, and would then determine that the whole for { } loop is a single expression. It seems feasible (and maybe partly addressed already); however, I cannot help but point out that I regularly see issues related to this popping up on the Clang list, and commits to fix it a bit more; apparently it's quite a nest of vipers and has ripple effects on the implementation of pretty much every other feature of the language.

-- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Segmented stacks (was: IsRustSlimYet (IsRustFastYet v2))
On Fri, Jul 5, 2013 at 11:07 PM, Daniel Micay danielmi...@gmail.com wrote:

On Fri, Jul 5, 2013 at 4:58 PM, Bill Myers bill_my...@outlook.com wrote:
> I believe that instead of segmented stacks, the runtime should determine a tight upper bound on stack space for a task's function, and only allocate a fixed stack of that size, falling back to a large C-sized stack if a bound cannot be determined.

Such a bound can always be computed if there is no recursion, dynamic dispatch, dynamic allocation on the stack, or foreign C functions. In practice this means everything would use a large stack. It misses the use case of scaling up tasks to many I/O requests by trading off performance for small size.

There was, at one point, a discussion of providing a #[reserve_stack(2048)] attribute for extern functions, whereby the developer would indicate to the runtime that said function would never need more than N bytes of stack. It was deemed burdensome, and it might be somewhat; however, I still believe that annotating some key C extern functions (such as those performing I/O) would allow computing this upper bound in more cases. Of course, the real experiment would be to instrument the compiler and see exactly how many tasks can indeed be so bounded... and *why* the others cannot; unfortunately it might take some time to get that working.

-- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Tutorial translations
On Thu, Jul 4, 2013 at 2:26 AM, Graydon Hoare gray...@mozilla.com wrote:

On 13-07-03 05:06 PM, Tim Chevalier wrote:
> I don't know of any such proposal already, so I encourage you to take the lead. Of course, even with the translations in the tree, there's the risk that they could become out of sync with the English version, but that's preferable to not having translations at all. (Perhaps other people who have been in projects with internationalized documentation can comment on the best approach(es) to this issue?)

I was hoping we'd set up a pootle server to translate .po files, and/or use the existing pootle instance mozilla runs: https://localize.mozilla.org/

.po files aren't perfect, but they seem to be dominant in this space. There are a lot of tools to work with them, show the drift between a translation and its source, and reconstruct software and documentation artifacts from the result. I think po4a might be applicable to the .md files that hold our docs: http://po4a.alioth.debian.org/

Someone who is familiar with these tools and workflows would be very welcome here. We've had a few people ask and just haven't got around to handling it yet.

-Graydon

I thought that .po files were mostly used to translate bits and pieces, such as strings used in GUIs, and not full-blown text files such as tutorials? As to version drift, if both versions are in-tree it seems easy enough to check what changes were made to the English version after the last commit of the Spanish version: you would just have to find the latest common ancestor of both changesets and get all changes to the English version that are not in the Spanish branch.

-- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] IsRustSlimYet (IsRustFastYet v2)
On Thu, Jul 4, 2013 at 9:48 PM, Daniel Micay danielmi...@gmail.com wrote:

On Thu, Jul 4, 2013 at 1:02 PM, Björn Steinbrink bstei...@gmail.com wrote:

Hi,

On 2013.07.05 02:02:59 +1000, Huon Wilson wrote:
> It looks like it's a lot more consistent than the original [IRFY], so it might actually be useful for identifying performance issues. (Speaking of performance issues, it takes extra::json ~1.8s to parse one of the 4 MB mem.json files; Python takes about 150ms; the `perf` output http://ix.io/6tV shows a *lot* of time spent in allocations.)

This is to a large part due to stack growth. A flamegraph that shows this can be found here: http://i.minus.com/1373041398/43t7zpBOcgy3CeDpkSht0w/inUqVLvZGEUfx.svg

Setting RUST_MIN_STACK to 800 cuts the runtime in half for me.

Björn

I find this is the case for many benchmarks. With segmented stacks we're behind Java, and without them Rust can get close to C++. I think this should be part of the API in the task module, allowing segmented stacks to be used only when they make sense. The first task spawned by the scheduler can just have a large fixed stack.

You are here assuming that one will not create many schedulers, which the current design allows. (Not necessarily a bad idea, per se; I just wanted to point out a possible new limitation.)

-- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Language support for external iterators (new for loop)
Hello,

Regarding type erasure without heap allocation, I remember a question on StackOverflow about how to implement this. In C++ this can be done using templates:

template <typename T, size_t Size, size_t Alignment> class Pimpl;

Pimpl will then declare raw storage (char [] in C++03, std::aligned_storage<Size, Alignment>::type in C++11), and this space will then be used by `T` (which was forward-declared). An important (and maybe overlooked) aspect is that Size and Alignment are upper bounds. Whilst to avoid wasting space it is better that they be as close as possible to the actual values, equality is not necessary. And thus one could perfectly imagine an AgnosticIterator<RandomIterator, 16, 8>, and it is up to the builder to create a type that fits... and maybe use dynamic allocation as a fallback if the iterator state cannot fit within the provided size.

-- Matthieu.

On Sun, Jun 30, 2013 at 4:22 PM, james ja...@mansionfamily.plus.com wrote:

On 29/06/2013 22:32, Daniel Micay wrote:

On Sat, Jun 29, 2013 at 5:29 PM, james ja...@mansionfamily.plus.com wrote:

On 29/06/2013 18:39, Niko Matsakis wrote:
> if you were going to store the result on the caller's stack frame, the caller would have to know how much space to allocate!

> If you can have a function that returns an allocated iterator, can't you instead have a function that informs how big it would be, and a function that uses a passed-in pointer from alloca?

We don't have alloca, but if we did, it would be less efficient than a statically sized allocation since it would involve an extra stack size check. A low-level, unsafe workaround like that isn't needed when you can just have a function return an iterator of a specific type.

Well, if the caller knows the type of the returned object and it is returned by value - yes. But I thought the discussion had strayed to considering a case where the type is hidden inside the iterated-over object, so the caller using the pattern does not know how to allocate space for it and receive such an object by value. I was trying to suggest that it is not necessary for the caller to know much about the iterator object to avoid a heap allocation: it has to ask for the size, and it has to then allocate and pass some raw storage on its stack. And I guess it has to ask for a function to call on the raw storage when it has finished with it. I'm not claiming that this is more efficient than a return by value, just that it may be possible to avoid a heap allocation even in the case where the call site only sees an abstract iterator interface and does not know any details.

This is very much similar to the tradeoffs in C++ between using inheritance and interfaces vs templates, and my experience has been that moving to templates has in some cases swapped a small runtime overhead for major problems with compilation speed and emitted code size - and the latter has caused runtime performance issues all of its own.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
[rust-dev] On tunable Garbage Collection
Hi,

I was reading with interest the proposal on library-defined garbage collection (by use of dedicated types), and I have a couple of questions. My main worry, and thus question, is how to handle cross-scheme cycles. There does not seem to be anything preventing me from having an Rc<Object> reference a Gc<Object> (and vice versa), and I am wondering how the garbage collectors are supposed to cooperate to realize that those may be dead cycles to collect.

As such, I am wondering if, despite this scheme being neat (and highly tunable), users might not be better served by a simpler one. It seems to me that lifetimes already provide natural GC boundaries (they at least provide an upper bound on the lifetime of an object) and thus that it may be more natural to attach a GC to a lifetime (or set of lifetimes) rather than to a particular object. I was thinking of something like:

#pragma gc ReferenceCountingGC
fn somefunction(s: String) -> Int

Note that in the latter case Rust would retain the @ sigil to denote garbage-collected pointers, but depending on where the object was allocated, the @ would not refer to the same garbage collector. I have, obviously, no idea whether this would actually be practical; and it might not be!

-- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] The future of iterators in Rust
On Fri, Jun 7, 2013 at 7:05 AM, Daniel Micay danielmi...@gmail.com wrote:

On Fri, Jun 7, 2013 at 12:58 AM, Sebastian Sylvan sebastian.syl...@gmail.com wrote:

The linked article contrasts them with the GoF-style iterators as well. The Rust Iterator trait is similar to the one-pass ranges (and possibly forward ranges), but not double-ended ranges or random-access ranges. It's the *family* of range-based iterators that makes it flexible (e.g. allowing you to write an efficient in-place reverse without knowing the underlying data structure, using a double-ended range). See fig. 3: http://www.informit.com/content/images/art_alexandrescu3_iterators/elementLinks/alexandrescu3_fig03.jpg

The extent to which you can have mutable iterators in Rust is pretty small, because of the memory safety requirement. Iterators can't open up a hole allowing multiple mutable references to the same object to be obtained, so I don't think mutable bidirectional or random-access iterators are possible. Forward iterators can't ever give you an alias because they're a single pass over the container. It's an easy guarantee to provide.

Is it? In this case it would mean that you can only have one forward iterator in scope (for a given container) at once too (i.e., the forward iterator borrows the container); otherwise you could have two distinct iterators pointing to the same underlying element.

I certainly appreciate the ongoing debate anyway; it's great to see things being exposed to light and openly discussed. I would like to contribute one point: partitioning. Sometimes you would like to partition a container, or point to one of its elements. For example, in C++ you have an overload of insert which takes an iterator, allowing you to hint to the routine where the element you ask to insert is likely to go, and thus shave off a couple of comparisons (if you are right). This requires pointing to a single element, to be contrasted with a range.

Another example would be partitioning: a partition of a slice can be represented with three points, the two end-points of the slice and the point of partition. Both of those examples can be represented by ranges (or, in C++, iterators), though they do not themselves imply any iteration. My point, thus, is that there might be a need for "fingers" inside a container that go beyond basic iteration.

-- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Scheduler and I/O work items for the summer
On Sat, Jun 1, 2013 at 3:43 PM, Thad Guidry thadgui...@gmail.com wrote:

I know that Rust doesn't currently support this, but what if futures could use a custom allocator? Then it could work like this:

1. Futures use a custom free-list allocator for performance.

I don't see why futures could not be allocated on the stack. Since Rust is move-aware and has value types, it seems to me this should be possible. -- Matthieu

2. The I/O request allocates a new future object, registers the uv event, then returns a unique pointer to the future to its caller. However, the I/O manager retains an internal reference to the future, so that it can be resolved once the I/O completes.
3. The future object also has a flag indicating that there's an outstanding I/O, so if the caller drops the reference to it, it won't be returned to the free list until the I/O completes.
4. When the I/O is complete, the future gets resolved and all attached continuations are run.

Vadim

Brian, Vadim described the idea fairly well there, with the meat of my idea being #2. I was just trying to describe the scenario that #4 be able to happen only when all the registered events happen (not just one blocking step but perhaps many blocking steps). I would not know where to start mocking something like that with Rust yet... still beginning.

-- -Thad http://www.freebase.com/view/en/thad_guidry

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Do we have shared ports?
Hi,

I am not quite sure whether you are asking for a multi-cast feature (all clients receive a copy of the message) or for a send-to-one-among feature (in which one of the available clients would pick up the message). Could you elaborate?

-- Matthieu

On Tue, May 28, 2013 at 11:45 AM, Alexander Stavonin a.stavo...@gmail.com wrote:

Hi! As I know, we have the SharedChan mechanism, which can be used for many-clients-to-one-server communication. But how can I send a response from the server to many clients? This is not a commonly used case, and it looks like there is no such mechanism, or have I missed something? Best regards, Alexander.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
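For the multi-cast reading of the question: SharedChan is long gone, but in modern Rust terms a server can fan a message out by keeping one Sender per client and cloning the message to each. A hedged sketch (the `broadcast` helper is illustrative, not a std API):

```rust
use std::sync::mpsc;

// Send a copy of `msg` to every registered client channel.
fn broadcast<T: Clone>(clients: &[mpsc::Sender<T>], msg: T) {
    for tx in clients {
        // Ignore clients that have already hung up.
        let _ = tx.send(msg.clone());
    }
}

fn main() {
    let (tx1, rx1) = mpsc::channel();
    let (tx2, rx2) = mpsc::channel();
    broadcast(&[tx1, tx2], String::from("hello"));
    assert_eq!(rx1.recv().unwrap(), "hello");
    assert_eq!(rx2.recv().unwrap(), "hello");
}
```

The send-to-one-among case is the dual arrangement: many clients each hold a clone of a single Sender, and whichever client the server's Receiver happens to serve gets the message.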
Re: [rust-dev] Calling back into Rust from C code
As the error implies, the function type that you are trying to pass as a callback is incorrect. The problem is that because the callback is called from C it ought to be compatible with C, hence the extern bit. Rather than defining an anonymous function, you need to write an extern fn function (with a name), so that the function has, at low level, an ABI compatible with C.

-- Matthieu

On Sat, May 11, 2013 at 3:04 AM, Skirmantas Kligys skirmantas.kli...@gmail.com wrote:

I am trying to write a native wrapper for https://github.com/pascalj/rust-expat (BTW, if there is a native Rust XML parser, I am interested to hear about it; I did not find one). I have trouble calling back into Rust from C code:

    fn set_element_handlers(parser: expat::XML_Parser,
                            start_handler: fn(tag: str, attrs: [@str]),
                            end_handler: fn(tag: str)) {
        let start_cb = |_user_data: *c_void, c_name: *c_char, _c_attrs: **c_char| {
            unsafe {
                let name = str::raw::from_c_str(c_name);
                start_handler(name, []);
            }
        };
        let end_cb = |_user_data: *c_void, c_name: *c_char| {
            unsafe {
                let name = str::raw::from_c_str(c_name);
                end_handler(name);
            }
        };
        expat::XML_SetElementHandler(parser, start_cb, end_cb);
    }

This says that it saw "fn..." instead of the expected "extern fn" for the second and third parameters. Any ideas how to do this? Thanks.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
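The fix Matthieu describes looks like the following in modern Rust terms (this thread predates Rust 1.0; `invoke_from_c` is a hypothetical stand-in for a C API such as expat's XML_SetElementHandler, used here only so the example is self-contained):

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_void};
use std::sync::atomic::{AtomicUsize, Ordering};

static CALLS: AtomicUsize = AtomicUsize::new(0);

// A *named* extern "C" function: this is what the compiler error above
// is asking for. A closure has no C-compatible ABI; a named extern fn does.
extern "C" fn start_cb(_user_data: *mut c_void, c_name: *const c_char) {
    let name = unsafe { CStr::from_ptr(c_name) }.to_string_lossy();
    println!("start element: {}", name);
    CALLS.fetch_add(1, Ordering::SeqCst);
}

// Stand-in for the C side: it accepts only a plain C-ABI function pointer.
fn invoke_from_c(cb: extern "C" fn(*mut c_void, *const c_char)) {
    let name = CString::new("root").unwrap();
    cb(std::ptr::null_mut(), name.as_ptr());
}

fn main() {
    invoke_from_c(start_cb);
    assert_eq!(CALLS.load(Ordering::SeqCst), 1);
}
```

Forwarding to an arbitrary user-supplied closure, as the original code attempts, additionally requires smuggling the closure through the `user_data` pointer, since the extern fn itself cannot capture an environment.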
Re: [rust-dev] Having zip() fail when the two iterators are not the same length
On Mon, May 6, 2013 at 12:28 AM, Lindsey Kuper lind...@composition.al wrote:

On Sun, May 5, 2013 at 6:17 PM, Andreas Rossberg rossb...@mpi-sws.org wrote:

On May 5, 2013, at 23:54, Lindsey Kuper lind...@composition.al wrote:

On Sun, May 5, 2013 at 4:19 PM, Noam Yorav-Raphael noamr...@gmail.com wrote:

I have a simple suggestion: the current implementation of zip() returns an iterator which stops whenever either of the two iterators it gets stops. I use zip() in Python quite a bit. I always have a few lists, where the i'th value in each corresponds to the same thing, and I use zip in Python to iterate over a few of those lists in parallel. I think this is the usual use case. In this use case, when the two lists have different lengths it means that I have a bug. It seems to me that Python's behavior, and current Rust behavior, is contrary to "Errors should never pass silently" from the Zen of Python. What do you think of changing this, so that zip() will fail in such a case? Another iterator, say zipcut, can implement the current behavior if needed.

For what it's worth, in Wikipedia's comparison of implementations of zip for various languages [0], none of them raise an error when the lists are different lengths; they all either stop with the shorter of the two lists, or fill in the missing values with a nil value. That may be a coincidence, however, since the page lists only a handful of languages.

As a counter-example, OCaml, which calls it 'combine', throws. Standard ML even provides two variants, 'zip' and 'zipEq', the latter throwing. (And as an additional data point, nowhere in my SML code have I ever had a need for the non-throwing version.)

Fair point. Perhaps Rust should also provide both. I like the SML names, too.
Lindsey

In the name of preventing obvious mistakes, I would strongly suggest implementing the reverse of the SML logic: it is best when the shortest name provides the safe behavior and the unsafe behaviors have more descriptive names indicating in what way they are unsafe. It forces people to consciously choose the unsafe alternatives. I would therefore propose:

- zip: only on collections of equal length
- zipcut: stop iteration as soon as the shortest collection is exhausted
- zipfill: fill the void (somehow: default value, Option<T>, ...)

This way we have all 3 variants, with descriptive names for the two that introduce specific behavior.

-- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
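(Historical note: Rust's std ultimately kept the stop-at-shortest behavior under the name zip.) The two contested variants are easy to sketch; these helper names follow the proposal above and are not std APIs:

```rust
// The strict variant proposed as `zip`: panic on a length mismatch,
// treating unequal inputs as a bug rather than silently truncating.
fn zip_strict<A, B>(a: Vec<A>, b: Vec<B>) -> Vec<(A, B)> {
    assert_eq!(a.len(), b.len(), "zip: inputs differ in length");
    a.into_iter().zip(b).collect()
}

// The proposed `zipcut`: today's std behavior, stop at the shorter input.
fn zip_cut<A, B>(a: Vec<A>, b: Vec<B>) -> Vec<(A, B)> {
    a.into_iter().zip(b).collect()
}

fn main() {
    assert_eq!(zip_strict(vec![1, 2], vec!['a', 'b']), vec![(1, 'a'), (2, 'b')]);
    assert_eq!(zip_cut(vec![1, 2, 3], vec!['a']), vec![(1, 'a')]);
}
```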
Re: [rust-dev] Re : Re: RFC: User-implementable format specifiers w/ compile-time checks
On Sat, May 4, 2013 at 1:15 PM, Olivier Renaud o.ren...@gmx.fr wrote:

Hi,

2013/5/3 Graydon Hoare gray...@mozilla.com

(Erm, it might also be worthwhile to consider message catalogues and locale facets at this point; the two are closely related. We do not have a library page on that topic yet, but ought to. Or include it in the lib-fmt page.)

If you are talking about gettext-like functionality, usually this and format strings are thought of as independent processing layers: format strings are translated as such and then fed to the formatting function. This brings some ramifications: as the order of parameters in the translated template can change, the format syntax has to support positional parameters. But this also allows accounting for data-derived context such as numeral cases, without complicating the printf-like functions too much. There are other difficulties with localizing formatted messages that are never systematically solved, for example accounting for gender. In all, it looks like an interesting area for library research, beyond the basic "stick this value pretty-printed into a string" problem.

Cheers, Mikhail

Gettext is indeed dependent on the format syntax allowing positional parameters. I'd like to point out that gettext also makes use of a feature of the formatting function: namely, the fact that it is not an error to call this function with more arguments than the format string expects. In C, printf("%d", 1, 2) outputs 1. In Rust, fmt!("%d", 1, 2) is a compilation error. The use case for this feature is briefly explained here: http://www.gnu.org/software/gettext/manual/gettext.html#Plural-forms

A simple example: given the string "there are %d frogs", the translator may want to translate it to "il n'y a aucune grenouille" instead of "il y a 0 grenouilles". In this case, the resulting function call would be printf("il n'y a aucune grenouille", 0), which is valid since the unused argument will be ignored.
By the way, it occurs to me that fmt! requires a string literal as its first argument. How could a system like gettext, whose role is to substitute the format string at runtime, work with fmt!?

Maybe we are taking this a bit backward? I understand that things like gettext, at the moment, only substitute the text; but that may be seen as a mistake rather than a feature. Instead, we could perfectly well imagine a gettext-like equivalent that takes both an original format string (to be translated) *and* its arguments, and then uses fmt! under the hood to produce a fully translated string to be fed to the Writer instance. Note that for a proper translation gettext requires access to certain elements anyway, for example so that correct plural forms can be picked (especially in Polish).

With this out of the way, not only are positional arguments no longer required, but we can also avoid ignoring a mismatch between the number of arguments supplied (and their types) and those expected by the original format string. There is no point in being as loose as C.

-- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
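For the record, the formatting system that eventually shipped (format! replacing fmt!) did adopt positional parameters, the prerequisite for gettext-style reordering discussed above, while keeping both properties Olivier notes: the format string must still be a literal, and an argument-count mismatch is still a compile-time error:

```rust
fn main() {
    // Positional parameters let a translated template reorder its
    // arguments without changing the call site.
    let s = format!("{1} pulled by {0}", "horse", "cart");
    assert_eq!(s, "cart pulled by horse");

    // format!("{0}", 1, 2) would be rejected at compile time:
    // unlike C's printf, extra arguments are an error, not ignored.
}
```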
Re: [rust-dev] No range integer type? Saftey beyond memory?
On Mon, Apr 29, 2013 at 6:33 PM, Jack Moffitt j...@metajack.im wrote:

As was pointed out earlier with Mozilla source code, integer overflows do not happen. Probably because, in security-conscious code, you are supposed to validate your inputs for your actual expected range, and when you do, built-in overflow checks are just unnecessary overhead.

If you're referring to Robert's comments, then I read them exactly the opposite way. He did mention that overflow to BigInts wasn't needed, but he is on the "wants checked math" side. I agree that this is a tradeoff, and that there is probably some performance loss at which it doesn't make sense. Until we have data on how expensive such a feature is, we can't make much progress in that particular debate. I just wanted to note my preference for having it default to on if it didn't cost too much, whatever "cost too much" might mean :)

jack.

It might be interesting to check how Clang integrated UBSan and its performance implications. I know there was some work using cold functions and expect hints to teach LLVM that the undefined branch (and callback) were to be very rare, so that it could optimize the code by taking them out of the hot path. You can check a blog article on the usage of UBSan here [1], and follow the links to the User's Manual from there; it might be interesting to benchmark the code produced by Clang with and without integer overflow detection (and just that; UBSan includes many other validations) to see what LLVM can do with it.

[1]: http://blog.llvm.org/2013/04/testing-libc-with-fsanitizeundefined.html

-- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
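The compromise Rust eventually settled on is visible in the stable integer API: overflow panics in debug builds, wraps in release builds, and programs that care can opt in explicitly either way:

```rust
fn main() {
    // Explicit checked math: overflow is reported as a value, not UB.
    assert_eq!(u32::MAX.checked_add(1), None);
    assert_eq!(3u32.checked_add(4), Some(7));

    // Explicit wrapping math for code that wants two's-complement wrap
    // regardless of build profile.
    assert_eq!(u32::MAX.wrapping_add(1), 0);

    // Saturating math, a third opt-in flavor.
    assert_eq!(u32::MAX.saturating_add(1), u32::MAX);
}
```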
Re: [rust-dev] Division and modulo for signed numbers
I was thinking about the mapping of / and %, and indeed maybe the simplest option is not to map them. Of course, having an infix syntax would make things easier: 5 % 3 vs 5 rem 3 vs 5.rem(3), in increasing order of typed keys (and visual noise, for the latter?). On the other hand, if there is no mapping I can imagine people asking over and over whether to use mod or rem...

-- Matthieu

On Thu, Apr 25, 2013 at 6:25 PM, Graydon Hoare gray...@mozilla.com wrote:

On 13-04-25 07:52 AM, Diggory Hardy wrote:

My opinion (that nobody will follow, but I still give it) is that integers should not have the / operator at all. This was one of the bad choices of C (or maybe of a previous language).

Hmm, maybe, though I can imagine plenty of people being surprised at that.

What really gets me though is that % is commonly called the "mod" operator and yet has nothing to do with modular arithmetic (I actually wrote a blog post about it a few months back: [1]). If it were my choice I'd either make x % y do real modular arithmetic (possibly even throwing if y is not positive) or have no % operator (just mod and rem keywords).

While it's true that people often pronounce % as "mod", the fact is most of the languages in the lineage we're looking at treat it as rem. http://en.wikipedia.org/wiki/Modulo_operation 50 languages in that list expose 'remainder' and 19 of them map it to '%'. As well, as a systems language, it _is_ salient that the instructions on the CPUs we're targeting and the code generator IR for said machines (LLVM) expose a remainder operation, not a modulo one. Of the 35 languages that expose _anything_ that does proper mod, only interpreted/script languages (Tcl, Perl, Python, Ruby, Lua, Rexx, Pike and Dart) call it %. That's not our family. I'm sorry; if we're arguing over what the % symbol means, it means remainder in our language family (the one including C, C++, C#, D, Go, F#, Java, Scala).
(More gruesome comparisons are available here: http://rigaux.org/language-study/syntax-across-languages/Mthmt.html#MthmtcDBQAM)

There are other questions to answer in this thread. We had a complex set of conversations yesterday on IRC concerning exposure of multiple named methods for the other variants -- ceiling, floor and truncating division, in particular. We may need to expose all 3, and it might be the case that calling any of them 'quot' is just misleading; it's not clear to me yet whether there's a consistent method _name_ to assign '/' to (floating-point divide seems to do the opposite of integer divide on chips that have both). But I don't think it's wise to map % to 'mod' if we're exposing both 'mod' and 'rem'. That's a separate issue and one with (I think) a simpler answer for us.

-Graydon

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
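Graydon's position is exactly what shipped: in stable Rust, % is truncating remainder (sign follows the dividend, matching CPU and LLVM semantics), while true modulo is a separate named method:

```rust
fn main() {
    // `%` is remainder in Rust's language family, as argued above.
    assert_eq!(-7 % 3, -1);
    assert_eq!(7 % -3, 1);

    // Euclidean modulo lives under an explicit name; its result is
    // always non-negative for a positive modulus.
    assert_eq!((-7i32).rem_euclid(3), 2);
    assert_eq!((7i32).rem_euclid(3), 1);
}
```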
Re: [rust-dev] LL(1) problems
On Thu, Apr 25, 2013 at 6:53 PM, Patrick Walton pwal...@mozilla.com wrote:

On 4/25/13 9:23 AM, Felix S. Klock II wrote:

On 25/04/2013 18:12, Graydon Hoare wrote:

I've been relatively insistent on LL(1) since it is a nice intersection-of-inputs, practically guaranteed to parse under any framework we retarget it to.

I'm a fan of this choice too, if only because the simplest efficient parser generators and/or parser-composition methodologies I know of take an LL(1) grammar as input. However, Paul's earlier plea on this thread ("Please don't do this [grammar factoring] to the official parser!") raised the following question in my mind: are we allowing for the possibility of choosing the semi-middle ground of: there *exists* an LL(1) grammar for Rust that is derivable from the non-LL(1)-but-official grammar for Rust? Or do we want to go all the way to ensuring that our own grammar, the one we e.g. use for defining the syntactic classes of the macro system etc., is strictly LL(1) (or perhaps LL(k) for some small but known fixed k)?

I'm not sure we can do the latter. There are too many issues relating to `unsafe`, `loop`, the `self` argument, etc. to make the LL(1) derivable from the human-readable grammar in an automated fashion, in my eyes. At least, I'm pretty sure that if we did want to go down that route, we'd probably be doing months of parser research (and I do mean *research*, as far as I know).

Patrick

On the other hand, should you content yourself with LL(2), and actually have a tool like yapp2 guarantee that it is indeed LL(2) (and does not degenerate), would it not be sufficient? (In case LL(1) really is gruesome compared to LL(2).)

-- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] About a protected visibility modifier
On Wed, Apr 17, 2013 at 9:24 AM, Eddy Cizeron eddycize...@gmail.com wrote:

2013/4/16 Brandon Mintern bran...@mintern.net

I agree with you completely, Matthieu, but that's not the kind of thing I had in mind. Consider a LinkedList implementation. Its nodes and their pointers would be private to the LinkedList, but when implementing an Iterator for that list, you would need to use that private information to avoid n^2 performance. That's a typical case I had in mind.

I am not sure. Whilst the head of the list will be private to the list, there is no reason that the *node type* be private. Expose the node, build iterators that take a reference to a node in their constructor, and have the list build the begin/end iterators (I guess). All iterations can be done safely... Of course, it does mean that you have an issue whenever you wish to erase an item by passing its position (unless you use unsafe code to make the reference mutable again).

But that is, I think, a wrong example. Iterators are unsafe: you can easily keep dangling iterators aside and have them blow up in your hands. On the other hand, if we shun external iterators and implement iteration with a foreach method accepting a predicate, then we do not need to expose the list internals. Give the predicate the ability to influence the list it is running on (returning an enum Stop/Remove/...) and you are set.

I am not saying that there is absolutely no reason it will ever be needed, but I am challenging the needs exposed so far :)

-- Matthieu

And then I thought about it a little more and realized that this is precisely something that's unsafe. Most of my protected fields and methods in Java are accompanied by comments like, "Important note: don't frobnicate a foo without also twiddling a bar." I think you're right, Daniel, that having a layer between the public API and the present implementation is probably not worth the cognitive overhead.

I understand your point Brandon.
But I could say that sometimes protected information is not so sensitive, when it is immutable for example. So why not declare it public? To avoid polluting the public interface with data unrelated to common use (yes, I agree the argument is not very strong).

It seems like it would be Rust best practice, when implementing an interface, to use the public API of the type to the extent possible, and, when necessary for efficiency or other reasons, to use unsafe and fiddle with the private API.

It could be. But if I'm allowed to play the Devil's advocate, this implies that any implementation detail must be thought of as potentially accessible (and then as necessarily understandable / documented) from outside (what you call the private API). This is not the typical approach when considering private data.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
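Matthieu's internal-iteration alternative (a foreach method taking a predicate, so the node layout never leaks) can be sketched as follows; the List/Step names are illustrative, and a Vec stands in for the private node chain:

```rust
// Control value returned by the caller's closure.
enum Step {
    Continue,
    Stop,
}

struct List<T> {
    items: Vec<T>, // private node storage; never exposed
}

impl<T> List<T> {
    // Internal iteration: the list drives the loop and hands each
    // element to the closure, so no iterator can dangle.
    fn each(&self, mut f: impl FnMut(&T) -> Step) {
        for item in &self.items {
            if let Step::Stop = f(item) {
                break;
            }
        }
    }
}

fn main() {
    let list = List { items: vec![1, 2, 3, 4] };
    let mut seen = Vec::new();
    list.each(|&x| {
        seen.push(x);
        if x == 2 { Step::Stop } else { Step::Continue }
    });
    assert_eq!(seen, [1, 2]);
}
```

Extending Step with a Remove variant, as suggested above, would let the closure influence the list without ever seeing its internals.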
Re: [rust-dev] About a protected visibility modifier
I would also add that before jumping straight to direct access, one should carefully measure. With inlining and constant propagation, I would not be surprised if the optimizer could turn an access via the public API into code as efficient as a direct access to the field in the trait method implementation. And if you really need another access method, then maybe it should be added to the type directly?

On Tue, Apr 16, 2013 at 6:23 PM, Brandon Mintern bran...@mintern.net wrote:

I was about to write how I can understand the use case: that often, for efficiency (runtime, memory, or concision) reasons, it's helpful to program to an API other than the public one; that in the process of implementing an interface on some existing type, it often needs to be reworked a bit to make it more flexible and extensible; that not having protected might result in a lot of unsafe declarations sprinkled around.

And then I thought about it a little more and realized that this is precisely something that's unsafe. Most of my protected fields and methods in Java are accompanied by comments like, "Important note: don't frobnicate a foo without also twiddling a bar." I think you're right, Daniel, that having a layer between the public API and the present implementation is probably not worth the cognitive overhead. It seems like it would be Rust best practice, when implementing an interface, to use the public API of the type to the extent possible, and, when necessary for efficiency or other reasons, to use unsafe and fiddle with the private API.

On Tue, Apr 16, 2013 at 3:08 AM, Daniel Micay danielmi...@gmail.com wrote:

On Tue, Apr 16, 2013 at 5:53 AM, Eddy Cizeron eddycize...@gmail.com wrote:

Hi everyone. I was thinking: wouldn't it be useful if Rust also had a protected visibility modifier for struct fields, with the following meaning: a protected field in a structure type T is accessible wherever a private one would be, as well as in any implementation of a trait for type T. Just an idea.
-- Eddy Cizeron

What use case do you have in mind for using a protected field instead of a public one? The use case for a private field is separating implementation details from the external API and upholding invariants. It is *possible* to safely access them in an external module by using an unsafe block, provided you take into account all of the implementation details and invariants of the type.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Library Safety
Well, a full effect system might not be necessary just for safe plugins. Since we have a way in Rust to indicate which version of a plugin we want to link to, we could apply some restrictions there. For example, specifying that only a certain list of other libraries can be used by this plugin (and typically none with I/O features) and that it cannot use unsafe code would already guarantee some sandboxing. Similarly, the GC restriction could be placed there.

-- Matthieu

On Wed, Apr 3, 2013 at 5:26 PM, Dean Thompson deansherthomp...@gmail.com wrote:

The Rust team refers to this as an "effect system". They originally had one, but that one proved unworkable and was deleted. They continue to regard it as desirable but difficult to get right, and as a potential future feature. Here's some history: http://irclog.gr/#search/irc.mozilla.org/rust/%22effect%20system%22. They would certainly welcome serious proposals or demos, although almost certainly continuing to hold it out for post-1.0. They would think in terms of first researching the most successful effect systems in other languages.

Dean

From: Grant Husbands rust-...@grant.x43.net
Date: Wednesday, April 3, 2013 5:14 AM
To: rust-dev@mozilla.org
Subject: [rust-dev] Library Safety

I've been following the Rust story with some interest and I'm excited about the opportunities Rust brings for sandbox-free, secure system software. However, there are some things that it lacks that would otherwise make it the obvious choice. One that I feel is important, and that has been touched upon by others, is having static assurances about code, especially imported libraries. If I use a jpg library, I want to be sure that it isn't going to be able to do any unsafe operations, use GC, or access the file system or the network. That way, I don't have to trust the code and can instead be assured that it simply cannot perform any dangerous actions. Currently, to do that, I have to inspect the whole library.
As a developer without the time to do that, I'd much prefer for the import to be annotated to indicate such things (or, ideally, to be annotated to indicate the allowed dangers). This could be seen, of course, as a precursor to capabilities -- reducing ambient authority is a key first step in getting a capability-secure system -- but it's also a simple way of getting assurances about code without having to inspect it. Does it seem like a reasonable thing to add? I may be able to find time to work on it, should it be acceptable.

Regards, Grant Husbands.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter
On Sat, Jan 12, 2013 at 3:21 AM, James Gao gaoz...@gmail.com wrote:

And how about these two cases:

a) fn foo<T1: Ord, Eq, Hash; T2: Ord, ::Eq>(...) {...}
b) fn foo<T1: Ord + Eq + Hash, T2: Ord + ::Eq>(...) {...}

I really like b); + looks especially fitting since we are adding up requirements.

-- Matthieu

On Sat, Jan 12, 2013 at 6:27 AM, Gareth Smith garethdanielsm...@gmail.com wrote:

On 11/01/13 18:50, Niko Matsakis wrote:

fn foo<T: Eq>(..) {...} // One bound is the same
fn foo<T: (Ord, Eq, Hash)>(...) {...} // Multiple bounds require parentheses

How about using { ... } rather than ( ... ), like imports: use xxx::{a, b, c};

fn foo<T: {Ord, Eq, Hash}>(...) { ... }

I don't know that this is better, but maybe it is worth considering?

Gareth.

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
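Option (b) is what Rust adopted: multiple bounds are joined with +, no parentheses or braces. A small working example in stable Rust (the `smallest` helper is illustrative):

```rust
use std::hash::Hash;

// Bounds are summed with `+`, exactly as proposed in (b).
fn smallest<T: Ord + Eq + Hash + Clone>(items: &[T]) -> Option<T> {
    items.iter().min().cloned()
}

fn main() {
    assert_eq!(smallest(&[3, 1, 2]), Some(1));
    assert_eq!(smallest::<i32>(&[]), None);
}
```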
Re: [rust-dev] Is necessary the manual memory management?
On Sun, Oct 28, 2012 at 8:48 PM, Niko Matsakis n...@alum.mit.edu wrote:

Regardless of whether manual memory management is desirable as an end goal, support for it is essentially required if you wish to permit tasks to exchange ownership of data without copies. For example, in Servo we have a double-buffered system where mutable memory buffers are exchanged between the various parts of the system. In order to make this safe, we have to guarantee that this buffer is unaliased at the point where it is sent; if you know it's unaliased, of course, you also know that you could safely free it.

As a broader point, it turns out there are a LOT of type-system things you can do if you know something about aliasing (or the lack thereof). Our current approach to freezing data structures, for example, relies on this. Safe array splitting for data parallelism -- if we ever go in that direction -- will rely on this. And so forth. So, supporting a unique-pointer-like construct makes a lot of sense.

Niko

I really think this is the core point: unique/borrowed/shared are less about memory management than about ownership semantics. It would be perfectly viable (albeit slower) to treat them all identically in terms of codegen (i.e. GC them all). On the other hand, ownership semantics provide both the developer and the compiler with *guarantees* upon which they can build.

-- Matthieu

John Mija jon...@proinbox.com wrote on October 28, 2012, 4:55 AM:

Does it make sense to have a language with manual memory management, given that it's possible to build low-level stuff with a specialized garbage collector? It's good for creating drivers, but that's already C. For instance, both the Native Oberon and Bluebottle operating systems were implemented in languages, Oberon and Active Oberon, which have type safety and concurrent GC support (at least in the ulm Oberon compiler).
http://www.inf.ethz.ch/personal/wirth/books/ProjectOberon.pdf

Then, Java is being used in embedded and real-time systems with deterministic garbage collection: http://www.pr.com/press-release/226895

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
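Niko's "exchange ownership without copies" point is directly visible in stable Rust: a uniquely owned buffer can be handed to another task (thread) with no copy, because the type system guarantees the sender no longer aliases it. A minimal sketch in post-1.0 terms:

```rust
use std::thread;

fn main() {
    // A uniquely owned, heap-allocated buffer.
    let buffer: Vec<u8> = vec![0; 1024];

    // `move` transfers ownership into the new thread; no bytes are
    // copied, and the spawning thread can no longer touch `buffer`
    // (using it here afterwards would be a compile error).
    let handle = thread::spawn(move || buffer.len());

    assert_eq!(handle.join().unwrap(), 1024);
}
```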
Re: [rust-dev] Object orientation without polymorphism
On Tue, Oct 23, 2012 at 2:46 PM, Julien Blanc wh...@tgcm.eu wrote:

Lucian Branescu wrote:

Something like this: http://pcwalton.github.com/blog/2012/08/08/a-gentle-introduction-to-traits-in-rust/

Very nice introduction. The only question that arises for me (coming from a C++ background and comparing this to C++ templates) is why trait implementation is made explicit. Is it a design decision or a current compiler limitation? I guess the compiler could, without too much difficulty, be made smart enough to determine from a type's actual interface whether it conforms to a trait. Code generation may be more of a problem, though...

It is actually a design decision, quite similar to how typeclasses in Haskell require explicit instantiation whereas Go's interfaces, like C++ templates, do not. Automatic detection is also called duck typing: if it quacks like a duck, then it's a duck. There are two main disadvantages:

- functionally, it means that you can use an object for something it was never really meant for: just because the signatures of some functions match does not mean that their semantics match too
- in terms of codegen, this might imply bloat (C++) or runtime overhead (Go)

On the other hand, Haskell's approach is quite practical as long as one solves the coherence issue.

-- Matthieu

___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
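The explicit opt-in Matthieu describes looks like this in stable Rust (illustrative names): a type with a matching `quack` method is still not a Duck until it says so with an impl block.

```rust
trait Duck {
    fn quack(&self) -> String;
}

struct Mallard;

// Without this explicit impl, Mallard would not satisfy `T: Duck`,
// even if it already had an inherent method named `quack` -- the
// opposite of Go's structural interfaces and C++ template duck typing.
impl Duck for Mallard {
    fn quack(&self) -> String {
        "quack".to_string()
    }
}

fn make_noise<T: Duck>(d: &T) -> String {
    d.quack()
}

fn main() {
    assert_eq!(make_noise(&Mallard), "quack");
}
```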
Re: [rust-dev] condition handling
On Sat, Oct 20, 2012 at 11:16 AM, James Boyden j...@jboy.me wrote: On Sat, Oct 20, 2012 at 10:48 AM, Graydon Hoare gray...@mozilla.com wrote: Some references to the lurking plan here: https://mail.mozilla.org/pipermail/rust-dev/2011-November/000999.html Firstly, I'd like to express my appreciation for the clear reasoning in this linked post. I found the arguments clear and compelling, matching my own experience -- especially the enumeration of the small set of realistic uses of exception handling (ignore, retry, hard-fail, log, or try one of a small number of alternatives to achieve the desired result). - Condition.raise is a normal function and does something very simple: - look in TLS to see if there's a handler - if so, call it and return the result to the raiser - if not, fail - This means condition-handling happens _at site_ of raising. If the handler returns a useful value, processing continues as if nothing went wrong. It's _just_ a rendezvous mechanism for an outer frame to dynamically provide a handler closure to an inner frame where a condition occurs. This all seems reasonable after reading that post. So: API survey. Modulo a few residual bugs on consts and function regions (currently hacked around in the implementation), I have 3 different APIs that all seem to work -- which I've given different names for discussion sake -- and I'm trying to decide between them. They mostly differ in the number of pieces and the location and shape of the boilerplate a user of the condition system has to write. My current preference is #3 but I'd like a show of hands on others' preferences. snip Opinions? Clarifying questions? I prefer option #3. I like that it states the condition and handler up-front (in contrast to most exception-handling syntaxes, that leave you guessing what might go wrong until you reach the catch statements at the end of the try/catch block). I like that the error-handling code is implicitly not an afterthought. 
(I find that when I'm writing the code on the main path, there's a strong temptation to just keep on coding, and come back to insert the error-handling code later.) I like that it has the minimum of boilerplate code (in contrast to option #1 especially). In addition to the boilerplate, I don't really like option #1 because of the separation of the protected block from the handler code. My concern with option #2 is that, despite my general fondness for RAII, the '_g' variable isn't explicitly used anywhere, so creating a named variable seems redundant. Plus, having an object that can reach out of its variable and affect all the code that follows in its block scope is too magical for my liking. jb ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev This seems heavily influenced by Lisp's Conditions & Restarts [1] (not that I have used Lisp before, but I *am* interested in error-handling strategies); however, it is relatively unclear to me how the syntax works. If I understood correctly there are 3 steps: - declare the condition - set up the handler for that condition (with the poll going on) - raise a signal for that condition At the moment you only show the first 2, and it's unclear to me exactly how the handler is *used* at the call site (looks to me like a regular function call passing an instance of T as argument and getting a U in exchange). I suppose it would be something like: let u = core::condition::signal(OutOfKitten, t) One of the points I find difficult about this (and which is as difficult with exceptions, short of going the path of the damned with exception specifications) is that it might become difficult to know exactly which handlers to set up; as such this: - Condition.raise is a normal function and does something very simple: - look in TLS to see if there's a handler - if so, call it and return the result to the raiser - if not, fail might be a little forceful.
As such, I would have a tiny request: would it make sense to just make it a point of customization, but have a way to specify a default result in case there is no handler, or even to specify a default handler? This means 2 or 3 different flavors of raising a condition, which adds to the complexity of the language though. On the other hand, using the syntax I used above, it's just: let u = core::condition::signal(OutOfKittens, t) // default behavior, i.e. fail if no handler let u = core::condition::signal(OutOfKittens, t, |t| { fail }) // default behavior made explicit let u = core::condition::signal(OutOfKittens, t, |t| { if t == 0 then fail else 3 }) // custom handler, if none set up let u = core::condition::signal(OutOfKittens, t, 4) // simple way to pass a default return value without actually having to write up a lambda, à la |t| { 4 } (just to avoid
Re: [rust-dev] condition handling
On Sat, Oct 20, 2012 at 1:37 PM, Gareth Smith garethdanielsm...@gmail.com wrote: Option 3 looks prettiest to me, but I like that in Option 1 the error-raising code comes before the error-handling code (based on experience with other languages). This might not be an issue in practice. I am not sure how I like Option 3 with a more complex trap block: OutOfKittens.trap(|t| { OrderFailure.trap(|t| notify_support()).in { order_more_kittens(); } UseAardvarksInstead }).in { do_some_stuff(); that_might_raise(); out_of_kittens(); } Compare this to some ideal syntax: protect { do_some_stuff(); that_might_raise(); out_of_kittens(); } handle OutOfKittens(t) { protect { order_more_kittens(); } handle OrderFailure(t) { notify_support(); } UseAardvarksInstead } Gareth ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev I just realized I had missed a point, which is somewhat similar to what Gareth raised: composability. Let us start with an example in a C++-like language: UserPreferences loadUserPreferences(std::string const& username) { try { return loadUserPreferencesFromJson(username); } catch (FileNotFound const&) { if (username == "toto") { throw; } return loadUserPreferencesFromXml(username); // not so long ago we used xml } } The one thing here is throw;, which rethrows the current exception and passes it up the handler chain. Is there any plan to have this available in this Condition/Signal scheme? I.e., being able, in a condition handler, to defer the decision to the previous handler that was set up for the very same condition? It could be as simple as a core::condition::signal(OutOfKittens, t) from within the current handler block. Which basically means that during its invocation the current handler is temporarily popped from the stack of handlers (for that condition) and after it executes (if it does not fail) is pushed back.
I even wonder if this could not become automatic: that is, unless a handler fails hard or returns a proper value, its predecessor is automatically called. However, I fail to see how this could be nicely integrated into the syntax, and I wonder what the ratio of pass-up-the-chain vs ignore-predecessors would be in practice. -- Matthieu ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Polymorphism default parameters in rust
On Sat, Aug 4, 2012 at 6:53 PM, Patrick Walton pwal...@mozilla.com wrote: On 08/02/2012 12:51 PM, Emmanuel Surleau wrote: Hi, I'm new to rust, and I'm struggling to find an elegant way to work with default parameters. Generally we've been experimenting with method chaining to achieve things like default and named parameters in Rust. See the task builder API for an example: https://github.com/mozilla/rust/blob/incoming/src/libcore/task.rs#L197 So I can see your use case being something like: let flag = Flag("verbose", "Maximum verbosity").short_name("v"); To implement this you'd write: struct Flag { name: str; desc: str; short_name: option<str>; max_count: uint; banner: option<str>; } // Constructor fn Flag(name: str, desc: str) -> Flag { Flag { name: name, desc: desc, short_name: none, max_count: 1, banner: none } } impl Flag { fn short_name(self, short_name: str) -> Flag { Flag { short_name: some(short_name) with self } } fn max_count(self, max_count: uint) -> Flag { Flag { max_count: max_count with self } } fn banner(self, banner: str) -> Flag { Flag { banner: some(banner) with self } } } (Note that this depends on the functional record update `with` syntax working for structs, which it doesn't yet.) If this style catches on it'd probably be nice to have a macro to generate the mutators (fn short_name, fn max_count, fn banner). Then instead of the impl { ... } above you'd write something like: make_setter!(Flag.short_name: option<str>, WrapOption); make_setter!(Flag.max_count: uint); make_setter!(Flag.banner: option<str>, WrapOption); (Assuming that WrapOption is a special flag to the macro to indicate that the value should automatically be wrapped in some). How does this sound? Patrick Verbose. -- Matthieu ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] replacing bind with lightweight closure syntax
On Sat, Jun 2, 2012 at 9:12 PM, Niko Matsakis n...@alum.mit.edu wrote: Hello Rusters, I want to remove bind. It's a pain to maintain and a significant complication in the language semantics (both problems caused by having two kinds of closures with subtle distinctions). Nobody really objects to this, but the fact remains that bind serves a role that is otherwise unfilled: it provides a (relatively) lightweight closure syntax. For example, I can write: foo.map(bind some_func(_, 3)) which is significantly less noisy than: foo.map({|x| some_func(x, 3)}) I previously tried to address this through the introduction of `_` expressions as a shorthand for closures. This proposal was eventually rejected because the scope of the closure was unclear. I have an alternative I've been thinking about lately. The basic idea is to introduce a new closure expression which looks like: `|| expr`. The expression `expr` may make use of `_` to indicate anonymous arguments. The scope of the closure is basically greedy. So `|| _ + _ * _` is unproblematic. The double bars `||` should signal that this is a closure. Therefore, you could write the above example: foo.map(|| some_func(_, 3)) This also makes for a nice, lightweight thunk syntax. So, a method like: map.get_or_insert(key, || compute_initial_value(...)) which would (presumably) return the value for `key` if it is present in the map, but otherwise execute the thunk and insert the returned value. The same convention naturally extends to named parameters, for those cases where anonymous parameters do not work. For example, if you wish to reference the parameter more than once, as in this example, which pairs each item in a vector with itself ([A] => [(A,A)]): foo.map(|x| (x, x)) In fact, we *could* do away with underscores altogether, although I find them more readable (they highlight those portions of the function call that come from arguments vs the environment).
The immediate motivation for this is that I am having trouble with some code due to complications introduced by bind. I'd rather not fix those bugs. I'd rather just delete bind. Niko ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev Hello, I must admit I find the latter example quite pleasing: foo.map(|x| (x, x)) and (unlike you, it seems) I find that it could completely replace _. The problem with _ is that it seems nice enough when there is only one, but it is not as immediate for the reader to determine how many parameters the resulting function has. We could number them (_0, _1, _2) but that makes maintenance painful (if you remove _1, you have to renumber all those which followed). I find that explicitly naming the arguments in between the pipes really helps make it clear how many arguments the created function has. Of course, it still does nothing about the problem of shadowing an outer element meant to be captured. Not sure if there is a way to deal with that at the same time... -- Matthieu ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Back to errors, failures and exceptions
On Fri, May 25, 2012 at 7:16 PM, David Rajchenbach-Teller dtel...@mozilla.com wrote: On Fri May 25 18:01:25 2012, Patrick Walton wrote: On 05/25/2012 08:43 AM, Kevin Cantu wrote: This conversation reminds me of Alexandrescu's talk about D's scope keyword: http://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2012/Three-Unlikely-Successful-Features-of-D It looks like a graceful syntax for handling all sorts of nested error and failure cleanup... I like the scope keyword, FWIW. It'd be even better if you didn't have to provide a variable name if all you want to do is execute some code at the end of the block. This would provide a facility like Go's defer keyword, but more general since it also admits C++ RAII patterns. Patrick What's the difference between |scope| and Rust's resources, exactly? Cheers, David -- David Rajchenbach-Teller, PhD Performance Team, Mozilla Regarding adding logs to the errors: - Boost.Exception has something similar: you can add class instances to the exception using the error info mechanism [1] - It also reminds me of what happens in case of failures using the note expressions; I believe the same notes could be reused in case of exceptions/errors to provide additional logs. Of course, there is a difference between the two schemes. Boost's is somewhat more powerful because it does not consist of adding simple strings but full-blown objects (which could be conditioned to be printable), and thus allows inspection of structured data at the error-handling site. It may be thought of as overkill too... Regarding D's scope keyword [2], there are several statements based on it: - scope(exit) <statement>, where <statement> is executed on exit, no matter what - scope(failure) <statement>, where <statement> is executed on exit if the previous statement failed - scope(success) <statement>, where <statement> is executed on exit if the previous statement succeeded On the other hand, it kind of looks like a hack; maybe it is just an issue of getting used to it, though.
[1]: http://www.boost.org/doc/libs/1_49_0/libs/exception/doc/error_info.html [2]: http://dlang.org/exception-safe.html ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Back to errors, failures and exceptions
On Wed, May 23, 2012 at 2:47 PM, David Rajchenbach-Teller dtel...@mozilla.com wrote: Actually, one of the conclusions of our previous discussion is that Java/C++/ML-style exceptions are probably not what we want for Rust. I seem to remember that we also concluded that using failures as exceptions was probably not the right course. Hence this new thread :) Let me put together what I believe are a few desirable qualities of an issue management system. For the moment, let's not wonder whether that system is a language feature, a library or a coding guideline. As a whole, this looks very good to me, I just have one quick question: * The system _must_ not prevent developers from calling C code from Rust. * The system _must_ not prevent developers from passing a pointer to a Rust function to C code that will call back to it. * The system _must_ not prevent, some day, developers from calling Rust from JavaScript. * The system _must_ not prevent, some day, developers from calling JavaScript from Rust. * Issues _must_ not be restricted to integers (or to one single type). Could you explain what you mean by this? I suppose this is a direct jab at the horror that is errno, and more in the direction of being able to throw anything (possibly on the condition that it implements a given interface)? * The default behavior in case of an untreated issue _should_ be to gracefully kill the task or the application. * Whenever an untreated issue kills a task/application, it _should_ produce a report usable by the developer for fixing the issue. * It _should_ be possible to deactivate that killing behavior. There _may_ be limitations. * It _should_ be possible to deactivate that killing behavior conditionally (i.e. only for some errors). * The system _should_ eventually have a low runtime cost – in particular, the case in which no killing happens should be very fast. Do we agree on this base?
Cheers, David On 5/22/12 4:56 AM, Bennie Kloosteman wrote: Are exceptions a good model for systems programming? - legacy C programs can't call you without a wrapper which translates all possible exceptions - unwinding a stack is probably not a good idea in a kernel or when you transition into protected/user mode (I know of very few kernels that use exceptions). - It's not just legacy: WinRT uses C++ classes but returns error codes for low-level APIs. However, it's very nice for user programs. These days these different worlds work quite well: C libs, which are mainly used for systems programming, don't use them, and C++ apps are more user programs and they do; C++ calls C, C rarely calls C++. Obviously if you write a kernel or shared library you cannot use exceptions if C programs call your code, and there is a lot of C out there. While not really an issue for the language (just don't use exceptions), it means a standard lib that throws an exception would be a pain for such work, and you would need a different standard lib, which is an issue. BTW, could Rust use tasks as a substitute for exception scopes? Tasks have error bubbling, hiding, stack unwinding, throw (fail), and should have integrated logging. You could put sugar syntax around it but it would still work when being called by C. Also, with tasks you can cancel or do timeouts, giving asynchronous exceptions, which are really needed (e.g. in most systems cancelling a long-running task is very annoying, with a very long pause) and which most trivial exception implementations don't do. Not sure if this is the right way, but there seems to be a lot of overlap, and it would work with C and systems programming.
Ben ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev -- David Rajchenbach-Teller, PhD Performance Team, Mozilla
Re: [rust-dev] Interesting paper on RC vs GC
On Tue, May 1, 2012 at 5:51 PM, Sebastian Sylvan sebastian.syl...@gmail.com wrote: On Tue, May 1, 2012 at 4:07 AM, Matthieu Monrocq matthieu.monr...@gmail.com wrote: As a consequence, I am unsure of the impact this article should have on Rust's GC design. The implementation strategies presented are very clear and the advantages/drawbacks clearly outlined, which is great (big thank you to the OP); however the benchmarks and conclusions might be a tad Java-centric and not really apply to Rust's more advanced type system. My conjecture is that Java is *especially* unsuitable for RC for the following reasons: * lots of references, thus lots of reference traffic, thus lots of ref count inc/dec. * lots of garbage, thus more expensive for an algorithm for which the cost is proportional to the amount of garbage (RC) rather than to the amount of heap (GC). So I'd expect vanilla RC to do better in comparison to GC (though perhaps not beat it) in Rust than in Java. Applying the optimizations mentioned in the article (most of which rely on using deferred ref counting, which does mean you give up on predictable timing for deallocations) may make RC significantly better in Rust. Seb I agree that the techniques outlined, especially with the details on their advantages/drawbacks, make for a very interesting read. As for the predictable timing, it seems hard to have something predictable once you take cycles of references into account: I do not know of any inexpensive algorithm to detect that, by removing a link, you are suddenly creating a self-sustaining group of objects that should be collected. Therefore I would venture that such groups would in any case be collected in a deferred fashion (using some tracing algorithm).
-- Matthieu --- Finally I think it might be worth considering having two distinct GC strategies: - one for immutable objects (that only references other immutable objects) - one for the rest (mutable objects with potential cycles) I see no reason to try and employ the same strategy for such widely different profiles other than the potential saving in term of coding effort. But then trying to cram every single usecase in a generic algorithm while keeping it efficient seems quite difficult too, whereas having several specialized mechanisms might make for much clearer code. One idea I have toyed with for my own was to have simple stubs: design a clear API for GC, with two (or more) sets of functions for example here, and call those functions instead of inlining their effect (in the IR). By providing the functions definitions externally (but inlining them in each IR module) this makes it easy to switch back and forth between various implementations whilst still retaining the efficiency of the LLVM backend to inline/optimize the calls. This means one can actually *test* the strategies, and perhaps even let the user *choose* which one better suits her needs. Of course coherency at the executable level might be necessary. -- Matthieu -- Sebastian Sylvan ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] In favor of types of unknown size
Hi Sebastian, I have a few comments. On Sun, Apr 29, 2012 at 12:21 AM, Sebastian Sylvan sebastian.syl...@gmail.com wrote: On Fri, Apr 27, 2012 at 3:15 PM, Niko Matsakis n...@alum.mit.edu wrote: The types `[]T` and `str` would represent vectors and strings, respectively. These types have the C representation `rust_vec<T>` and `rust_vec<char>`. They are of *dynamic size*, meaning that their size depends on their length. The literal forms for vectors and strings are `[a, b, c]` and `"foo"`, just as normal. Back when I was entertaining the idea of writing my own rust-like language (before I was aware of rust's existence), I had the idea that all records/objects could have dynamic size if any of their members had dynamic size (and the root cause of dynamic size would be fixed-size arrays - fixed at the time of construction, not a static constant size). This is only slightly related, but it's close enough that I can't resist presenting the gist of it (it's not completely worked out), in case anyone else wants to figure it out and see if it makes sense :-) Basically the idea spawned from the attempt to avoid pointers as much as possible. Keep things packed, with chunky objects, reduce the complexity for GC/RC, reduce memory fragmentation, etc. - aside from actual honest-to-goodness graphs (which are fairly rare, and most are small, and unavoidable anyway). The conjecture is that the main source of pointers are arrays. Okay, so basically the idea is that arrays are length-prefixed blocks of elements. They're statically sized (can't be expanded), but you can pass in a dynamic, non-constant value when you construct them. Unlike C/C++, though, these arrays can still live *inside* an object. There's some fiddliness here, e.g. do you put all arrays (except ones with true const sizes?) at the end of the object so other members have a constant offset?
If you have more than a small number of arrays in an object it probably makes sense to have a few pointers indicating the start of each, instead of having to add up the sizes of preceding arrays each time an access is made to one of the later arrays. Small reactions on pointers: I think it's a good idea to pack the variable-length structures at the end of the current object. However I would use cumulative offsets rather than pointers, because of size (on 64-bit architectures, which are becoming the de-facto standard for PCs and servers). The idea would be, in C style: struct Object { int scalar1; int scalar2; unsigned __offset0; unsigned __offset1; unsigned __offset2; SomeObject __obj0; Table __obj1[X]; }; Where __offset0 indicates the offset from the start of Object to the start of __obj0, __offset1 the offset from the start of Object to the start of __obj1, and __offset2 the offset from the start of Object to the start of __obj2. This means you have direct access to any attribute with a simple addition to the pointer, and you can know the size with a simple subtraction (the size of __obj0 is __offset1 - __offset0). So, during construction of an object, you'd have to proceed in two phases. First is the constructor logic where you compute values, and second is the allocation of the object and moving the values into it. You need to hold off on allocation because you don't know the size of any member objects until you've constructed them. Moving an array is now expensive, since it requires a copy, not just a pointer move. So ideally the compiler would try to move the allocation to happen as early as possible so most of the values can be written directly to their final location instead of having to be constructed on the stack (or heap) and then moved. There are of course cases where this couldn't be done. E.g.
if the size of an array X, depends on some computation done on array Y in the same object - you have to create Y on the stack, or heap, to run the computation before you can know the total size of the object, and only then can you allocate the final object and copy the arrays into it. Yes, this is getting quite difficult at this stage. It's good once the size is settled but the construction can be expensive. I'm not 100% sold on the idea, since it does make things a bit more complex, but it is pretty appealing to me that you can allocate dynamic-but-fixed sized arrays on the stack, inside other objects etc.. For a language that emphasizes immutable data structures I'd imagine the opportunity to use these fixed arrays in-place would be extremely frequent. Seb -- Sebastian Sylvan There is a subtle issue that I had not remarked earlier. This mechanism works great for fixed-size arrays, but is not amenable to extensible arrays: vectors and strings *grow*. So it would work if the field/attribute is runtime-fixed-size, either because the type imposes it or because it's declared immutable, however it will not work in the general case. This is important because it means that in
Re: [rust-dev] In favor of types of unknown size
On Sat, Apr 28, 2012 at 8:12 AM, Marijn Haverbeke mari...@gmail.com wrote: I must say I prefer Graydon's syntax. `[]T` sets off all kinds of alarms in my head. I have no strong opinion on dynamically-sized types. Not having them is definitely a win in terms of compiler complexity, but yes, some of the things that they make possible are nice to have. ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev Hello Niko, First, I really appreciate you thinking hard about it, and if you don't want to bother the list I would certainly not mind talking it out with you in private; I feel it's very important for these things to be thought through extensively and I really like that decisions in Rust are always considered carefully and objectively. That being said, I have two remarks. I would like to ask a question on the vector syntax: why the focus on []? I understand it in the literal form; however, a string type is denoted as `str`, so why not denote a vector of Ts as `vec<T>`? Yes, it's slightly more verbose, but this is how all the other generic types will be expressed anyway. Similarly, since a substring is expressed as `substr`, one could simply express a slice as `slice<T>` or `svec<T>` or even `array_ref<T>`. I don't think being overly clever with the type syntax will really help the users. Imagine grepping for all uses of the slice type in a crate? It's so much simpler with an alphabetic name. (Also, `[:]/r T` feels *really* weird; look at the mess C is with its pointer-to-function syntax that lets you specify the name in the *middle* of the type...) As for types of unknown size, I would like to point out that preventing users from having plain `str` attributes in their records is kind of weird. The pointer syntax is not only more verbose, it also means that suddenly getting a local *copy* of the string gets more difficult.
Sure it's equivalent (semantically) to a unique pointer `~str`, but it does not make copying easier, while it's one of the primary operations in impure languages (because the original may easily get modified at a later point in time). I think that `rust_vec<T>` having an unknown size rather than being (in effect) a pointer to a heap-allocated structure is nice from an implementation point of view, but it should not get in the way of using it. I would therefore venture that either it has an unknown size and the compiler just extends this unknown-size property to all types so they can have `vec<T>` and `str` attributes naturally, or it's better for it *not* to have an unknown size. I would also like to point out that if it's an implementation detail, the actual representation might vary from known size to unknown size without impact for the user, so starting without it for the moment because it's easier, and refining it later, is an option. Another option is to have a fixed size with an alternative representation using something similar to SSO (Short String Optimization); that is, small vectors/strings allocate their storage in place while larger ones push their storage to the heap to avoid trashing the stack. Hope this does not look harsh; I sometimes have difficulties expressing my opinions without being seen as patronizing: I can assure you I probably know less than you do :) -- Matthieu ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Syntax of vectors, slices, etc
On Tue, Apr 24, 2012 at 11:24 PM, Graydon Hoare gray...@mozilla.com wrote: On 12-04-24 11:30 AM, Matthieu Monrocq wrote: However this is at the condition of considering strings as list of codepoints, and not list of bytes. List of bytes are useful in encoding and decoding operations, but to manipulate Arabic or Korean, they fall short: having users manipulate the strings byte-wise instead of codepoint-wise is a recipe to disaster outside of English and Latin-1 representable languages. Could you elaborate on this a little bit? I'm curious to hear impressions -- even if vague or hard to specify -- about the experience of working with known-language, non-Latin-1 text. I'm an English-speaker and much technical material is English-derived, so usually when I'm working with text-processing code, it falls into one of two categories: - ASCII-subset by construction (eg. structured-language keywords) - Totally unknown language semantics, has to work with everything, can't assume I know anything about the language (eg. human input) I am emphatically not saying these are the _only_ two possible environments, just the two that I have experience in. So in my experience byte-operations in ASCII range works for the former and using a proper language-and-locale-aware unicode library like ICU works for the latter. That's where my usability biases emerge in the design of str. In particular I want to know if you would feel that there are common operations you expect to be able to do codepoint-at-a-time on the datatype str, that you would not be comfortable doing on the datatype [char], if you converted str to [char] as a one-time pass in advance of performing the operation. That's what I assume people will do if they need random (rather than sequential) codepoint access. Sequential access we already have iterators for. But I understand this might not be right; it's a design space with a lot of tensions. 
There are as many different string representations in the world as there are opinionated programmers :) I understand that this may seem contradictory to Rust's original direction of utf-8 encoded strings, but having worked with utf-8 strings using C++ `std::string` I can assure you that apart from blindly passing them around, one cannot do much. All modifying operations require the use of Unicode-aware libraries... even `substr`. Naturally so. We're intending to ship a relatively full binding to libicu for just this reason. Unicode Text Is Hard To Do By Hand. (Though, hmm, substr is actually fine on UTF-8, no? You just have to land on character boundaries. Which are easy to find; O(1) from any given start point -- at most 5 bytes away -- and the guaranteed output of any other algorithm that iterates over character boundaries...) Thanks for this answer: I had not considered the ability to do a str -> [char] -> str round trip with actual Unicode work on the [char] type. I also did not know about the intent of integrating a subset of libicu. Indeed, with a full library handling [char] correctly, and two simple facilities to convert back and forth, it would be trivial for the user to use real Unicode operations (to_lower / to_upper / capitalize are not fun :x) without too much hassle. Regarding the use cases I have encountered, they were in a general-public web app: - wrap-around at a specified length (in number of graphemes, which in the appropriate canonical form was the number of codepoints in all the languages we cared for) - truncation at a specified length (also in number of graphemes) - sorting lists (the first time we presented a list of countries in Greek, it was nigh unusable...) Pretty basic operations; we used ICU for sorting (collation) and conversion to 32-bit Unicode codepoint values for length operations.
It was all the more funny with Arabic, of course, because of the control characters for the direction of display which do not have a graphical representation, but since we counted by hand, we just ignored them. Second, I do not think that statically known sizes are so important in the type system. I am a huge fan, and abuser, of the C++ template system, but I will be the first to admit it is really complex and generally poorly understood even among usually savvy C++ users. As I understand, fixed-length vectors were imagined for C-compatibility. Statically allocated buffers have lifetime that exceed that of all other objects in the system, therefore they can perfectly be accessed through slices. Other uses implying C-compatibility should be based on dynamically allocated memory, and the size will be unknown at compilation. They're useful for a lot of reasons. You can alloca them, which is good for small buffers. And a decent number of heap structures also have need of small fixed-fanout arrays, caches, lookup tables and the like. But beyond that they simply _occur_ in the C type system. With annoying frequency! We've
Re: [rust-dev] Syntax of vectors, slices, etc
Hello,

As this is going to be my first e-mail on this list, please do not hesitate to correct me if I speak out of turn. Also, do note that I am not a native English speaker; I still promise to do my best and will gladly welcome any correction.

First, I agree that operations on vectors and strings are mostly similar. However, this holds only on the condition of considering strings as lists of codepoints, not lists of bytes. Lists of bytes are useful in encoding and decoding operations, but for manipulating Arabic or Korean they fall short: having users manipulate strings byte-wise instead of codepoint-wise is a recipe for disaster outside of English and the Latin-1-representable languages.

I understand that this may seem contradictory to Rust's original direction of UTF-8 encoded strings, but having worked with UTF-8 strings using C++ `std::string` I can assure you that apart from blindly passing them around, one cannot do much. All modifying operations require the use of Unicode-aware libraries... even `substr`.

Second, I do not think that statically known sizes are so important in the type system. I am a huge fan, and abuser, of the C++ template system, but I will be the first to admit it is really complex and generally poorly understood, even among usually savvy C++ users. As I understand it, fixed-length vectors were imagined for C compatibility. Statically allocated buffers have lifetimes that exceed those of all other objects in the system, so they can perfectly well be accessed through slices. Other uses involving C compatibility should be based on dynamically allocated memory, whose size will be unknown at compilation.

In the linked blog article, an issue is raised regarding the variable size of `rust_vec<T>` because it plays havoc with stack allocation. However, is real stack allocation necessary here? It seems to me that what is desirable is the semantic aspect of a scope-bound variable.
Whether the actual representation is instantiated on the stack or on the task heap is an implementation detail, and the compiler could perfectly well be enhanced so that all variably-sized types are actually instantiated on the heap, but automatically collected at the end of the function scope. A parallel stack dedicated to such allocations could even be used, as the allocation/deallocation pattern is stack-like.

I hope my suggestions are reasonable. Do feel free to ignore them if they are not!

-- Matthieu

On Tue, Apr 24, 2012 at 2:06 AM, Niko Matsakis n...@alum.mit.edu wrote:

Some more thoughts on the matter: http://smallcultfollowing.com/babysteps/blog/2012/04/23/vectors-strings-and-slices/

Niko

On 4/23/12 4:40 PM, Niko Matsakis wrote:

One thing that is unclear to me is the utility of the str/N type. I can't think of a case where a *user* might want this type---it seems to me to represent a string of exactly N bytes (not a buffer of at most N bytes). Graydon, did you have use cases in mind?

Niko

On 4/23/12 4:12 PM, Graydon Hoare wrote:

On 12-04-23 03:21 PM, Rick Richardson wrote:

Should a str be subject to the same syntax? Because it will have different semantics.

I think the semantics are almost identical to vectors. Save the null issue.

A UTF-8 string has differently sized characters, so you can't treat it as a vector; there are obvious and currently discussed interoperability issues regarding the null terminator.

You certainly can treat it as a (constrained) vector. It's just a byte vector, not a character vector. A character vector is [char]. Indexing into a str gives you a byte. You can iterate through it in terms of bytes or characters (or words, lines, paragraphs, etc.) or convert to characters or UTF-16 code units or any other encoding of Unicode. It should definitely get a slice syntax, since that will likely be the most common operation on a string.
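Graydon's "byte vector, not character vector" model is exactly how stable Rust ended up: the sketch below (modern syntax, not the 2012 syntax under discussion) shows byte-wise access, character-wise iteration, byte-offset slice syntax, and the `[char]` form.

```rust
fn main() {
    let s = "naïve";

    // Indexing-as-bytes: the underlying representation is a byte vector.
    assert_eq!(s.as_bytes()[0], b'n');

    // Iterate in terms of bytes or in terms of characters:
    // 'ï' is two bytes in UTF-8, so the counts differ.
    assert_eq!(s.bytes().count(), 6);
    assert_eq!(s.chars().count(), 5);

    // Slice syntax operates on byte offsets and must land on
    // character boundaries.
    assert_eq!(&s[0..1], "n");

    // The "[char]" character-vector form is an explicit conversion.
    let chars: Vec<char> = s.chars().collect();
    assert_eq!(chars.len(), 5);
}
```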
I would also like to support a notion of static sizing, but with UTF-8 even that's not always possible.

Yes it is. The static size is a byte count. The compiler knows that size statically and can complain if you get it wrong (or fill it in if you leave it as a wildcard, as I expect most will do).

I reckon a string should be an object, and potentially be convertible to/from a vector. But trying to treat it like a vector will just lead to surprising semantics for some. But that's just my opinion.

The set of use cases to address simultaneously is large and covers much of the same ground as vectors:
- Sometimes people want to be able to send strings between tasks.
- Sometimes people want a shared, refcounted string.
- Sometimes people want strings of arbitrary length.
- Sometimes people want an interior string that's part of another structure (with necessarily fixed size), copied by value.
- String literals exist and ought to turn into something useful, something in static memory
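The "static size is a byte count" point, and the string-literal-in-static-memory use case, can be illustrated in modern Rust (the `str/N` type itself did not survive, so this is an analogy, not the syntax under discussion):

```rust
// A string literal lives in static memory with a 'static lifetime.
static GREETING: &str = "héllo";

fn main() {
    // The static size is a byte count, not a character count:
    // 'é' encodes as two bytes in UTF-8.
    assert_eq!(GREETING.len(), 6);
    assert_eq!(GREETING.chars().count(), 5);

    // A fixed-size byte array pins the byte count in the type itself,
    // and the compiler complains if you get the count wrong.
    let bytes: [u8; 6] = *b"h\xc3\xa9llo";
    assert_eq!(&bytes[..], GREETING.as_bytes());
}
```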