Re: Rust extensions: the next step

2018-10-19 Thread Yuya Nishihara
On Thu, 18 Oct 2018 19:15:06 +0200, Georges Racinet wrote:
> On 10/18/2018 04:09 PM, Yuya Nishihara wrote:
> > I expect "rustext" (or its upper layer) to be a shim over Rust-based modules
> > and cexts. So if you do policy.importmod('parsers'), it will return
> > cext.parsers, whereas policy.importmod('ancestor') will return rustext.ancestor,
> > though I have no idea if there will be cext/pure.ancestor.
> Yes, it's quite possible to add a new module policy this way. After all,
> from mercurial.policy, it behaves in the same way as the cext package
> does and the fact that we have a single shared library instead of
> several ones is an implementation detail, hidden by Python's import
> machinery.
> 
> But this opens another, longer term, question: currently what I have in
> mercurial.rustext.ancestor has only a fragment of what
> mercurial.ancestor provides. Therefore to have mercurial.policy handle
> it, we'll need either to take such partial cases into account, or decide
> to translate the whole Python module into Rust. For the time being, I'm
> simply doing an import and catching the error to fall back to the Python
> version.

That could be handled by policy._modredirects, e.g.

  _modredirects = {
      ('rustext', 'parsers'): ('cext', 'parsers'),
      ('cext', 'ancestor'): ('pure', 'ancestor'),
      # and move pure-python implementation to pure/ancestor.py
  }

But yeah, it will depend on the number of redirects whether doing that will
make things clearer or not. We can decide that later.
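
As a minimal sketch of how such a redirect table could be consulted (not
Mercurial's actual policy code): the helper name `resolve`, following chained
redirects, and the loop check are all assumptions made for illustration.

```python
# Hypothetical sketch; the table mirrors the example above, the helper is invented.
_modredirects = {
    ('rustext', 'parsers'): ('cext', 'parsers'),
    ('cext', 'ancestor'): ('pure', 'ancestor'),
}


def resolve(package, modname, redirects=_modredirects):
    """Follow redirects until reaching a (package, module) pair with no entry."""
    seen = set()
    key = (package, modname)
    while key in redirects:
        if key in seen:
            # Guard against a table that redirects in a circle.
            raise ValueError('redirect loop at %r' % (key,))
        seen.add(key)
        key = redirects[key]
    return key
```

With this table, asking for ('rustext', 'parsers') lands on the cext version,
while ('rustext', 'ancestor') has no entry and is returned unchanged.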
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: Rust extensions: the next step

2018-10-18 Thread Georges Racinet
On 10/18/2018 12:22 PM, Gregory Szorc wrote:
> One open item is full PyPy/cffi support. Ideally we’d only write the native 
> code interface once. But I think that means cffi everywhere and last I 
> looked, CPython into cffi was a bit slower compared to native extensions. I’m 
> willing to ignore cffi support for now (PyPy can use pure Python and rely on 
> JIT for faster execution). Maybe something like milksnake can help us here? 
> But I’m content with using the cpython crate to maintain a Rust-based 
> extension: that’s little different from what we do today and we shouldn’t let 
> perfect be the enemy of good.

One nice thing with the cpython crate is that it's just using the
CPython ABI. Therefore, there's nothing we can't do – only things that
are less practical. It's not very intuitive, but it should be ok with a
bit of practice.

About cffi, if milksnake can automate it, that's an easy win to be added
later (for now I still need to call in the C modules from the Rust code).

In both cases, we need to tighten it with comprehensive integration tests.

Cheers,

-- 
Georges Racinet
Anybox SAS, http://anybox.fr
Téléphone: +33 6 51 32 07 27
GPG: B59E 22AB B842 CAED 77F7 7A7F C34F A519 33AB 0A35, sur serveurs publics




Re: Rust extensions: the next step

2018-10-18 Thread Georges Racinet
On 10/18/2018 04:09 PM, Yuya Nishihara wrote:
> On Thu, 18 Oct 2018 08:58:04 -0400, Josef 'Jeff' Sipek wrote:
>> On Thu, Oct 18, 2018 at 12:22:16 +0200, Gregory Szorc wrote:
>> ...
>>> Something else we may want to consider is a single Python module exposing
>>> the Rust code instead of N. Rust’s more aggressive cross function
>>> compilation optimization could result in better performance if everything
>>> is linked/exposed in a single shared library/module/extension. Maybe this
>>> is what you are proposing? It is unclear if Rust code is linked into the
>>> Python extension or loaded from a shared library.
>> (Warning: I suck at python, am not an expert on rust, but have more
>> knowledge about ELF linking/loading/etc. than is healthy.)
>>
>> Isn't there also a distinction between code layout (separate crates) and the
>> actual binary that cargo/rustc builds?  IOW, the code could (and probably
>> should) be nicely separated but rustc can combine all the crates' code into
>> one big binary for loading into python.  Since it would see all the code, it
>> can do its fancy optimizations without impacting code readability.
> IIUC, it is. Perhaps, the rustext is a single binary exporting multiple
> submodules?
Yes, totally, it's exactly as Josef writes. To demonstrate, here's what I
have:

$ ls mercurial/*.so
mercurial/rustext.so  mercurial/zstd.so
$ python
Python 2.7.13 (default, Nov 24 2017, 17:33:09)
[GCC 6.3.0 20170516] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from mercurial import rustext
>>> dir(rustext)
['GraphError', '__doc__', '__file__', '__name__', '__package__',
'ancestors']
>>> from mercurial.rustext import ancestors
>>> ancestors is rustext.ancestors
True
>>> dir(ancestors)
['AncestorsIterator', '__doc__', '__name__', '__package__']

So, in short, it's a single shared library that can hold a bunch of
modules. The submodules are themselves initialized from the Rust code.
Here's the definition of 'rustext' itself. It follows the pattern
expected by Josef.

$ tail rust/hg-cpython/src/lib.rs
mod ancestors;  // corresponds to src/ancestors.rs
mod exceptions;

py_module_initializer!(rustext, initrustext, PyInit_rustext, |py, m| {
    m.add(py, "__doc__", "Mercurial core concepts - Rust implementation")?;

    m.add(py, "ancestors", ancestors::init_module(py)?)?;
    m.add(py, "GraphError", py.get_type::<exceptions::GraphError>())?;
    Ok(())
});

(Mark confirmed to me during the sprint that adding submodules on the
fly was doable).
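
For readers less familiar with the import machinery, here is a pure-Python
analogue of a parent module that carries a submodule added on the fly. It is
only a sketch: the names `fakeext` and `fakeext.ancestors` are made up, and
nothing here is the actual Rust-side implementation.

```python
import sys
import types

# Hypothetical stand-ins for the extension module and one of its submodules.
parent = types.ModuleType('fakeext')
child = types.ModuleType('fakeext.ancestors')
child.AncestorsIterator = type('AncestorsIterator', (object,), {})

# Attach the child as a plain attribute of the parent, roughly what the
# m.add(py, "ancestors", ...) call above does from the Rust side.
parent.ancestors = child

# Make the parent importable by name, as loading the .so would.
sys.modules['fakeext'] = parent

# "from fakeext import ancestors" now resolves through the attribute.
from fakeext import ancestors
```

As far as I know, a dotted form such as `import fakeext.ancestors` would also
need the child registered in sys.modules under its full name.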

Indeed I hope the Rust compiler can do lots of optimizations in that
single shared library object.
>
> I expect "rustext" (or its upper layer) to be a shim over Rust-based modules
> and cexts. So if you do policy.importmod('parsers'), it will return
> cext.parsers, whereas policy.importmod('ancestor') will return rustext.ancestor,
> though I have no idea if there will be cext/pure.ancestor.
Yes, it's quite possible to add a new module policy this way. After all,
from mercurial.policy, it behaves in the same way as the cext package
does and the fact that we have a single shared library instead of
several ones is an implementation detail, hidden by Python's import
machinery.

But this opens another, longer term, question: currently what I have in
mercurial.rustext.ancestor has only a fragment of what
mercurial.ancestor provides. Therefore to have mercurial.policy handle
it, we'll need either to take such partial cases into account, or decide
to translate the whole Python module into Rust. For the time being, I'm
simply doing an import and catching the error to fall back to the Python
version.
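
A minimal sketch of that import-and-fall-back pattern, using stand-in module
names so that it runs anywhere: `import_with_fallback` is a hypothetical
helper, and `no_such_rust_extension` and `json` merely play the roles of the
Rust module and the Python version.

```python
import importlib


def import_with_fallback(preferred, fallback):
    """Return the preferred module, or the fallback if it cannot be imported."""
    try:
        return importlib.import_module(preferred)
    except ImportError:
        return importlib.import_module(fallback)


# The first name does not exist, so the stdlib module is returned instead.
mod = import_with_fallback('no_such_rust_extension', 'json')
```

This only covers the all-or-nothing case; a Rust module providing just a
fragment of the Python one would still need a per-name decision.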

Regards,

-- 
Georges Racinet
Anybox SAS, http://anybox.fr
Téléphone: +33 6 51 32 07 27
GPG: B59E 22AB B842 CAED 77F7 7A7F C34F A519 33AB 0A35, sur serveurs publics




Re: Rust extensions: the next step

2018-10-18 Thread Yuya Nishihara
On Thu, 18 Oct 2018 08:58:04 -0400, Josef 'Jeff' Sipek wrote:
> On Thu, Oct 18, 2018 at 12:22:16 +0200, Gregory Szorc wrote:
> ...
> > Something else we may want to consider is a single Python module exposing
> > the Rust code instead of N. Rust’s more aggressive cross function
> > compilation optimization could result in better performance if everything
> > is linked/exposed in a single shared library/module/extension. Maybe this
> > is what you are proposing? It is unclear if Rust code is linked into the
> > Python extension or loaded from a shared library.
> 
> (Warning: I suck at python, am not an expert on rust, but have more
> knowledge about ELF linking/loading/etc. than is healthy.)
> 
> Isn't there also a distinction between code layout (separate crates) and the
> actual binary that cargo/rustc builds?  IOW, the code could (and probably
> should) be nicely separated but rustc can combine all the crates' code into
> one big binary for loading into python.  Since it would see all the code, it
> can do its fancy optimizations without impacting code readability.

IIUC, it is. Perhaps, the rustext is a single binary exporting multiple
submodules?

I expect "rustext" (or its upper layer) to be a shim over Rust-based modules
and cexts. So if you do policy.importmod('parsers'), it will return
cext.parsers, whereas policy.importmod('ancestor') will return rustext.ancestor,
though I have no idea if there will be cext/pure.ancestor.


Re: Rust extensions: the next step

2018-10-18 Thread Josef 'Jeff' Sipek
On Thu, Oct 18, 2018 at 12:22:16 +0200, Gregory Szorc wrote:
...
> Something else we may want to consider is a single Python module exposing
> the Rust code instead of N. Rust’s more aggressive cross function
> compilation optimization could result in better performance if everything
> is linked/exposed in a single shared library/module/extension. Maybe this
> is what you are proposing? It is unclear if Rust code is linked into the
> Python extension or loaded from a shared library.

(Warning: I suck at python, am not an expert on rust, but have more
knowledge about ELF linking/loading/etc. than is healthy.)

Isn't there also a distinction between code layout (separate crates) and the
actual binary that cargo/rustc builds?  IOW, the code could (and probably
should) be nicely separated but rustc can combine all the crates' code into
one big binary for loading into python.  Since it would see all the code, it
can do its fancy optimizations without impacting code readability.

Jeff.

-- 
C is quirky, flawed, and an enormous success.
- Dennis M. Ritchie.


Re: Rust extensions: the next step

2018-10-18 Thread Gregory Szorc


> On Oct 17, 2018, at 18:45, Georges Racinet wrote:
> 
> Hi all,
> 
> first, many thanks for the Stockholm sprint, it was my first interaction
> with the Mercurial community, and it's been very welcoming to me.
> 
> I've been pursuing some experiments I started then to convert the Rust
> bindings I've done in the patch series about ancestry iteration (now
> landed) to a proper Python extension, using the cpython crate and Python
> capsules. In short, it works.
> 
> Early benchmarking shows that it's a few percent slower than the direct
> bindings through C code, which I think is acceptable compared to the
> other benefits (clearer integration, easier to generalise, no C code at
> all).
> 
> The end result is a single shared library importable as
> 'mercurial.rustext', which is itself made of several submodules, i.e., one
> can do:
> 
>     from mercurial.rustext.ancestor import AncestorsIterator

This all sounds very reasonable to me.

One open item is full PyPy/cffi support. Ideally we’d only write the native 
code interface once. But I think that means cffi everywhere and last I looked, 
CPython into cffi was a bit slower compared to native extensions. I’m willing 
to ignore cffi support for now (PyPy can use pure Python and rely on JIT for 
faster execution). Maybe something like milksnake can help us here? But I’m 
content with using the cpython crate to maintain a Rust-based extension: that’s 
little different from what we do today and we shouldn’t let perfect be the 
enemy of good.

Something else we may want to consider is a single Python module exposing the 
Rust code instead of N. Rust’s more aggressive cross function compilation 
optimization could result in better performance if everything is linked/exposed 
in a single shared library/module/extension. Maybe this is what you are 
proposing? It is unclear if Rust code is linked into the Python extension or 
loaded from a shared library.

> 
> It will take me some more time, though, to get that experiment into a
> reviewable state (I have to switch soon to other, unrelated work) and
> we're too close to the freeze anyway, but if someone wants to see it, I
> can share it right away.
> 
> Also, I could summarize some of these thoughts on the Oxidation wiki
> page. Greg, are you okay with that?

Yes, please update the Oxidation wiki! I’ve been meaning to update it with 
results of discussions at the sprint. I’ve just been busy trying to finish my 
patches for 4.8...

> 
> Regards,
> 
> -- 
> Georges Racinet
> Anybox SAS, http://anybox.fr
> Téléphone: +33 6 51 32 07 27
> GPG: B59E 22AB B842 CAED 77F7 7A7F C34F A519 33AB 0A35, sur serveurs publics
> 
> 