D8098: rust-nodemap: a method for full invalidation

2020-02-24 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> nodemap.rs:577
> +/// immutable part.
> +pub fn invalidate_all(&mut self) -> () {
> +self.root = Block::new();

`-> ()` is unnecessary.

> nodemap.rs:580
> +self.growable = Vec::new();
> +self.masked_inner_blocks = self.readonly.len()
> +}

End the line with a `;`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8098/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8098

To: marmoute, #hg-reviewers, kevincox
Cc: kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7796: rust-nodemap: input/output primitives

2020-02-24 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> nodemap.rs:191
>  fn new() -> Self {
> -Block([-1; 16])
> +Block([255u8; BLOCK_SIZE])
>  }

A comment why 255 is a good number would be nice.

> nodemap.rs:334
> +// Assert that `Block` hasn't been changed and has no padding
> +let _: [u8; 4 * mem::size_of::()] =
> +std::mem::transmute([Block::new(); 4]);

This should be `4 * BLOCK_SIZE` to match the code below.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7796/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7796

To: gracinet, #hg-reviewers, kevincox, durin42
Cc: yuja, Alphare, marmoute, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7797: rust-nodemap: pure Rust example

2020-02-18 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in index.rs:76
> I am not sure exactly what the `debug_assert` should be? Do you want to check 
> that the `len` is a multiple of the `align_of::`?

Yes, that seems reasonable. It shows what data you are ignoring and will alert 
us if we manage to hit that case in tests.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7797/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7797

To: gracinet, #hg-reviewers, marmoute, kevincox
Cc: Alphare, marmoute, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7796: rust-nodemap: input/output primitives

2020-02-14 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in nodemap.rs:268
> I think that would be sufficient indeed. How about making `Block` into a 
> `[u8; 64]`? It will be less expressive, but this will no longer be needed.

I like that. It makes it much more explicit that it is a backing array and 
forces us to have proper endiness hygiene.

I'm also willing to assume that `[u8; 64]` will be 64 bytes with no padding 
forever.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7796/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7796

To: gracinet, #hg-reviewers, kevincox, durin42
Cc: yuja, Alphare, marmoute, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D8110: rust-dirstatemap: cache non normal and other parent set

2020-02-14 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in dirstate_map.rs:251
> `as_mut()` only does `&mut Option -> Option<&mut T>`, which isn't always 
> what we want (however you can argue that it would save a few keystrokes), but 
> calling `unwrap()` circles back to the same `error[E0515]: cannot return 
> value referencing temporary value`.

It works fine for me: 
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=f74f3771641d1e29bdd4de1444c43324

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8110/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8110

To: marmoute, #hg-reviewers, Alphare
Cc: Alphare, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7797: rust-nodemap: pure Rust example

2020-02-14 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> Alphare wrote in index.rs:76
> I'm not sure of the point of doing so with an mmap. We have to trust what the 
> mmap is giving us.

In general an assertion is better than a comment because it expresses the 
things that you are assuming in code. I get that "it should always be" which is 
why I didn't suggest an `assert` but I think it is still worth codifying the 
assumption with a `debug_assert`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7797/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7797

To: gracinet, #hg-reviewers, marmoute, kevincox
Cc: Alphare, marmoute, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7796: rust-nodemap: input/output primitives

2020-02-14 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in nodemap.rs:268
> @kevincox @durin42: I've changed the code to use the type-level 
> `ManuallyDrop` instead of `mem::forget`, added the `transmute` to be extra 
> paranoid and added a comment.

This doesn't address the primary problem of padding. There is nothing in the 
code that guarantees that there will be no padding inserted into `Block` 
(although it is unlikely to ever happen). This is because if there is padding 
it won't be guaranteed to be initialized and reading uninitialized data is 
undefined behaviour (not to mention that it will give incorrect results). I 
would like to add something that guarantees this. The only thing I can think of 
that will give us this confidence is a check `sizeof::() == 4 * 16`. I 
was suggesting the transmute to assert this (since transpose checks the size of 
its type arguments are the same. However as I originally suggested and as 
currently written it doesn't do that.

My current proposal is change `BLOCK_SIZE` to be defined as `4 * 16` (possibly 
extracting that 16 into a `ELEMENTS_PER_BLOCK`) and keep the current transmute 
assertion. That paired with some comments explaining why we are doing that will 
make me confident that there is no padding and we aren't performing any 
undefined behaviour.

> gracinet wrote in nodemap.rs:272
> Not sure to understand what you suggest here, since I'm already using 
> `Vec::from_raw_parts` in that method. I couldn't find an `into_raw_parts`.
> Anyway, the official example 
>   has the 
> `mem::forget` before  `Vec::from_raw_parts`. Would that be sufficient?
> 
> I agree that a leak upon some exceptional-cannot-happen condition is better 
> than a double free.
> 
> Also, forgot to came back to that one in the latest phab update

Ah, is missed that into_raw_parts is nightly-only at the moment. I think the 
forget is sufficient for now. Maybe leave a TODO to use into_raw_parts once 
stabilized.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7796/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7796

To: gracinet, #hg-reviewers, kevincox, durin42
Cc: Alphare, marmoute, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D8110: rust-dirstatemap: cache non normal and other parent set

2020-02-14 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in dirstate_map.rs:251
> Unwrapping creates a temporary value that is dropped instantly, so unless I'm 
> not seeing something, I don't think we can.

You should be able to use `.as_mut()` just like you are currently doing in the 
caller.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8110/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8110

To: marmoute, #hg-reviewers, Alphare
Cc: Alphare, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D8110: rust-dirstatemap: cache non normal and other parent set

2020-02-13 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> dirstate_map.rs:38
> +non_normal_set: Option>,
> +other_parent_set: Option>,
>  parents: Option,

I don't understand why an `Option` is faster than a `HashSet`. Could 
you add some explanation to the commit message? Is this to avoid attempting to 
initialize the entry multiple times?

> dirstate_map.rs:251
> +self.set_non_normal_other_parent_entries(false);
> +(&mut self.non_normal_set, &mut self.other_parent_set)
> +}

If we have just set the fields to Some(..) in the previous line can't we do the 
unwrap here where it is obviously correct?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8110/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8110

To: marmoute, #hg-reviewers
Cc: kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7929: rust-status: add bare `hg status` support in hg-core

2020-02-12 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in status.rs:456
> `traverse` returns some `Cow::Owned`, I felt that it was simpler to share the 
> same return signature for all functions, but I can map the output, I guess.

I can definitely see that this is a judgement call. However 
`.map(Cow::Borrowed)` is easy enough to write that I think it makes sense in 
this case. However both ways are acceptable.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7929/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7929

To: Alphare, #hg-reviewers, kevincox
Cc: marmoute, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7927: rust-status: add util for listing a directory

2020-02-12 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in status.rs:76
> I didn't realize that `sort_unstable_by_key` needed to return a `T` and not a 
> `&T`, that forces a `clone()`. I'm not sure that matters, we'll go with the 
> shortest code and see if that shows up in profiling later.

`sort_unstable_by_key` doesn't require the function to return a T at all. It 
just needs to return an `Ord` type.  Note that `&Ord` is `Ord` automatically.

https://doc.rust-lang.org/std/primitive.slice.html#method.sort_unstable_by_key

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7927/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7927

To: Alphare, #hg-reviewers, marmoute, kevincox
Cc: marmoute, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7797: rust-nodemap: pure Rust example

2020-02-12 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> index.rs:6
> +
> +/// Minimal `RevlogIndex`, readable from standard Mercurial file format
> +use hg::*;

Since this is a file-level comment it should use `//!` 
https://doc.rust-lang.org/reference/comments.html#examples

> index.rs:47
> +if i >= self.len() {
> +None
> +} else {

I would make this an early return and remove the else. It seems to me like an 
unexpected condition. If it should never happen then please add a debug_assert 
as well.

> index.rs:76
> +let ptr = self.0.as_ptr() as *const IndexEntry;
> +// Any misaligned data will be ignored.
> +unsafe { slice::from_raw_parts(ptr, self.0.len() / INDEX_ENTRY_SIZE) 
> }

Could you add at least a debug_assert that the alignment is correct?

> index.rs:84
> +let file = File::open(path).unwrap();
> +let mmap = unsafe { MmapOptions::new().map(&file).unwrap() };
> +Self {

Please put the `unsafe` block around only the unsafe operation. This makes it 
more obvious what I should look at.

  let mmap = unsafe { MmapOptions::new().map(&file) }.unwrap();

> index.rs:84
> +let file = File::open(path).unwrap();
> +let mmap = unsafe { MmapOptions::new().map(&file).unwrap() };
> +Self {

It seems like we should handle this error. At the very least we should be 
providing context such as "your index file {:?} is missing".

> main.rs:8
> +extern crate hg;
> +extern crate memmap;
> +

We should be using Rust 2018 so externs aren't required.

> main.rs:41
> +fn create(index: &Index, path: &Path) -> io::Result<()> {
> +let mut file = File::create(path)?;
> +let start = Instant::now();

Do you create this at the start to avoid the work if you don't have permission? 
If not it seems to be that it would be better to avoid creating the file if 
creating the nodemap fails. Otherwise we are leaving an empty file around.  If 
so please document your reasoning.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7797/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7797

To: gracinet, #hg-reviewers, marmoute, kevincox
Cc: marmoute, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7929: rust-status: add bare `hg status` support in hg-core

2020-02-12 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> status.rs:201
>  root_dir: impl AsRef + Sync + Send,
> +work: mpsc::Sender<&'a HgPath>,
>  options: StatusOptions,

Please describe this parameter.

> status.rs:241
> +// The channel always outlives the sender, unwrap
> +work.send(normalized).unwrap()
> +} else {

This seems like something that should be handled in the caller. It is trivial 
for the caller to separate files from directories if they wish to do so.

> status.rs:275
> +.filter(|s| s.is_some())
> +.map(|s| s.unwrap())
>  }

I believe these two can be replaced with `.flatten()`

> status.rs:456
>  options: StatusOptions,
> -) -> impl ParallelIterator> {
> +) -> impl ParallelIterator, Dispatch)>> {
>  dmap.par_iter().map(move |(filename, entry)| {

Why was this changed when IIUC all of the returns are references? If a caller 
wants that I would prefer to do a map in the caller.

> status.rs:617
> +// Step 1: check the files explicitly mentioned by the user
> +let (tx, rx) = mpsc::channel();
> +results.par_extend(walk_explicit(files, &dmap, root_dir, tx, options));

tx and rx are bad names, please indicate what is being sent and recieved.

> status.rs:618
> +let (tx, rx) = mpsc::channel();
> +results.par_extend(walk_explicit(files, &dmap, root_dir, tx, options));
> +while let Ok(dir) = rx.recv() {

Instead of creating a vec then extending use collect.

> status.rs:619
> +results.par_extend(walk_explicit(files, &dmap, root_dir, tx, options));
> +while let Ok(dir) = rx.recv() {
> +if options.list_ignored || options.list_unknown && 
> !dir_ignore_fn(dir)

You are first collecting all of `results`, then collecting all of `work`. It 
looks like you intended to do this in parallel but I don't think that is what 
is happening.

A better approach is to make walk_explicit return both items and partition them 
here.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7929/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7929

To: Alphare, #hg-reviewers, kevincox
Cc: marmoute, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7927: rust-status: add util for listing a directory

2020-02-12 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> status.rs:76
> +
> +results.sort_by(|a, b| a.0.cmp(&b.0));
> +Ok(results)

It would be more clear to do `.sort_by_key(|e| e.0)`

And I don't think stability matters here so you could use 
`sort_unstable_by_key`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7927/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7927

To: Alphare, #hg-reviewers, marmoute, kevincox
Cc: marmoute, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D8048: rust-dirstatemap: cache non normal and other parent set

2020-02-12 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> dirstate_map.rs:251
> +self.set_non_normal_other_parent_entries(false);
> +(&mut self.non_normal_set, &mut self.other_parent_set)
> +}

If we have just set the fields to `Some(..)` in the previous line can't we do 
the unwrap here where it is obviously correct?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8048/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8048

To: Alphare, #hg-reviewers, marmoute
Cc: marmoute, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7930: rust-status: update rust-cpython bridge to account for the changes in core

2020-02-10 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> status.rs:76
> +BadType::Unknown => "unknown",
> +}
> +)

This should probably be a method on BadType.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7930/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7930

To: Alphare, #hg-reviewers, kevincox
Cc: kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D8086: rust-status: refactor options into a `StatusOptions` struct

2020-02-10 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> status.rs:208
> +pub list_clean: bool,
> +}
> +

These names aren't very meaningful to me but maybe that is because I'm not 
super familiar with the domain.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8086/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8086

To: Alphare, #hg-reviewers
Cc: kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D8088: rust-status: add missing variants to `Dispatch` enum

2020-02-10 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> status.rs:288
>  /* TODO ignored
>   * TODO unknown */
>  }

TODO is done.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8088/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8088

To: Alphare, #hg-reviewers, kevincox
Cc: kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7924: rust-matchers: add `build_regex_match` function

2020-02-10 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> matchers.rs:248
> +/// Returns a function that matches an `HgPath` against the regex formed by
> +/// the given patterns.
> +fn build_regex_match<'a>(

Can you describe the return value here? It seems odd that we return the regex 
and a matcher. It would be good to point that out.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7924/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7924

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7923: rust-matchers: add functions to get roots, dirs and parents from patterns

2020-02-10 Thread kevincox (Kevin Cox)
kevincox added a comment.


  Thanks for the heads up. I'll try to dedup them.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7923/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7923

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7921: rust-dirs-multiset: improve temporary error message

2020-01-24 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> dirs_multiset.rs:86
> +subpath.as_bytes(),
> +);
> +let second_slash_index = offset.unwrap() + subpath.len() - 1;

It appears to me that `find_dirs` always returns prefixes. Why is this offset 
required?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7921/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7921

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7923: rust-matchers: add functions to get roots, dirs and parents from patterns

2020-01-17 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> matchers.rs:250
> +/// This calculates the roots and directories exactly matching the patterns 
> and
> +/// returns a tuple of (roots, dirs) for each. It does not return other
> +/// directories which may also need to be considered, like the parent

I'm confused by the "for each" since this only returns one tuple.

> matchers.rs:311
> +ignore_patterns: &[IgnorePattern],
> +) -> PatternResult<(HashSet, HashSet, 
> HashSet)>
> +{

Please describe the return value.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7923/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7923

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7908: rust-filepatterns: improve API and robustness for pattern files parsing

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in filepatterns.rs:237
> I might be using godbolt wrong but it looks like you are using different 
> versions of rustc. When using 1.40 for both sides, I don't see any clear win 
> on either side... but I'm not fluent in assembly either. As I'm sure you'll 
> agree, this function is unlikely to show up in future profiles anyway.

Oops. My mistake. But I think we agree that readability is the top priority 
(until profiles say otherwise)

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7908/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7908

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7909: rust-filepatterns: add support for `ignore` and `subignore` patterns

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in filepatterns.rs:462
> The `match` arms would be incompatible. Or maybe I'm missing something.

You would have to make sure you return a Vec on both arms. You will probably 
want `inner_pats.into_iter().collect()` on one branch and `vec![entry]` on the 
other branch.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7909/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7909

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7908: rust-filepatterns: improve API and robustness for pattern files parsing

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> filepatterns.rs:227
> +return acc;
> +}
> +if component != b".."

Can you move this condition into a `.filter()`?

> Alphare wrote in filepatterns.rs:237
> I'm not sure I agree. On the basis that "Rust might optimize iterators better 
> than for loops" I vote to keep it that way.

If you think it' is clearer that's fine. But I'm curious where "Rust might 
optimize iterators better than for loops" comes from. I did a quick test and it 
looks like the for loop actually generates slightly better code 
https://rust.godbolt.org/z/vFAWyZ

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7908/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7908

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7910: rust-re2: add wrapper for calling Re2 from Rust

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added a comment.


  While I think the rust regex crate is heavily inspired by RE2 and uses a 
fairly compatible syntax I don't think there is a goal to be 100% compatible 
and I would be surprised if there weren't any differences. What I am trying to 
say is that if we are going to change engines we may as well change to `regex`. 
If we are using `RE2` for backwards compatibility then this change makes sense.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7910/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7910

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mjpieters, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7867: rust-hg-path: add useful methods to `HgPath`

2020-01-16 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> hg_path.rs:169
> +&mut self.inner
> +}
>  pub fn contains(&self, other: u8) -> bool {

I don't think we should have this method. It provides us no way to ensure that 
the invariants of this type are upheld. It is slightly more typing but I think 
we should recommend converting to a slice, doing the mutation, then converting 
back to `HgPath` then we can at least to validation (possibly only in debug 
builds) in the constructor.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7867/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7867

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7910: rust-re2: add wrapper for calling Re2 from Rust

2020-01-16 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> rust_re2.cpp:22
> +
> + return new RE2(StringPiece(data, len), o);
> + }

This is never freed. Should we add a `Drop` implementation in rust?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7910/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7910

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mjpieters, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7910: rust-re2: add wrapper for calling Re2 from Rust

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added a comment.
kevincox accepted this revision.


  Does mercurial currently use RE2? If not then can you explain why we don't 
just use the rust `regex` crate?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7910/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7910

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mjpieters, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7796: rust-nodemap: input/output primitives

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added a comment.


  Maybe what we should do is keep the test I proposed, but change the 
definition of `BLOCK_SIZE` to be based upon the number of contained `i32` 
instead of `size_of`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7796/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7796

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7796: rust-nodemap: input/output primitives

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added a comment.


  Ah yes. So my test would be ineffective.
  
  The problem that IIUC padding is not guaranteed to be initialized, and in 
rust all values need to be initialized. So if you cast a type with padding to a 
`u8` it is undefined behaviour.
  
  So we would need a test that shows that the type is the same size as the sum 
of the contained `i32`s.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7796/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7796

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7909: rust-filepatterns: add support for `ignore` and `subignore` patterns

2020-01-16 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> filepatterns.rs:456
> +.into_iter()
> +.filter_map(|entry| -> Option> {
> +let IgnorePattern {

I don't see anywhere where this returns `None`. Should this just be `.map()` or 
`.flat_map()`?

> filepatterns.rs:462
> +} = &entry;
> +let mut res = vec![];
> +match syntax {

I would write this as `Some(Ok(match syntax { .. }))` as the last expression of 
this map. Then instead of `res.extend()` and `res.push()` you just return the 
vecs.

> filepatterns.rs:525
> +p
> +})
> +})?,

How about:

  if !p.is_empty() {
p.push(b'/');
  }
  Ok(p)

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7909/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7909

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7908: rust-filepatterns: improve API and robustness for pattern files parsing

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> filepatterns.rs:220
> +initial_slashes = 2;
> +}
> +let components =

It might be clearer to do `bytes.iter().take_while(|b| b == sep).count()`

> filepatterns.rs:237
> +acc
> +});
> +let mut new_bytes = components.join(&sep);

Personally I think this would be more clear as a `for` loop.

> filepatterns.rs:250
> +}
> +}
> +

Should this be part of HgPathBuf  or is this normalization specific to the file 
pattern matching in some way?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7908/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7908

To: Alphare, #hg-reviewers
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7869: rust-dirs-multiset: add `DirsChildrenMultiset`

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> hg_path.rs:197
> +/// ```
> +pub fn split_at_suffix(&self) -> (&Self, &Self) {
> +match &self.inner.iter().rposition(|c| *c == b'/') {

"suffix" isn't clear to me. Maybe `split_dir_from_file`? That's a bit wordy but 
I feel that it gets the point across. Or `split_file_name`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7869/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7869

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7870: rust-utils: add `Escaped` trait

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added a comment.
kevincox accepted this revision.


  I think it's better. Maybe still a little vauge but I'll let subject experts 
make that decision.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7870/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7870

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7819: rust-nodemap: core implementation for shortest

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> nodemap.rs:337
>  /// Main working method for `NodeTree` searches
>  fn lookup<'p>(
>  &self,

Can you describe the return value in the doc comment.

> nodemap.rs:575
> +
> +fn shortest_bin<'a>(
> +&self,

How about `unique_bin_prefix_len`?

> nodemap.rs:766
>  
> +fn shortest_hex(
> +&self,

`unique_bin_prefix_len`?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7819/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7819

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7796: rust-nodemap: input/output primitives

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> kevincox wrote in nodemap.rs:268
> This makes me uncomfortable. There is really no guarantee that there is no 
> padding around `Block` or `RawElement`. If there is any padding this is UB. I 
> would much prefer one of:
> 
> 1. Store growable as a set of bytes and convert on get/set.
> 2. Add a method that outputs the bytes to any `Write`. In theory this will be 
> slower but it is probably immaterial (especially if there is no padding).

I thought about this more and I think I am okay doing it this way. It seems 
like this should be well defined as long as there is no padding. However on 
that note I would want to add a check that there is no padding as expected. I 
would also like to ensure that this fails to compile if there is ever padding, 
even in release mode. I think this can be accomplished by something like:

  let _: [u8; 4 * BLOCK_SIZE] = std::mem::transmute([Block::new(); 4]);

I would probably want to repeat this check near any code that is relying on 
this invariant.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7796/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7796

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7796: rust-nodemap: input/output primitives

2020-01-16 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> nodemap.rs:268
> +Vec::from_raw_parts(
> +vec.as_ptr() as *mut u8,
> +vec.len() * BLOCK_SIZE,

This makes me uncomfortable. There is really no guarantee that there is no 
padding around `Block` or `RawElement`. If there is any padding this is UB. I 
would much prefer one of:

1. Store growable as a set of bytes and convert on get/set.
2. Add a method that outputs the bytes to any `Write`. In theory this will be 
slower but it is probably immaterial (especially if there is no padding).

> nodemap.rs:272
> +)
> +};
> +mem::forget(vec);

If someone panics between the `from_raw_parts` and `mem::forget` this is a 
double free. Right now I think this can't happen but it is completely possible 
that the code changes between then and now. I would prefer using 
`Vec::from_raw_parts` to make sure that there is only one `Vec` alive with that 
pointer at a time. This risks a leak in the face of panic but I think that is 
much better.

> nodemap.rs:433
> +amount: usize,
> +) -> Self {
> +assert!(buffer.len() >= offset + amount);

This is a weird API. Why does new take a buffer and an offset? Can it take 
either a slice or just remove the offset parameter and always have the buffer 
start at 0? It is very strange to be passing in data that we own but isn't ever 
used.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7796/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7796

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7795: rust-nodemap: insert method

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> nodemap.rs:268
> +/// itself because of the mutable borrow taken with the returned `Block`
> +fn mutable_block(&mut self, idx: usize) -> (usize, &mut Block, usize) {
> +let ro_blocks = &self.readonly;

It looks like this return value could use some abstraction but maybe its best 
to wait until there are more users?

> nodemap.rs:273
> +if idx < ro_len {
> +// TODO OPTIM I think this makes two copies
> +self.growable.push(ro_blocks[idx].clone());

I don't quite understand. You have the immutable copy and the mutable copy but 
that is WAI right?

> nodemap.rs:306
> +// visit_steps cannot be empty, since we always visit the root block
> +let (deepest_idx, mut nybble, rev_opt) = visit_steps.pop().unwrap();
> +let (mut block_idx, mut block, mut glen) =

nybble is very vague. Is this deepest_nibble or something?

> nodemap.rs:504
> +node_from_hex(&format!("{:0<40}", hex)).unwrap()
> +}
> +

Should these be `#[cfg(test)]`? We can always remove it later if there is a 
valid production use but it makes new users think twice.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7795/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7795

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7788: rust-node: binary Node and conversion utilities

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> node.rs:14
> +/// Binary revisions SHA
> +pub type Node = [u8; 20];
> +

I would recommend making this a proper type because almost no one should be 
poking at the bytes of the hash.

  struct Node([u8; 20]);

You can make the field `pub` if necessary.

> node.rs:31
> +for i in 0..20 {
> +node[i] = u8::from_str_radix(&hex[i * 2..i * 2 + 2], 16)?
> +}

I find progressing through the string easier to understand than this slicing. 
WDYT.

  for byte in node.iter_mut() {
  *byte = u8::from_str_radix(&hex[..2], 16)?;
  hex = &hex[2..];
  }

> node.rs:39
> +as_vec.join("")
> +}
> +

If we want to avoid a number of allocations we can do:

  pub fn node_to_hex(n: &Node) -> String {
  let mut r = String::with_capacity(n.len() * 2);
  for byte in n {
  write!(r, "{:02x}", byte).unwrap();
  }
  debug_assert_eq!(r.len(), n.len() * 2);
  r
  }

The generated code for `write!()` doesn't look great but if hex printing shows 
up in benchmarks it would be trivial to write a custom hex-formatter.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7788/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7788

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7794: rust-nodemap: generic NodeTreeVisitor

2020-01-16 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> nodemap.rs:254
> +impl<'n, 'p> Iterator for NodeTreeVisitor<'n, 'p> {
> +type Item = (bool, usize, u8, Option);
> +

I would prefer making this a struct so that you can name the fields. You can 
still pattern match the fields if you want to.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7794/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7794

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7793: rust-nodemap: mutable NodeTree data structure

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> nodemap.rs:158
>  readonly: Box + Send>,
> +growable: Vec,
> +root: Block,

This strikes me as a weird name. The fact that it is an adjective rather than a 
noun is a hint. Can you rename to answer "Growable what?"

> nodemap.rs:249
> +readonly, self.growable, self.root
> +)
>  }

I would use 
https://doc.rust-lang.org/std/fmt/struct.Formatter.html#method.debug_struct for 
consistency unless you really want to avoid printing the struct name.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7793/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7793

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7792: rust-nodemap: abstracting the indexing

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> nodemap.rs:192
>  ) -> Result, NodeMapError> {
> -let blocks: &[Block] = &*self.readonly;
> -if blocks.is_empty() {
> +let len = self.len();
> +if len == 0 {

I don't think this variable helps readability. I would just repeat `self.len()`.

> nodemap.rs:193
> +let len = self.len();
> +if len == 0 {
>  return Ok(None);

I would add a `self.is_empty()` helper. It's good practice for anything that 
has a `.len()`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7792/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7792

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7791: rust-nodemap: NodeMap trait with simplest implementor

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> nodemap.rs:37
> +/// whose data should not be owned by the `NodeMap`.
> +pub trait NodeMap {
> +fn find_bin<'a>(

Can you please add doc-comments for this? I find that documenting trait methods 
is especially important.

> nodemap.rs:205
> +impl fmt::Debug for NodeTree {
> +fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
> +let blocks: &[Block] = &*self.readonly;

I don't think the `<'_>` is necessary.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7791/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7791

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7864: rust-utils: add Rust implementation of Python's "os.path.splitdrive"

2020-01-16 Thread kevincox (Kevin Cox)
kevincox added a comment.
kevincox accepted this revision.


  One last note is why not just put this in HgPath?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7864/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7864

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7864: rust-utils: add Rust implementation of Python's "os.path.splitdrive"

2020-01-15 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in files.rs:109
> Indeed, this is much better. While trying to adapt this code to fit with 
> `HgPath`, I find myself needing to translate to and from bytes whenever 
> indexing or when using `split_at`. Should we give a `HgPath` a `split_at` 
> method or also all the `Index<>` ones? I remember that we decided against 
> that earlier.

I would recommend just converting to bytes at the top of the function then 
converting the return value to a path at the exit. I feel when you are doing 
manipulation like this it makes the most sense to treat it as plain bytes 
within the function.

Alternatively I wouldn't mind putting an index operator but have a slight 
preference for `path.as_bytes()[n]` to keep it explicit as most of the code 
shouldn't be reaching into paths.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7864/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7864

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7866: rust-pathauditor: add Rust implementation of the `pathauditor`

2020-01-15 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> path_auditor.rs:53
> +.as_bytes()
> +.split(|c| *c as char == std::path::MAIN_SEPARATOR)
> +.collect();

Should this be `std::path::is_separator(*c as char)`?.

If not please add a comment explaining why.

> path_auditor.rs:54
> +.split(|c| *c as char == std::path::MAIN_SEPARATOR)
> +.collect();
> +if !split_drive(path).0.is_empty()

It would be nice to have this in a helper function in path to get a component 
iterator.

> path_auditor.rs:72
> +let last = split.next().unwrap();
> +if last.iter().all(|b| (*b as char).is_digit(10))
> +&& [&b"HG"[..], &b"HG8B6C"[..]].contains(&first.as_ref())

You can just use 
https://doc.rust-lang.org/std/primitive.u8.html#method.is_ascii_digit

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7866/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7866

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7870: rust-utils: add `PrettyPrint` trait

2020-01-15 Thread kevincox (Kevin Cox)
kevincox added a comment.
kevincox accepted this revision.


  I'm not convinced PrettyPrint is the best name. I might call it something 
more to do with escaping instead of "pretty". However I can't think of anything 
great.

INLINE COMMENTS

> utils.rs:146
> +acc.push(HEX_DIGITS[((*c & 0xf0) >> 4) as usize]);
> +acc.push(HEX_DIGITS[(*c & 0xf) as usize]);
> +}

write!(acc, "\\x{:x}", self).unwrap();

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7870/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7870

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7869: rust-dirs-multiset: add `DirsChildrenMultiset`

2020-01-15 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> dirs_multiset.rs:170
> +Some(i) => i.contains(&directory),
> +} {
> +continue;

I would put this check into a helper function.

> files.rs:86
> +)),
> +},
> +None => None,

This should probably be a helper on `HgPath`? It would be much easier to 
understand what it is doing if it had a name.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7869/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7869

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7867: rust-hg-path: add useful methods to `HgPath`

2020-01-15 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> hg_path.rs:189
> +&self.inner[..]
> +};
> +HgPath::new(match inner.iter().rposition(|b| *b == b'/') {

It would be nice to have a `trim_trailing_slash` helper.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7867/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7867

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7864: rust-utils: add Rust implementation of Python's "os.path.splitdrive"

2020-01-15 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> files.rs:109
> +/// Paths cannot contain both a drive letter and a UNC path.
> +pub fn split_drive(path: impl AsRef) -> (HgPathBuf, HgPathBuf) {
> +let path = path.as_ref();

I think this can be simplified. See 
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=78136ead96596afe6305e7542a881ca4.
 I had to use &[u8] to avoid having to copy all of the HgPath stuff into the 
playground, but the code should be easy to integrate.

Notable changes:

- Avoid creating `norm_bytes` vector (and allocation) by creating an is_sep 
function.
  - Note, we can probably use std::path::is_separator 

- Return references to the input since we guarantee that the output will be 
parts of the input. The caller can always clone.
- Use slice::split_at to make it obvious that we are returning the entire path 
just split.
- Use pattern matching rather than unwrapping.
- Use fall-through returns to make it obvious we handle every path.

> files.rs:230
> +);
> +}
> +

We should add an example of a UNC and Drive path here to ensure that they don't 
get split.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7864/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7864

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7787: rust-nodemap: building blocks for nodetree structures

2020-01-15 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> nodemap.rs:92
> +
> +fn read(&self, nybble: u8) -> Element {
> +Element::from(RawElement::from_be(self.0[nybble as usize]))

I would call these `get` and `set`.

> nodemap.rs:111
> +}
> +write!(f, "[{}]", inner.join(", "))
> +}

You can use this helper: 
https://doc.rust-lang.org/std/fmt/struct.Formatter.html#method.debug_map

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7787/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7787

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7604: rust-hg-path: implement more readable custom Debug for HgPath{, Buf}

2019-12-11 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> hg_path.rs:186
> +fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
> +write!(f, "HgPath({})", String::from_utf8_lossy(&self.inner))
> +}

I would recommend `"HgPath({:?})"` so that the result is unambiguous and nicely 
escaped.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7604/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7604

To: martinvonz, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7577: hg-core: implement Mercurial's config file discovery logic

2019-12-09 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> config.rs:145
> +let paths = glob_res
> +.filter_map(|x| if let Ok(x) = x { Some(x) } else { None })
> +.sorted();

I believe this can just be `.flatten()`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7577/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7577

To: indygreg, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7529: rust-dirstate-status: add `walk_explicit` implementation, use `Matcher` trait

2019-12-02 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> status.rs:125
> +/// TODO subrepos
> +fn walk_explicit<'a: 'c, 'b: 'c, 'c>(
> +files: &'a HashSet<&HgPath>,

It seems to me that you can just use one lifetime here since the two input 
limit the output.

https://rust.godbolt.org/z/57bf6T But maybe I over-simplified the problem.

> status.rs:168
> +return Some(Ok((normalized, Dispatch::Removed)));
> +}
> +}

Isn't this the same as the first case? Can we just remove the first case?

> status.rs:244
> +results: impl IntoIterator>,
> +) -> IoResult<(Vec<&'a HgPath>, StatusResult<'a>)> {
>  let mut lookup = vec![];

It surprises me this annotation is required? I thought this would be the 
assumed meaning.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7529/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7529

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7526: rust-hg-path: add method to get part of a path relative to a prefix

2019-12-02 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> hg_path.rs:131
> +let base = base.as_ref();
> +if base.len() == 0 {
> +return Some(self);

`base.is_empty()`? If we don't have that helper we should probably add it.

> hg_path.rs:131
> +let base = base.as_ref();
> +if base.len() == 0 {
> +return Some(self);

The comment says that `base` must end with a `/` however you appear to handle 
cases where it doesn't Can you either update the documentation or the code?

> hg_path.rs:134
> +}
> +let is_dir = base.len() > 0 && base.as_bytes().last() == Some(&b'/');
> +if is_dir && self.starts_with(base) {

Instead of using `.last()` can we add and use `base.ends_with(b'/')`?

> hg_path.rs:134
> +}
> +let is_dir = base.len() > 0 && base.as_bytes().last() == Some(&b'/');
> +if is_dir && self.starts_with(base) {

`base.len() > 0` is redundant both with the early return above and because 
`last()` will return `None` if `base` is empty.

> hg_path.rs:215
> +fn eq(&self, other: &HgPath) -> bool {
> +self.inner == other.as_bytes()
> +}

I would do `self.as_bytes()` for consistency.

Or even better I believe you can `#[derive(Eq,PartialEq)]`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7526/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7526

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7525: rust-matchers: improve `Matcher` trait ergonomics

2019-12-02 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> matchers.rs:87
> +None
>  }
>  fn exact_match(&self, _filename: impl AsRef) -> bool {

Another option here would just to have a global empty hash set.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7525/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7525

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7523: rust-hg-path: implement `Display` for `HgPath` and `HgPathBuf`

2019-12-02 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> hg_path.rs:168
> +fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
> +write!(f, "HgPath {{ {} }}", String::from_utf8_lossy(&self.inner))
> +}

This looks more like a `Debug` format than a `Display` format. For `Display I 
would just do `write!(f, "{}", String::from_utf8_lossy(&self.inner))`

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7523/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7523

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7300: rust-status: refactor dispatch case for normal files

2019-11-07 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> status.rs:31
>  
>  #[inline]
> +/// Dates and times that are outside the 31-bit signed range are compared

I see a lot of these and in general I would recommend against it. The compiler 
is smart. Unless profiling shows an advantage in a specific case I would leave 
them out.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7300/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7300

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7299: rust-status: return a ParallelIterator instead of a Vec from stat_dmap_entries

2019-11-07 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> status.rs:128
> +Err(e) => return Err(e.into()),
> +};
>  

This can be replaced with:

  let meta = 
root_dir.as_ref().join(hg_path_to_path_buf(filename)?).symlink_metadata();

Although I would probably also put `hg_path_to_path_buf(filename)?` into a 
variable for readability.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7299/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7299

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7254: rust-status: improve status performance

2019-11-06 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> status.rs:41
> +last_normal_time: i64,
> +) -> (HgPathBuf, Dispatch) {
> +let filename = filename.as_ref();

Every exit path is just `filename.to_owned()`. It would be better to just let 
the caller do that if required and return only the `Dispatch`.

> status.rs:64
> +// of detecting changes. (issue2608)
> +let range_mask = 0x7fff;
> +

I would create a helper function.

  fn mod_compare(a: i32, b: impl Into) -> bool {
let b  = b.into();
a != b && a != b & range_mask
  }

Also I think the condition is easier to read as:

  a & i32::max_value() != b & i32::max_value()

> status.rs:71
> +if size >= 0
> +&& (size_changed || mode_changed)
> +|| size == -2  // other parent

I would break this out into another local.

  let metadata_changed = size >= 0 && (size_changed || mode_changed);
  let other_parent = size == -2;
  if metadata_change || other_parent || copy_map.contains_key(filename)

As a bonus this also makes the &&/|| precedence obvious.

> status.rs:72
> +&& (size_changed || mode_changed)
> +|| size == -2  // other parent
> +|| copy_map.contains_key(filename)

Can we put this magic number in a constant anywhere?

> status.rs:105
> +state: EntryState,
> +) -> (HgPathBuf, Dispatch) {
> +let filename = filename.as_ref();

Same, we don't need to return the path.

> status.rs:127
> +last_normal_time: i64,
> +) -> std::io::Result> {
>  dmap.par_iter()

It would be best to return the iterator here, this way we don't need to collect 
into a `Vec` just to throw it away.

You can do the final sorting using a parallel `for_each`. However even if we 
end up doing a collection into Vec we can do that at the place of use rather 
than here which forces it upon all callers.

> status.rs:133
> +|(filename, entry)| -> Option<
> +std::io::Result<(HgPathBuf, Dispatch)>
>  > {

Unless I am missing something this function always returns `Some(_)`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7254/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7254

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7178: [RFC] rust-matchers: add `Matcher` trait and implement `AlwaysMatcher`

2019-10-30 Thread kevincox (Kevin Cox)
kevincox added a comment.


  I see a lot of the functions are here to give optimization hints. In order to 
make someone non-familiar with the code able to understand it each method 
should state the contracts that it is making. I am having a really hard time 
reconciling how the different functions interact and which methods have 
precedence over each other.

INLINE COMMENTS

> matchers.rs:28
> +fn exact_match(&self, _filename: impl AsRef) -> bool;
> +fn file_set(&self) -> Option> {
> +None

Please document.

> matchers.rs:31
> +}
> +fn roots(&self) -> Option> {
> +None

Please document.

> martinvonz wrote in matchers.rs:104
> Should this be `is_always()` per Rust naming conventions?

always isn't an adjective so I don't think that really makes sense. How about 
`matches_everything`.

> matchers.rs:104
> +/// optimization might be possible.
> +fn always(&self) -> bool {
> +false

It is probably worth noting that false-negatives are fine but false-positives 
will lead to incorrect behaviour.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7178/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7178

To: Alphare, #hg-reviewers
Cc: martinvonz, spectral, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7116: rust-performance: introduce FastHashMap type alias for HashMap

2019-10-19 Thread kevincox (Kevin Cox)
kevincox added a comment.


  In D7116#104769 , @Alphare wrote:
  
  > The following comparison shows that for input > 20 bytes, fnv has worse 
overall performance than xx
  
  Sounds good. We can always do benchmarks in the future and swap it.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7116/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7116

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7116: rust-performance: introduce FastHashMap type alias for HashMap

2019-10-16 Thread kevincox (Kevin Cox)
kevincox added a comment.
kevincox accepted this revision.


  I've seen very good results with https://github.com/servo/rust-fnv in the 
past so it is probably worth including that in the comparison and possibly 
using it. It is especially good for small keys which seems like a common case 
in hg.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7116/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7116

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7116: rust-performance: introduce FastHashMap type alias for HashMap

2019-10-16 Thread kevincox (Kevin Cox)
kevincox added a comment.


  In D7116#104617 , @durin42 wrote:
  
  > OOC, have you compared this with the hashbrown crate for perf?
  
  std::collections::HashMap now uses hashbrown 
https://blog.rust-lang.org/2019/07/04/Rust-1.36.0.html

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7116/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7116

To: Alphare, #hg-reviewers
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7058: rust-dirstate-status: add first Rust implementation of `dirstate.status`

2019-10-15 Thread kevincox (Kevin Cox)
kevincox added a comment.
kevincox accepted this revision.


  Thanks, looks good.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7058/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7058

To: Alphare, #hg-reviewers, kevincox
Cc: yuja, martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7058: rust-dirstate-status: add first Rust implementation of `dirstate.status`

2019-10-15 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added inline comments.
kevincox requested changes to this revision.

INLINE COMMENTS

> Alphare wrote in status.rs:83
> Collecting into a `Result` prevents us from doing a `filter_map` during 
> `collect`. Implementing a `ParallelIterator` adapter that would do a 
> `filter_map` with `Result` handling is possible, but would require more work.
> 
> In this specific case, as I eluded to above, I just removed the sentinel 
> value so the question is not relevant anymore.

I don't follow. can't this `.filter_map(...)` be replaced with a 
`.filter(...).map(...)`? I think this would make the code much more clear.

> status.rs:23
> +fn walk_explicit(
> +files: Vec,
> +dmap: &mut DirstateMap,

This doesn't appear to be consumed by the function so you should take a 
`&[HgPathBuf]`. Or since I don't think you use the path either probably a 
`&[HgPath]`.

> status.rs:134
> +results: HashMap>,
> +) -> ((Vec, StatusResult)) {
> +let mut lookup = vec![];

Why are there two pairs of parens?

> status.rs:168
> +_ => {}
> +},
> +Some(HgMetadata {

This seems like a very complicated way to write this. Why not separate the 
`match state` and do it first. Or put it in an `else` block? Furthermore why do 
you have two separate `match state`? Why not handle them all in one `match`?

> status.rs:240
> +let mut results =
> +walk_explicit(files, &mut dmap, root_dir.as_ref().to_owned())?;
> +

Do you need to do this to `root_dir`? The `walk_explicit` function seems to 
only need `AsRef` so you should be able to just pass it. You might need 
to update the argument to be a reference.

> files.rs:112
> +// TODO support other platforms
> +unimplemented!()
> +}

Isn't it better to avoid implementing this so that we get a compile error 
rather than a runtime error?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7058/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7058

To: Alphare, #hg-reviewers, kevincox
Cc: yuja, martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6773: rust-hgpath: add HgPath and HgPathBuf structs to encapsulate handling of paths

2019-09-13 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> hg_path.rs:175
> +pub fn push(&mut self, byte: u8) {
> +self.inner.push(byte);
> +}

debug_assert check after the mutation.

> hg_path.rs:242
> +fn extend>(&mut self, iter: T) {
> +self.inner.extend(iter);
> +}

debug_assert check after the mutation.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6773/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6773

To: Alphare, #hg-reviewers, kevincox
Cc: mjpieters, yuja, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6774: rust-hgpath: replace all paths and filenames with HgPath/HgPathBuf

2019-09-12 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> files.rs:75
> +/// TODO more than ASCII?
> +pub fn normalize_case(path: &HgPath) -> HgPathBuf {
> +#[cfg(windows)] // NTFS compares via upper()

Is this used? Can you either use, delete or add tests.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6774/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6774

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6773: rust-hgpath: add HgPath and HgPathBuf structs to encapsulate handling of paths

2019-08-30 Thread kevincox (Kevin Cox)
This revision now requires changes to proceed.
kevincox added a comment.
kevincox requested changes to this revision.


  It looks good overall. I just would like to have a bit more strict definition 
of what an HgPath can contain and preferably some init-time validation.

INLINE COMMENTS

> hg_path.rs:19
> +/// decoded from MBCS to WTF-8. If WindowsUTF8Plan is implemented, the source
> +/// character encoding will be determined per repository basis.
> +#[derive(Eq, Ord, PartialEq, PartialOrd, Debug, Hash)]

You should probably add a note about what a valid path is. My initial 
assumption is "A sequence of non-null characters. `/` separates directories, 
consecutive `/` are forbidden". It would be good to write this down, provide an 
`is_valid` function and add a debug_assert! on creation/modification to check 
that everything is as expected.

> hg_path.rs:43
> +}
> +pub fn iter(&self) -> HgPathIterator {
> +HgPathIterator { path: &self }

Please call this `bytes()` to highlight that it iterates over bytes, not 
characters or directory components.

> hg_path.rs:136
> +#[derive(Debug)]
> +pub struct HgPathIterator<'a> {
> +path: &'a HgPath,

I would also call this `HgPathByteIterator`

> hg_path.rs:204
> +
> +#[inline]
> +fn deref(&self) -> &HgPath {

This is almost certainly unnecessary.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6773/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6773

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6765: rustfilepatterns: shorter code for concatenating slices

2019-08-28 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> valentin.gatienbaron wrote in filepatterns.rs:161
> Indeed, although it took a compiler upgrade.
> I don't really understand why the compiler accepts the change here but not in 
> the other places. It seems a bit random to remove the borrow only here. What 
> do you think about the next commit? It would remove the `&_[..]` more 
> predictably.

For the other ones I think the problem is that it gets the element type from 
the first element as `&Vec`. I think a better approach then creating the 
wrapper function is doing `[vec.as_slice(), b"foobar"].concat()`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6765/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6765

To: valentin.gatienbaron, #hg-reviewers, Alphare, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


mercurial-devel@mercurial-scm.org

2019-08-28 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> utils.rs:44
>  fn trim(&self) -> &Self;
> +fn chop_prefix(&self, needle:&Self) -> Option<&Self>;
>  }

I have a small preference for `drop_prefix`.

> utils.rs:44
>  fn trim(&self) -> &Self;
> +fn chop_prefix(&self, needle:&Self) -> Option<&Self>;
>  }

Put a space in `needle: &Self`

> utils.rs:86
> +
> +fn chop_prefix(&self, needle:&Self) -> Option<&Self> {
> +if self.starts_with(needle) {

Put a space in needle: &Self

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6766/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6766

To: valentin.gatienbaron, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6765: rustfilepatterns: shorter code for concatenating slices

2019-08-28 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> filepatterns.rs:161
>  }
> -let mut res = b".*".to_vec();
> -res.extend(pattern);
> -res
> +[&b".*"[..], pattern].concat()
>  }

I don't think you need the `&_[..]`. https://rust.godbolt.org/z/Wo-vza

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6765/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6765

To: valentin.gatienbaron, #hg-reviewers, Alphare, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6631: rust-cpython: add macro for sharing references

2019-08-12 Thread kevincox (Kevin Cox)
kevincox added a comment.
kevincox accepted this revision.


  In D6631#97887 , @Alphare wrote:
  
  > @kevincox I've renamed the file since it no longer just contains macros.
  > I've moved whatever I could move to separate structs and used `Cell` 
instead of `RefCell` on those scalar types. I'm not so keen on using 
`UnsafeCell`, espacially since the target `py_class!`-resulting struct could be 
using `RefCell`-specific APIs.
  
  Ack. it still feels like we are kinda implementing a second `RefCell` anyways 
but I think this code is quite clean.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6631/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6631

To: Alphare, #hg-reviewers, kevincox
Cc: gracinet, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6516: rust-discovery: accept the new 'respectsize' init arg

2019-07-22 Thread kevincox (Kevin Cox)
kevincox added a comment.
kevincox accepted this revision.


  It seems weird to add this argument without using it. I would much rather it 
was added as the implementation was added.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6516/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6516

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6632: rust-dirstate: rust implementation of dirstatemap

2019-07-22 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> dirstate_map.rs:191
> +changed = true;
> +*entry = DirstateEntry {
> +mtime: MTIME_UNSET,

Can you just do `entry.mtime = MTIME_UNSET`?

> dirstate_map.rs:288
> +pub fn set_parents(&mut self, parents: DirstateParents) {
> +self.parents = Some(parents.clone());
> +self.dirty_parents = true;

This `.clone()` looks unnecessary. If it is necessary you should probably 
accept a `&DirstateParents`.

> dirstate_map.rs:306
> +
> +if !self.dirty_parents {
> +self.set_parents(parents.to_owned());

It isn't clear to me why we don't `set_parents` if `dirty_parents`. This would 
make me think that `parents` is lazily calculated however `set_parents` sets 
`parents` and `dirty_parents` which would imply that this is not the case.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6632/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6632

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6629: rust-dirstate: use EntryState enum instead of literals

2019-07-17 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in lib.rs:85
> Did you mean `{:?}`. I can't find anything related to your syntax.

Sorry yes. I've been writing too much python at work.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6629/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6629

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6631: rust-cpython: add macro for sharing references

2019-07-16 Thread kevincox (Kevin Cox)
kevincox added a comment.


  In D6631#97216 , @kevincox wrote:
  
  > Is there any reason this can't be done using `Rc>`?  
I have an example here: https://rust.godbolt.org/z/MNNR_F
  
  I see now that since python is managing the reference it does make some sense 
to do it this way.

INLINE COMMENTS

> macros.rs:55
> +$data_member: ident,
> +$leaked: ident
> +) => {

The terminology is weird here. You don't appear to be tracking leaks, just 
borrows.

> gracinet wrote in macros.rs:61
> Hi Kevin,
> 
> all that CPython's `del` does is decrement the reference count, and if it 
> drops to 0, then the memory will be freed immediately. On the other hand, the 
> `clone_ref()` at line 106 does increment it.
> 
> So, in your example, after `del c` the refcount of `c` is 1 and the memory 
> shouldn't be freed.

Thanks. I missed that.

> macros.rs:85
> + immutable references in Python objects",
> +)),
> +}

This catches a mutable borrow after other borrows, however I believe it fails 
to catch a mutable borrow followed by immutable borrows.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6631/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6631

To: Alphare, #hg-reviewers
Cc: gracinet, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6631: rust-cpython: add macro for sharing references

2019-07-16 Thread kevincox (Kevin Cox)
kevincox added a comment.


  Is there any reason this can't be done using `Rc>`?  I 
have an example here: https://rust.godbolt.org/z/MNNR_F
  
  It uses `Rc` and `RefCell` however it does need to do some unsafe work to 
keep the RefCell borrowed for longer than would otherwise be easy to do. The 
advantages here are
  
  1. Lifetime is automatically managed by the Rc. This ensures that our parent 
data doesn't get deleted while the iterator is still alive.
  2. The linked implementation uses a mutable borrow. However it would be easy 
to make an immutable version which would allow multiple iterators over the same 
"container".
  3. Mutable access is managed by the RefCell so we don't need to do our own.

INLINE COMMENTS

> macros.rs:45
> +/// data inner: RefCell;
> +/// data leak_count: RefCell;
> +/// });

You aren't really using the `RefCell` type here. It might be better to just use 
`Cell` for the count and `UnsafeCell` for the data.

> macros.rs:57
> +) => {
> +impl $name {
> +fn leak_immutable(&self, py: Python) -> &'static $inner_struct {

Instead of adding methods to the interface type I would just create a templated 
struct which has these methods as well as the data and count. It doesn't seem 
like these actually need to be in the parent type.

> macros.rs:61
> +*self.leak_count(py).borrow_mut() += 1;
> +unsafe { &*ptr }
> +}

I'm failing to see what actually prevents the "container" from being dropped 
while the "iterator" is alive. Is this somehow tied to the Python GC?

For example:

  c = RustObject()
  i = iter(i)
  del c
  # What is keeping the backing rust memory alive at this point.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6631/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6631

To: Alphare, #hg-reviewers
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6628: rust-parsers: switch to parse/pack_dirstate to mutate-on-loop

2019-07-15 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> mod.rs:16
> +pub p1: Vec,
> +pub p2: Vec,
>  }

If these are always 20 bytes long you are best to use `[u8; 20]`. That way 
there will be no indirection. And since `Vec` is 3 words it will have a bigger 
immediate size on 64bit platforms.

> parsers.rs:81
>  parents: DirstateParents,
> -now: i32,
> -) -> Result<(Vec, DirstateVec), DirstatePackError> {
> +now: Duration,
> +) -> Result, DirstatePackError> {

The "proper" solution is SystemTime 
 however I will 
admit that it is a bit more awkward to use. However if you are going to use 
Duration please document that it is the Duration since the Unix Epoch.

> parsers.rs:87
>  
> -let expected_size: usize = dirstate_vec
> +let now = now.as_secs() as i32;
> +

Would it be best to panic on overflow? 
`now.as_secs().try_into::().expect("time overflow")`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6628/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6628

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mjpieters, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6629: rust-dirstate: use EntryState enum instead of literals

2019-07-15 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> lib.rs:85
> +Overflow => "Overflow in dirstate.".to_string(),
> +CorruptedEntry(e) => format!("Corrupted entry: {}.", e),
> +Damaged => "Dirstate appears to be damaged.".to_string(),

I like using `{!r}` so that the substituted entry is quoted and unambiguous.

> parsers.rs:42
> +// platform-specific `c_char`.
> +let state: u8 = entry.state.into();
> +

If you are going to a comment I would say why. However in this case it probably 
isn't worth having a comment.

You can also do this on one line `entry.state.into::() as c_char` which I 
think is simple enough.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6629/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6629

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6626: rust-dirstate: create dirstate submodule in hg-cpython

2019-07-12 Thread kevincox (Kevin Cox)
kevincox added inline comments.
kevincox accepted this revision.

INLINE COMMENTS

> dirs_multiset.rs:34
> +skip_state = Some(skip.extract::(py)?.data(py)[0] as 
> i8);
> +}
> +let dirs_map;

let skip_state = skip.map(|skip| skip.extract::(py)?.data(py)[0] as 
i8);

> dirs_multiset.rs:35
> +}
> +let dirs_map;
> +

let dirs_map = if ...

> dirs_multiset.rs:68
> +)
> +.and(Ok(py.None()))
> +.or_else(|e| {

.map(py.None())

> mod.rs:119
> +let state = stats.get_item(py, 0)?.extract::(py)?;
> +let state = state.data(py)[0] as i8;
> +let mode = stats.get_item(py, 1)?.extract(py)?;

Should there be a check that this element exists?

> mod.rs:178
> +},
> +) in new_dirstate_vec
> +{

I think I would find it more readable as `for (filename, dirstate) in ...`. If 
you want to unpack it I would do that in the first line of the loop body. 
However it might also be best just to use `dirstate.state`. I guess the 
downside there is you don't get a compile error if a new field is added.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6626/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6626

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6594: RFC dirstatemap

2019-07-10 Thread kevincox (Kevin Cox)
kevincox added a comment.


  In D6594#96518 , @yuja wrote:
  
  > Yes, so we'll probably want to minimize the data exchanged between Rust
  > and Python. This means we'll eventually move more logic to Rust side, which
  > hopefully reduce the number of elements to be exposed through the iterator
  > interface. Until then, it might be better to move iterator-heavy objects
  > to CPython/PyO3 layer. Just my guess.
  
  This is likely true. But I suspect we don't need to worry too much unless 
performance is shown to be an issue. Then it makes sense to make 
spot-optimizations.
  
  Especially if there are plans to move more code into Rust over time it will 
be very easy to spend a lot of time optimizing the interface only to find that 
the interface has moved a layer up in 4 months.

INLINE COMMENTS

> dirstate_map.rs:275
> +let parents;
> +if file_contents.len() == 40 {
> +parents = DirstateParents {

Give this magic number a name?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6594/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6594

To: Alphare, #hg-reviewers
Cc: yuja, durin42, kevincox, mjpieters, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6593: rust-minor-fixes: remove Deref in favor of explicit methods

2019-07-10 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> dirs_multiset.rs:131
> +
> +pub fn iter(&self) -> Iter, u32> {
> +self.inner.iter()

Returning a `std::collections::hash_map::Iter` binds you to this 
implementation. I would recommend instead returning `impl Iterator`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6593/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6593

To: Alphare, #hg-reviewers
Cc: yuja, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6597: rust-2018: switch hg-core and hg-cpython to rust 2018 edition

2019-07-10 Thread kevincox (Kevin Cox)
kevincox added a comment.
kevincox accepted this revision.


  Is the SliceExt change related to the 2018 change? If not could you split the 
two?

INLINE COMMENTS

> utils.rs:3
> +
> +pub fn replace_slice(buf: &mut [T], from: &[T], to: &[T])
> +where

This could really use a doc comment.

> utils.rs:7
> +{
> +if buf.len() < from.len() || from.len() != to.len() {
> +return;

from.len() != to.len() sounds like a bug and I would probably use an 
`assert!()`. Unless this is expected to be common and "okay".

> utils.rs:12
> +if buf[i..].starts_with(from) {
> +buf[i..(i + from.len())].clone_from_slice(to);
> +}

This allows overlapping replacements. I don't know if this is intended.

> utils.rs:38
> +}
> +fn trim_end(&self) -> &[u8] {
> +if let Some(last) = self.iter().rposition(is_not_whitespace) {

Why not define `trim` as `self.trim_start().trim_end()`?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6597/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6597

To: Alphare, #hg-reviewers, kevincox
Cc: pulkit, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6594: RFC dirstatemap

2019-07-04 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> dirstate_map.rs:47
> +
> +impl Default for DirstateMap {
> +fn default() -> Self {

Can you just `#[derive(Default)]`?

> dirstate_map.rs:95
> +filename: &[u8],
> +old_state: u8,
> +entry: DirstateEntry,

This should probably be replaced with an enum. You can implement `TryFrom` 
for ease of conversion from python.

> dirstate_map.rs:227
> +{
> +if *state != b'n' || *mtime == -1 {
> +non_normal.insert(filename.to_owned());

It would probably be good to move these sential values into well-named 
constants. That or add comments explaining what they mean in each case.

> dirstate_map.rs:237
> +}
> +pub fn set_all_dirs(&mut self) -> () {
> +if self.all_dirs.is_none() {

Drop the `-> ()`

> dirstate_map.rs:245
> +}
> +pub fn set_dirs(&mut self) -> () {
> +if self.dirs.is_none() {

This function seems quite odd. Why isn't dirs always set? It seems like you do 
some initialization then set it. However does this mean that this class has a 
different set of allowed methods before and after dirs is set?

If this is the case you should either A) Add assertions at the top of every 
method to ensure they are only called at allowed times or B) split out the 
initialization into a builder type.

Otherwise why can't `dirs` be set at initialization.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6594/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6594

To: Alphare, #hg-reviewers
Cc: durin42, kevincox, mjpieters, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6393: rust-dirstate: add "dirs" Rust implementation

2019-06-28 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> martinvonz wrote in dirs_multiset.rs:43
> Maybe Kevin meant something like this?
> 
>   if skip_state == None | skip_state == Some(state) {
>   multiset.add_path(filename);
>   }

No, you are right. I misread the condition.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6393/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6393

To: Alphare, #hg-reviewers, kevincox
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6393: rust-dirstate: add "dirs" Rust implementation

2019-06-24 Thread kevincox (Kevin Cox)
kevincox added a comment.
kevincox accepted this revision.


  Sorry. I was busy. In general don't worry about blocking on me, I can't 
promise any sort of reasonable response time. Worst case I can review the 
changes after submission and changes can be made afterwards.

INLINE COMMENTS

> dirs_multiset.rs:43
> +if let Some(skip) = skip_state {
> +if skip != state {
> +multiset.add_path(filename);

You can replate the nested if with:

  if skip_state == Some(state) {

> dirs_multiset.rs:63
> +/// without trailing slash, from right to left.
> +fn find_dir(path: &[u8], mut pos: usize) -> Option<&[u8]> {
> +loop {

I would remove the `pos` argument. IIUC following two are currently identical.

  find_dir(path, n);
  find_dir(path[..n], n);

If you remove the second argument then you can just always remove the last 
component of the path. This will also allow you to have a more clear doc 
comment. Right now I find it a little confusing.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6393/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6393

To: Alphare, #hg-reviewers, kevincox
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6429: rust-discovery: optimization of add commons/missings for empty arguments

2019-06-13 Thread kevincox (Kevin Cox)
kevincox added a comment.
kevincox accepted this revision.


  Thanks, looks good.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6429/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6429

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6394: rust-dirstate: add "dirs" rust-cpython binding

2019-06-10 Thread kevincox (Kevin Cox)
kevincox accepted this revision.
kevincox added inline comments.

INLINE COMMENTS

> dirstate.rs:221
> +}
> +let dirs_map;
> +

let dirs_map = if ...

Then just let the if statement evaluate to this value.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6394

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6428: rust-discovery: using the children cache in add_missing

2019-06-10 Thread kevincox (Kevin Cox)
kevincox accepted this revision.
kevincox added inline comments.

INLINE COMMENTS

> discovery.rs:233
> +/// sets, for which the discovery process will engage into sampling
> +/// that needs the children cache anyway.
>  pub fn add_missing_revisions(

These added comments seem more relevant to the implementation of the method 
than they are to the caller. Would it make sense to move them inside of the 
function body so that they don't busy the doc-comment?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6428

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6429: rust-discovery: avoid useless calls to addcommons/addmissings

2019-06-10 Thread kevincox (Kevin Cox)
kevincox requested changes to this revision.
kevincox added a comment.
This revision now requires changes to proceed.


  Wouldn't it be better to make `add_missing_revisions` work properly on empty 
inputs? It seems like it is too error prone to try and catch every caller. If 
possible I would like to push this check down as far as possible.
  
  It seems easy enough to do in 
https://www.mercurial-scm.org/repo/hg/file/tip/rust/hg-core/src/discovery.rs#l56
 by checking if `common` grew.  This has the added benefit that attempting to 
re-add bases will also skill the work.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6429

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6389: rust-dirstate: create dirstate submodule

2019-05-24 Thread kevincox (Kevin Cox)
kevincox accepted this revision.
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in mod.rs:7
> The current Python public API expects `parents` as return value and does 
> mutate-on-parse for other in-memory objects. Sorry if I'm missing something 
> obvious.

Ok, I'm still not convinced this is the best API to expose but if it is a 
public API I guess it can't be changed.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6389

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6389: rust-dirstate: create dirstate submodule

2019-05-24 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> Alphare wrote in mod.rs:7
> This is public so that `hg-cpython` can use it. I think it helps with 
> readability too, a "bag of bytes" is sometimes not really helpful when 
> stumbling on code.

I guess the point I am trying to make is why do we pass around this 
"half-parsed" object. Why can't the public API go straight from bytes -> fully 
parsed.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6389

To: Alphare, #hg-reviewers
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6424: rust-discovery: takefullsample() core implementation

2019-05-23 Thread kevincox (Kevin Cox)
kevincox accepted this revision.
kevincox added inline comments.

INLINE COMMENTS

> discovery.rs:124
> +fn next(&mut self) -> Option {
> +if self.cur > 1 {
> +return None;

I think it would be nice to remove the magic number.

  if self.cur >= self.parents.len()

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6424

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6389: rust-dirstate: create dirstate submodule

2019-05-17 Thread kevincox (Kevin Cox)
kevincox added inline comments.

INLINE COMMENTS

> mod.rs:6
> +pub p1: &'a [u8],
> +pub p2: &'a [u8],
> +}

If 1 and 2 are the best names why not just make it `pub struct 
DirstateParetens([&[u8]; 2])`?

> mod.rs:7
> +pub p2: &'a [u8],
> +}
> +

It seems odd that this struct is public seeing as it is just a bag of bytes. 
Would it make sense to keep this internal to the parser/serializer?

> parsers.rs:40
> +
> +let mut cursor = Cursor::new(entry_bytes);
> +let state = cursor.read_i8()?;

It seems to me that this code would be simpler if you constructed a single 
cursor at the top of the function and just used it to track your state instead 
of having a couple of other variables to track it.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6389

To: Alphare, #hg-reviewers
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6348: rust-dirstate: add rust implementation of `parse_dirstate` and `pack_dirstate`

2019-05-10 Thread kevincox (Kevin Cox)
kevincox requested changes to this revision.
kevincox added inline comments.
This revision now requires changes to proceed.

INLINE COMMENTS

> dirstate.rs:16
> +pub type DirstateVec = Vec<(Vec, DirstateTuple)>;
> +pub type CopyMap<'a> = Vec<(&'a [u8], &'a [u8])>;
> +

You have a lot of tuples here an it is unclear what they represent. I would 
much prefer making structs for these, then you can give the members names which 
will really help the reader.

> dirstate.rs:34
> +&contents[..PARENT_SIZE],
> +&contents[PARENT_SIZE..PARENT_SIZE * 2],
> +);

You check for PARENT_SIZE then index by PERENT_SIZE*2. This will panic if the 
size is between the two.

> dirstate.rs:36
> +);
> +let mut i = PARENT_SIZE * 2;
> +

It would be nice to define the above the definition of `parents` and use it in 
that slice.

> dirstate.rs:43
> +
> +let slice = &contents[i..];
> +let state = slice[0] as i8;

This isn't a very useful name. Maybe `entry` or `entry_bytes`.

> dirstate.rs:48
> +let mtime = BigEndian::read_i32(&slice[9..]);
> +let path_len = BigEndian::read_i32(&slice[13..]);
> +

This is a lot of manual index numbers. It would probably me more clear to wrap 
it in a Cursor and use 
https://docs.rs/byteorder/1.3.1/byteorder/trait.ReadBytesExt.html to read the 
values. This way the cursor is automatically moved to match the data you have 
read.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6348

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6345: rust-discovery: optionally don't randomize at all, for tests

2019-05-10 Thread kevincox (Kevin Cox)
kevincox accepted this revision.
kevincox added a comment.


  The code and rust style looks fine. I'm not sure that taking the N smallest 
elements is the best approach for testing though. I can imagine it hiding some 
issues with larger values.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6345

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6271: rust-filepatterns: add a Rust implementation of pattern-related utils

2019-04-24 Thread kevincox (Kevin Cox)
kevincox accepted this revision.
kevincox added a comment.


  Thanks for the changes.

INLINE COMMENTS

> Alphare wrote in filepatterns.rs:18
> I think I hit the same problem as the person in this issue: 
> `https://github.com/rust-lang/regex/issues/451`. It's a hassle to escape 
> bytes using the `regex` crate, and this felt like a good enough replacement.

Ah, I understand. This make sense then.

> Alphare wrote in filepatterns.rs:51
> It is required since `repl` is `&&[u8]`, which is not an iterator.

It is probably more clear to do `*repl` then to show that you are dereferencing 
it.

> Alphare wrote in filepatterns.rs:61
> This has a different behavior, since `position` expects a boolean return 
> value in the closure for each element of the iterator.

I don't understand. The example I showed would give a boolean. The only 
difference I see is that the returned value would be 1 less (because 
.position() wouldn't count the skipped element) but this should be easy to 
handle in the code and I think would be more clear overall.

> Alphare wrote in filepatterns.rs:226
> While your version is just plain better, removing the regex-based comment 
> escape breaks support for comments in `test-hgignore.t`, so this will still 
> be necessary.

Oops, I should have pointed out that I just commented that out because I don't 
have access to the `regex` crate on the website. It should still be included in 
the code.

This looks good now.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6271

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D6271: rust-filepatterns: add a Rust implementation of pattern-related utils

2019-04-20 Thread kevincox (Kevin Cox)
kevincox requested changes to this revision.
kevincox added inline comments.
This revision now requires changes to proceed.

INLINE COMMENTS

> filepatterns.rs:18
> +};
> +}
> +

Why not use https://docs.rs/regex/1.1.6/regex/fn.escape.html? If you are 
worried about performance you can use 
https://docs.rs/regex-syntax/0.6.6/regex_syntax/fn.escape_into.html which is 
quite fast.

> filepatterns.rs:21
> +const GLOB_REPLACEMENTS: &[(&[u8], &[u8])] =
> +&[(b"*/", b"(?:.*/)?"), (b"*", b".*"), (b"", b"[^/]*")];
> +

It is worth putting a comment that these are matched in order.

> filepatterns.rs:29
> +Glob,
> +// Glob that matches at any suffix of the path (still anchored at 
> slashes)
> +Path,

Put doc comments before the thing that they are documenting and use `///`

> filepatterns.rs:51
> +input = &input[source.len()..];
> +res.extend(repl.iter());
> +break;

I suspect this `.iter()` isn't required.

> filepatterns.rs:61
> +.enumerate()
> +.position(|(i, b)| *b == b']' && i != 0)
> +{

input.iter().skip(1).position(|b| *b == b']')

> filepatterns.rs:226
> +
> +pub fn parse_pattern_file_contents(
> +lines: &str,

This function appears to be way more complicated then it needs to be. I have 
tried to highlight the changes in comments below but I have a full version at 
https://rust.godbolt.org/z/08MrDP. I haven't actually run the code but I think 
my version has the same functionality as this one.

> filepatterns.rs:239
> +let line_number = line_number + 1;
> +let mut line = line.to_string();
> +let mut syntax = current_syntax.to_string();

If you are worried about performance you can use a buffer outside the loop and 
replace it with `line.replace_range(.., line_str)`. This saves continuously 
reallocating the string.

> filepatterns.rs:240
> +let mut line = line.to_string();
> +let mut syntax = current_syntax.to_string();
> +let mut syntax = syntax.as_ref();

This copy is unnecessary and having the lifetime start so early is confusing. 
The only time this is used is as a temporary in the `syntax:` case and to set 
`line_syntax` where it is always unchanged from the default. This variable 
should just be removed.

> filepatterns.rs:241
> +let mut syntax = current_syntax.to_string();
> +let mut syntax = syntax.as_ref();
> +

This is also unnecessary. Because of the above comment.

> filepatterns.rs:249
> +}
> +line = str::trim_end(line.as_ref()).to_string();
> +

You can just call `str` methods on `String` because they implement `Deref`.

  line = line.as_ref().trim_end().to_string();

> filepatterns.rs:259
> +if let Some(rel_syntax) = SYNTAXES.get(syntax) {
> +current_syntax = Cow::Owned(rel_syntax.to_string());;
> +} else if warn {

`rel_syntax: &'static str` so this can be `Cow::Borrowed`. Actually 
`current_syntax` is only ever set to static lifetime strings so you don't have 
to use `Cow` at all.

> filepatterns.rs:267
> +let mut line_syntax = syntax;
> +let mut final_line = line.clone();
> +

This clone is unnecessary it is always a reference to line from herein so you 
can just use `&str`.

> filepatterns.rs:282
> +inputs.push((
> +line_syntax.to_string() + &final_line,
> +line_number,

Slight preference for `format!()`. In theory it can do clever things like 
checking argument sizes to only allocate once. Either way I think it looks nice.

> filepatterns.rs:286
> +));
> +current_syntax = Cow::Owned(syntax.to_string());
> +}

I'm pretty sure this is redundant because the only time `syntax` is modified is 
in the `syntax:` case which A) Sets `current_syntax` and B) `continue`s.

> filepatterns.rs:299
> +Ok(parse_pattern_file_contents(&contents, &file_path, warn))
> +})?
> +}

Mapping to an always-ok result is odd. You should just map to the 
`parse_pattern_file_contents()`. Then you don't need to use `?` to unwrap the 
extra layer. However you then loose the error conversion which you can add back 
with `.map_err(|e| e.into())`.

https://rust.godbolt.org/z/Mflpga

Even better is just use the try for the read and call 
`parse_pattern_file_contents` at the function level.

  f.read_to_string(&mut contents)?;
  Ok(parse_pattern_file_contents(&contents, &file_path, warn))

> lib.rs:47
>  
> +pub type LineNumber = usize;
> +

This seems like overkill to me. I would either remove it or use `pub struct 
LineNumber(pub usize)` if you want to do this for type safety. However I don't 
think type safety is really needed here so I would just remove the alias.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6271

To: Alphare, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercu

  1   2   >