D7866: rust-pathauditor: add Rust implementation of the `pathauditor`

martinvonz (Martin von Zweigbergk) Thu, 23 Jan 2020 10:19:29 -0800

This revision now requires changes to proceed.
martinvonz added inline comments.
martinvonz requested changes to this revision.
martinvonz added subscribers: yuja, martinvonz.


INLINE COMMENTS

> path_auditor.rs:55
> +            .collect();
> +        if !path.split_drive().0.is_empty()
> +            || [&b".hg"[..], &b".hg."[..], &b""[..]]

As @yuja pointed out on D7864 <https://phab.mercurial-scm.org/D7864>, it's 
weird to have `split_drive()` on `HgPath`, because that is supposed to be a 
repo-relative path. I think it would be better to move it into this file (or 
elsewhere). Can be done in a follow-up.

> path_auditor.rs:70
> +                let mut first = split.next().unwrap().to_owned();
> +                first.make_ascii_uppercase();
> +                let last = split.next().unwrap();

nit: chain `to_ascii_uppercase()` onto the previous line (and drop `mut`) 
instead

> path_auditor.rs:72
> +                let last = split.next().unwrap();
> +                if last.iter().all(|b| b.is_ascii_digit())
> +                    && [&b"HG"[..], &b"HG8B6C"[..]].contains(&first.as_ref())

or is `u8::is_ascii_digit` preferred? (just a question; i don't know which is 
preferred)

> path_auditor.rs:73
> +                if last.iter().all(|b| b.is_ascii_digit())
> +                    && [&b"HG"[..], &b"HG8B6C"[..]].contains(&first.as_ref())
> +                {

Is this line equivalent to `&& (first == b"HG" || first == b"HG8B6C")`? If so, 
I would strongly prefer that.

Same comment on line 56, actually (I reviewed out of of order). Would probably 
be helpful to extract `&lower_clean(parts[0]).as_ref()` to a variable anyway.

> path_auditor.rs:83-84
> +        {
> +            let lower_parts: Vec<_> =
> +                parts.iter().map(|p| lower_clean(p)).collect();
> +            for pattern in [b".hg".to_vec(), b".hg.".to_vec()].iter() {

Would we ever get different results from `lower_clean(path).split()` and 
`split(path).map(lower_clean)`? If not, shouldn't we reuse the 
`lower_clean()`ed result from above and call `split()` on that?

> path_auditor.rs:86
> +            for pattern in [b".hg".to_vec(), b".hg.".to_vec()].iter() {
> +                if lower_parts[1..].contains(pattern) {
> +                    let pos = lower_parts

What's the `1` about? It seems to say it's okay if the path has `.hg` as its 
first component. Why?

> path_auditor.rs:87-90
> +                    let pos = lower_parts
> +                        .iter()
> +                        .position(|part| part == pattern)
> +                        .unwrap();

Can we eliminate this by using `.enumerate()` on line 85?

> path_auditor.rs:91-95
> +                    let base = lower_parts[..pos]
> +                        .iter()
> +                        .fold(HgPathBuf::new(), |acc, p| {
> +                            acc.join(HgPath::new(p))
> +                        });

Will it be confusing that this is not necessarily a prefix of `path`? 
Specifically, it seems like it's going to be lower-case and with possibly 
different path separators.

> path_auditor.rs:114
> +                &parts[..index + 1].join(&(std::path::MAIN_SEPARATOR as u8));
> +            let prefix = HgPath::new(prefix);
> +            if self.audited_dirs.contains(prefix) {

Should this be a `PathBuf`? It's weird to have a `HgPath` containing 
`std::path::MAIN_SEPARATOR`.

> path_auditor.rs:123-124
> +        self.audited.insert(path.to_owned());
> +        // Only add prefixes to the cache after checking everything: we don't
> +        // want to add "foo/bar/baz" before checking if there's a "foo/.hg"
> +        self.audited_dirs.extend(prefixes);

It looks like we check the prefixes in order, so first "foo", then "foo/bar", 
then "foo/bar/baz". We'd return early if we find "foo/.hg", before we get to 
"foo/bar", no? What am I missing?

> path_auditor.rs:144
> +                // EINVAL can be raised as invalid path syntax under win32.
> +                // They must be ignored for patterns can be checked too.
> +                if e.kind() != std::io::ErrorKind::NotFound

Could you replace "for" by "because" if that's what it means here. Or fix the 
sentence if that's not what you mean.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7866/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7866

To: Alphare, #hg-reviewers, kevincox, martinvonz
Cc: martinvonz, yuja, durin42, kevincox, mercurial-devel
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

D7866: rust-pathauditor: add Rust implementation of the `pathauditor`

Reply via email to