This is great stuff, Robert. Thanks for sharing.

On Tue, May 28, 2019, 06:10 Robert Knight <robertkni...@hypothes.is> wrote:

>
> At the moment this is still in the experimentation phase, but if it pans
> out well, I think they could be a good fit for the Apache Annotator
> project. The string matching implementation is fairly solid and offers
> significant performance improvements over diff-match-patch (see the README
> of the anchor-quote project for some early benchmarks). The quote anchoring
> library is very much in early development.


With what you know from this implementation and the research into
algorithms you've done for it, I'm wondering if you have any early
intuitions about API design questions I've been pondering.

So far, I've been ensuring that the selector API is incremental and
asynchronous. I've also made sure to support returning multiple matches.
For these reasons the selectors are asynchronous generators.
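
As a rough illustration of that shape (not the code on master;
`matchTextQuote`, `TextQuoteSelector` and `Match` are hypothetical names
chosen for this sketch), a selector written as an asynchronous generator
might look like this:

```typescript
// Hypothetical types for illustration; not the actual API on master.
interface TextQuoteSelector {
  type: "TextQuoteSelector";
  exact: string;
}

interface Match {
  start: number; // offset of the first matched character
  end: number; // offset just past the last matched character
}

// An asynchronous generator yields each match as soon as it is found,
// so a consumer can stop early or do other work between matches.
async function* matchTextQuote(
  selector: TextQuoteSelector,
  text: string,
): AsyncGenerator<Match> {
  if (selector.exact.length === 0) return; // nothing to search for
  let from = 0;
  while (true) {
    const start = text.indexOf(selector.exact, from);
    if (start === -1) return;
    yield { start, end: start + selector.exact.length };
    from = start + 1; // step by one so overlapping matches are reported
  }
}
```

A caller consumes it with `for await (const match of matchTextQuote(...))`
and can simply `break` once it has enough matches.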

I have not explicitly addressed sharing state between selectors. Some
algorithms may benefit from doing a parallel search for multiple patterns
in a single pass. I noticed your API accepts multiple selectors, although
it seems to treat them independently. I was wondering what considerations
you had in mind when you made that API design decision.

The API on master here currently separates out the parse operation, which
takes a selector string/object and returns a scoped match function. This
provides an opportunity to close over shared state held by the
parser/constructor.

The parse step does not initially scope the selector to any context. It's
not possible to know, until the match function is invoked, that multiple
selectors are applicable to the same scope. I suspect this concern can be
addressed by binding the parser to a root scope. Therefore, I don't think
there's any action needed here, but I figured I'd mention it.
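
To make the idea concrete, here is a minimal sketch of a parser bound to a
root scope whose match functions close over shared state. Everything in it
is an assumption made for illustration: `createParser`, `parse` and
`normalizeCount` are hypothetical names, the root scope is reduced to a
plain string, and case folding stands in for whatever preprocessing a real
implementation might share.

```typescript
// Hypothetical types for illustration; not the actual API on master.
interface TextQuoteSelector {
  exact: string;
}

interface Match {
  start: number;
  end: number;
}

type Matcher = () => AsyncGenerator<Match>;

// The parser is bound to one root scope (a plain string in this sketch).
function createParser(rootText: string) {
  // Shared state, closed over by every matcher this parser creates.
  let normalized: string | undefined;
  let normalizeCount = 0;

  // Computed lazily, and only once, however many matchers use it.
  const getNormalized = (): string => {
    if (normalized === undefined) {
      normalized = rootText.toLowerCase();
      normalizeCount += 1;
    }
    return normalized;
  };

  return {
    // parse() does no searching; it only returns a scoped match function.
    parse(selector: TextQuoteSelector): Matcher {
      const needle = selector.exact.toLowerCase();
      return async function* match(): AsyncGenerator<Match> {
        if (needle.length === 0) return;
        const haystack = getNormalized();
        let from = 0;
        while (true) {
          const start = haystack.indexOf(needle, from);
          if (start === -1) return;
          yield { start, end: start + needle.length };
          from = start + 1;
        }
      };
    },
    // Exposed only so the sharing can be observed in this sketch.
    get normalizeCount(): number {
      return normalizeCount;
    },
  };
}
```

Because every matcher comes from the same parser, the parser is also the
natural place to notice that several selectors target the same scope.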

Another thing which I have considered and, again, so far made no explicit
affordance for, is threading a deadline or abort controller through the
call stack so that match operations can be cancelable.
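
One possible way to thread cancellation through, sketched under
assumptions: `matchAll` and `AbortSignalLike` are hypothetical names, and
the matcher only polls an `aborted` flag between matches rather than
interrupting a search in progress.

```typescript
// Minimal shape the matcher needs. A standard AbortSignal (from an
// AbortController) has this property, so it can be passed in directly.
interface AbortSignalLike {
  readonly aborted: boolean;
}

interface Match {
  start: number;
  end: number;
}

async function* matchAll(
  text: string,
  exact: string,
  signal?: AbortSignalLike,
): AsyncGenerator<Match> {
  if (exact.length === 0) return;
  let from = 0;
  while (true) {
    // Check before each unit of work so a caller can cancel between matches.
    if (signal?.aborted) throw new Error("match aborted");
    const start = text.indexOf(exact, from);
    if (start === -1) return;
    yield { start, end: start + exact.length };
    from = start + 1;
  }
}
```

A deadline could be layered on top by having the caller abort the
controller from a timer, so the matcher itself only needs to know about the
signal.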

If you have any strong reactions to these thoughts, I'd be curious to hear
them.

I'd love to find some alignment that allows you to bring this work into the
tree, and I'd like to invite you to contribute directly.

I'd also consider adopting TypeScript, if that is appealing to the group.
That's probably a very easy change to our Babel setup.
