Hey Gerben!

Thanks for your extensive answer.

> My two cents for the first two questions: I suppose Annotator as it is today
> is ready to play with but not to use without contributing to it; but making
> it fit use cases like yours is exactly the goal, so this is a great target to
> work towards. The main focus is on TextQuoteSelectors in HTML documents,
> which seems to fit the scenario. As for how to help, we may need to draw up
> more specific tasks & challenges to lower the barrier to do things
> independently; but until then (and to make that happen!), we’d be glad if
> people willing to get involved pop into a weekly call or IRC (#annotator on
> freenode), or try to write on this list or open issues with things that block
> them. Also creating test cases would be really helpful to find out what needs
> fixing. My guess is that with a couple of person-days (or rather -weeks, to
> do things properly) we could get it to a state where it does the tricks you
> need for the majority of cases.

When is the next call?

> I especially like the described scenario with two types of views, as it
> creates an interoperability challenge within a single project. As Benjamin
> indicates, if the viewer only changes HTML but not text content, perhaps
> exact text quote matches would already work. Implementing fuzzy anchoring
> would help overcome slight mismatches (and is something we’d like to provide
> anyway), or perhaps just some improved whitespace normalisation would
> suffice. This use case may be a good experiment to discover the limitations
> and possibilities of the anchoring algorithm.
>
> By the way, could you perhaps clarify what you mean by the hypothes.is
> annotation library?

AFAIU we use the library that hypothes.is uses (or ripped it out in some way).

Cheers
Oliver

> -- Gerben
>
> On 23/04/2020 16:02, Benjamin Young wrote:
>> Thanks for posting that here, Oliver!
>>
>> We'll probably have to take those questions one at a time. 🙂 I'd actually
>> like to start with the last one, as it spells out some target APIs and code.
>>
>>> 3. What do you think will be the challenges to get Apache Annotator work on
>>> both a reader and a full-html version?
>>
>> If I understand the use case correctly, you're wanting annotation on
>> something like Firefox's "reader view" (where the original HTML is stripped
>> away, and only the content remains) and wanting those same annotations to be
>> re-anchor-able on the original HTML (and vice versa).
>>
>> If that's indeed what you're after, then the "hard" part is making sure we
>> have a way for implementations to "opt-in" to fuzzy anchoring when they both
>> create and use an annotation.
>>
>> For starters, you could simply store the TextQuoteSelector which *should*
>> re-anchor on both those representations (and possibly even on a PDF), but it
>> would come at the cost of performance on large documents.
>> So, what you'd want to follow that up with is additional, narrower, more
>> brittle selectors which would (knowingly) fail when you switch
>> representations, but would give you better performance on a specific
>> representation, i.e. you'd have an XPath or CSS selector for the original
>> HTML which would fail on the "reader" and/or "PDF" view, at which point
>> you'd (knowingly) fall back to the TextQuoteSelector.
>>
>> I think the core "plumbing" for that is already available, but Randall or
>> Gerben would know better. 🙂
>>
>> Is that what you're after?
>>
>> Cheers,
>> Benjamin
>>
>> --
>> http://bigbluehat.com/
>> http://linkedin.com/in/benjaminyoung
>>
>> ________________________________
>> From: Oliver Sauter <o...@worldbrain.io>
>> Sent: Wednesday, April 22, 2020 12:17 PM
>> To: dev@annotator.incubator.apache.org
>> Subject: Integrating Annotator into Memex
>>
>> Hey folks,
>>
>> I just had a call with Benjamin and we talked about the ability to integrate
>> Annotator into getmemex.com. Right now we use the Hypothes.is library, but
>> it is causing us some trouble (mainly the RAM usage from hooking it into
>> each tab).
>>
>> But also we are about to start the development of the Pocket-style
>> offline-reader for desktop and mobile on which we want to also integrate
>> annotation capabilities.
>> This means there is an anticipated use case where people annotate on a
>> reader-version and want to see the annotations also successfully anchored on
>> a live html page.
>> Annotating a reader-version will be missing a lot of
>> details usually used for anchoring the annotations, so the challenge would
>> be to make those interoperable with Apache Annotator.
>>
>> So the questions I have:
>> 1. How mature is Annotator in terms of its ability to replace the hypothesis
>> annotation library? What still needs to be done (and how much work for that?
>> Where do you need help?)
>> 2. How much work do you anticipate for a replacement?
>> 3. What do you think will be the challenges to get Apache Annotator work on
>> both a reader and a full-html version?
>>
>> I’ve been looking forward to finding a way to collaborate, so hopefully this
>> time is the time!
>> Cheers
>> Oliver
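
To make the fallback idea from Benjamin's reply concrete (try a narrow, brittle selector first, then knowingly fall back to the TextQuoteSelector), here is a rough sketch. It is not Apache Annotator's API: the function names are made up for illustration, it works on plain strings rather than a DOM, and a TextPositionSelector stands in for the XPath/CSS selector mentioned in the thread. The selector dictionaries follow the shape of the W3C Web Annotation Data Model, and offsets are assumed to refer to whitespace-collapsed text.

```python
import re


def collapse_ws(text):
    """Collapse every run of whitespace into a single space."""
    return re.sub(r"\s+", " ", text)


def anchor_by_position(text, position, exact):
    """Brittle but cheap: succeed only if the stored offsets still hold the quote."""
    start, end = position["start"], position["end"]
    return (start, end) if text[start:end] == exact else None


def anchor_by_quote(text, quote):
    """More robust but slower: search the whole text for the quote."""
    prefix = collapse_ws(quote.get("prefix", ""))
    exact = collapse_ws(quote["exact"])
    suffix = collapse_ws(quote.get("suffix", ""))
    # Prefer a match that includes the surrounding context.
    index = text.find(prefix + exact + suffix)
    if index != -1:
        start = index + len(prefix)
        return (start, start + len(exact))
    # Otherwise accept the first occurrence of the bare quote.
    index = text.find(exact)
    return None if index == -1 else (index, index + len(exact))


def anchor(raw_text, selectors):
    """Return (start, end) offsets into the whitespace-collapsed text, or None."""
    text = collapse_ws(raw_text)
    quote = next(s for s in selectors if s["type"] == "TextQuoteSelector")
    exact = collapse_ws(quote["exact"])
    for selector in selectors:
        if selector["type"] == "TextPositionSelector":
            hit = anchor_by_position(text, selector, exact)
            if hit is not None:
                return hit
    # The brittle selector failed (or was absent): fall back to the quote.
    return anchor_by_quote(text, quote)


# Hypothetical annotation created on the reader view.
reader_text = "The quick brown fox jumps over the lazy dog."
full_html_text = "Site nav  Home\n\nThe quick   brown fox\njumps over the lazy dog. Footer"
selectors = [
    {"type": "TextPositionSelector", "start": 10, "end": 19},
    {"type": "TextQuoteSelector", "exact": "brown fox",
     "prefix": "The quick ", "suffix": " jumps"},
]

reader_hit = anchor(reader_text, selectors)
full_hit = anchor(full_html_text, selectors)
```

On the reader text the position selector matches directly; on the full-page text the offsets no longer line up, so the quote search takes over. Exact matching after whitespace collapsing is only the simplest case; real fuzzy anchoring would also need to tolerate small differences in the text itself.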