Re: Integrating Annotator into Memex

Oliver Sauter Sun, 26 Apr 2020 09:56:39 -0700

Hey Gerben! 

Thanks for your extensive answer. 
> My two cents for the first two questions: I suppose Annotator as it is today 
> is ready to play with but not to use without contributing to it; but making 
> it fit use cases like yours is exactly the goal, so this is a great target to 
> work towards. The main focus is on TextQuoteSelectors in HTML documents, 
> which seems to fit the scenario. As for how to help, we may need to draw up 
> more specific tasks & challenges to lower the barrier to do things 
> independently; but until then (and to make that happen!), we’d be glad if 
> people willing to get involved pop into a weekly call or IRC (#annotator on 
> freenode), or try writing to this list or opening issues about things that 
> block them. Also, creating test cases would be really helpful to find out what needs 
> fixing. My guess is that with a couple of person-days (or rather -weeks, to 
> do things properly) we could get it to a state where it does the tricks you 
> need for the majority of cases.
> 
When is the next call?


> I especially like the described scenario with two types of views, as it 
> creates an interoperability challenge within a single project. As Benjamin 
> indicates, if the viewer only changes HTML but not text content, perhaps 
> exact text quote matches would already work. Implementing fuzzy anchoring 
> would help overcome slight mismatches (and is something we’d like to provide 
> anyway), or perhaps just some improved whitespace normalisation would 
> suffice. This use case may be a good experiment to discover the limitations 
> and possibilities of the anchoring algorithm.
> 
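A minimal sketch of the whitespace-normalisation idea mentioned above, in TypeScript. The names normalizeWhitespace and findQuoteOffsets are hypothetical and are not Apache Annotator's API. The sketch assumes both views have already been reduced to plain text, and it returns offsets into the normalised text only; mapping those back to DOM ranges is left out.

```typescript
// Hypothetical helpers for illustration; not part of Apache Annotator.

// Collapse every run of whitespace into a single space and trim the ends.
function normalizeWhitespace(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

// Find every occurrence of the quote in the document, comparing both
// sides after whitespace normalisation. Offsets refer to the normalised
// document text, not to the original string.
function findQuoteOffsets(documentText: string, exact: string): number[] {
  const haystack = normalizeWhitespace(documentText);
  const needle = normalizeWhitespace(exact);
  const offsets: number[] = [];
  if (needle.length === 0) {
    return offsets; // an empty quote cannot be anchored
  }
  let from = haystack.indexOf(needle);
  while (from !== -1) {
    offsets.push(from);
    from = haystack.indexOf(needle, from + 1);
  }
  return offsets;
}

// Text as it might come out of the full HTML page (extra spaces, newlines, tabs).
const fullPageText = "Annotations  should\n   re-anchor\ton both views.";
// Quote as selected in a reader view, where whitespace is already tidy.
const quoteFromReader = "should re-anchor on both";

console.log(fullPageText.indexOf(quoteFromReader)); // -1: the naive exact match fails
console.log(findQuoteOffsets(fullPageText, quoteFromReader)); // [ 12 ]
```

If the two views differ in more than whitespace (for example dropped or reordered text), this would not be enough and something like the fuzzy anchoring mentioned above would be needed.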
> By the way, could you perhaps clarify what you mean by the hypothes.is 
> <http://hypothes.is/> annotation library?
> 
AFAIU we use the same library that hypothes.is uses (or we ripped it out in some way).

Cheers
Oliver



> — Gerben
> 
> 
> 
> On 23/04/2020 16:02, Benjamin Young wrote:
>> Thanks for posting that here, Oliver!
>> 
>> We'll probably have to take those questions one at a time. 🙂 I'd actually 
>> like to start with the last one--as it spells out some target APIs and code.
>> 
>>> 3. What do you think will be the challenges to get Apache Annotator to work on 
>>> both a reader and a full-html version?
>> If I understand the use case correctly, you're wanting annotation on 
>> something like Firefox's "reader view" (where the original HTML is stripped 
>> away, and only the content remains) and wanting those same annotations to be 
>> re-anchor-able on the original HTML (and vice versa).
>> 
>> If that's indeed what you're after, then the "hard" part is making sure we 
>> have a way for implementations to "opt-in" to fuzzy anchoring when they both 
>> create and use an annotation.
>> 
>> For starters, you could simply store the TextQuoteSelector which *should* 
>> re-anchor on both those representations (and possibly even on a PDF), but it 
>> would come at the cost of performance on large documents. So, what you'd 
>> want to follow that up with is additional, narrower, more brittle selectors 
>> which would (knowingly) fail when you switch representations, but would give 
>> you better performance on a specific representation--i.e. you'd have an 
>> XPath or CSS selector for the original HTML which would fail on the "reader" 
>> and/or "PDF" view at which point you'd (knowingly) fall back to the 
>> TextQuoteSelector.
>> 
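The fallback strategy described above can be sketched roughly as follows, in TypeScript. The selector shapes follow the W3C Web Annotation Data Model (CssSelector with a value, TextQuoteSelector with exact/prefix/suffix). Everything else (View, makeResolver, anchorWithFallback, the sample data) is hypothetical and is not Apache Annotator's API. A Map stands in for the DOM so the sketch runs without a browser.

```typescript
// Selector shapes as in the Web Annotation Data Model.
type Selector =
  | { type: "CssSelector"; value: string }
  | { type: "TextQuoteSelector"; exact: string; prefix?: string; suffix?: string };

// Hypothetical stand-in for one representation of a document.
interface View {
  text: string;                  // plain text content of the view
  elements: Map<string, string>; // CSS selector string -> text of the matched element
}

// A resolver returns the matched text, or null if the selector fails on this view.
type Resolver = (selector: Selector) => string | null;

function makeResolver(view: View): Resolver {
  return (selector) => {
    if (selector.type === "CssSelector") {
      // Fast but brittle: only works where the markup is still present.
      return view.elements.get(selector.value) ?? null;
    }
    // Slower but robust: search the whole text for the quote.
    return view.text.includes(selector.exact) ? selector.exact : null;
  };
}

// Try selectors in the order given; the first one that resolves wins.
function anchorWithFallback(
  selectors: Selector[],
  resolve: Resolver
): { selector: Selector; match: string } | null {
  for (const selector of selectors) {
    const match = resolve(selector);
    if (match !== null) {
      return { selector, match };
    }
  }
  return null;
}

// Narrow selector first, TextQuoteSelector as the knowing fallback.
const storedSelectors: Selector[] = [
  { type: "CssSelector", value: "article p.intro" },
  { type: "TextQuoteSelector", exact: "fuzzy anchoring" },
];

const fullHtmlView: View = {
  text: "Menu Home About. We opt in to fuzzy anchoring here. Footer",
  elements: new Map([["article p.intro", "We opt in to fuzzy anchoring here."]]),
};
const readerView: View = {
  text: "We opt in to fuzzy anchoring here.",
  elements: new Map<string, string>(), // reader view stripped the original markup
};

console.log(anchorWithFallback(storedSelectors, makeResolver(fullHtmlView))?.selector.type);
console.log(anchorWithFallback(storedSelectors, makeResolver(readerView))?.selector.type);
```

On the full HTML view the CssSelector resolves first; on the reader view it fails and the TextQuoteSelector takes over. Storing the selectors in priority order keeps that decision with whoever creates the annotation.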
>> I think the core "plumbing" for that is already available, but Randall or 
>> Gerben would know better. 🙂
>> 
>> Is that what you're after?
>> 
>> Cheers,
>> Benjamin
>> 
>> 
>> --
>> 
>> http://bigbluehat.com/
>> 
>> http://linkedin.com/in/benjaminyoung
>> 
>> ________________________________
>> From: Oliver Sauter <[email protected]>
>> Sent: Wednesday, April 22, 2020 12:17 PM
>> To: [email protected]
>> Subject: Integrating Annotator into Memex
>> 
>> Hey folks,
>> 
>> I just had a call with Benjamin and we talked about the ability to integrate 
>> Annotator into getmemex.com <http://getmemex.com/>.
>> Right now we use the Hypothes.is <http://hypothes.is/> library, but it is 
>> causing us some trouble (mainly the RAM usage from hooking it into each 
>> tab).
>> 
>> We are also about to start developing a Pocket-style offline reader for 
>> desktop and mobile, into which we also want to integrate annotation 
>> capabilities.
>> This means there is an anticipated use case where people annotate a reader 
>> version and want to see those annotations successfully anchored on the live 
>> HTML page as well. A reader version will be missing a lot of the details 
>> usually used for anchoring annotations, so the challenge would be to make 
>> those annotations interoperable with Apache Annotator.
>> 
>> So the questions I have:
>> 1. How mature is Annotator in terms of its ability to replace the Hypothes.is 
>> annotation library? What still needs to be done, how much work is that, and 
>> where do you need help?
>> 2. How much work do you anticipate for a replacement?
>> 3. What do you think will be the challenges to get Apache Annotator to work on 
>> both a reader and a full-html version?
>> 
>> I’ve been looking forward to finding a way to collaborate so hopefully this 
>> time is the time!
>> Cheers
>> Oliver
>> 
>> 
>> 
>
