RE: Capturing text from Firefox

Look, Yuriy Fri, 18 Oct 2013 12:25:43 -0700

Benjamin,

Thank you very much for the answers.  We'll need to weigh our options.


Thank you,

Yuriy


-----Original Message-----
From: Benjamin Smedberg [mailto:benja...@smedbergs.us] 
Sent: Friday, October 18, 2013 8:55 AM
To: Look, Yuriy; dev-platform@lists.mozilla.org
Subject: Re: Capturing text from Firefox

On 10/17/2013 7:30 PM, Look, Yuriy wrote:
> I am working on the GUI automation component of a performance monitoring 
> product.  One common approach to monitoring an application is to periodically 
> capture text from the control where changes are expected (the content area of 
> the browser for Web applications).  Text capture ideally gets all text, 
> including non-selectable text and user input.
Have you looked into reusing the existing Selenium browser automation? 
http://docs.seleniumhq.org/

It's not clear exactly what kinds of problems you are trying to solve, but the 
Mozilla content layer already has ways to expose pretty much all of the DOM 
text, including nonselectable text, via various APIs including the DOM itself, 
the accessibility API, and some other lower-level functions we use for 
find-in-page.
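
To illustrate the DOM route, here is a minimal sketch in Python. It assumes you
have already obtained the page's serialized markup by some means (for example
from a browser automation tool); the class name TextCapture and the function
capture_text are hypothetical names made up for this example. It collects text
nodes regardless of whether they are selectable, plus the "value" attribute of
input elements. Note that serialized markup only carries the initial value of
an input, so text typed by the user would have to be read from the live DOM
instead.

```python
from html.parser import HTMLParser


class TextCapture(HTMLParser):
    """Collects text nodes plus the value attribute of <input> elements."""

    # Content of these elements is code or styling, not page text.
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "input":
            # Only the initial value is present in serialized markup.
            value = dict(attrs).get("value")
            if value:
                self.chunks.append(value)

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self.chunks.append(text)


def capture_text(markup):
    """Return the text chunks found in the given HTML markup, in order."""
    parser = TextCapture()
    parser.feed(markup)
    parser.close()
    return parser.chunks
```

Because this works on the document structure rather than on drawing calls, it
does not depend on how or when the browser paints the text.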

> In the product I work on this is achieved by (1) forcing the application to 
> re-draw the text in the window or part of the window of interest and (2) 
> hooking the functions that are responsible for text drawing during the time 
> interval in which the capture is performed.  Hooking is performed by modifying 
> the first bytes of the binary code of the hooked functions to jump to the 
> hooking functions, which process the same parameters and then jump back to 
> allow the hooked functions to do their job.
This will certainly not work in the future; see below.
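
For readers unfamiliar with the pattern described in the quoted text, the
following Python sketch shows the same control flow at a much higher level. It
is not binary patching: it swaps an attribute on a class instead of rewriting
machine code, and FakeRenderer, draw_text and install_hook are hypothetical
names invented for this illustration, not Windows or Firefox APIs. The hook
sees the same parameters as the original function and then hands control back
to it.

```python
import functools


class FakeRenderer:
    """Hypothetical stand-in for a text-drawing API (not a real OS call)."""

    def draw_text(self, text, x, y):
        return f"drew {text!r} at ({x}, {y})"


def install_hook(owner, name, on_call):
    """Redirect owner.name through on_call, then call the original.

    Returns a function that removes the hook again.
    """
    original = getattr(owner, name)

    @functools.wraps(original)
    def detour(*args, **kwargs):
        on_call(*args, **kwargs)           # observe the same parameters
        return original(*args, **kwargs)   # let the original do its job

    setattr(owner, name, detour)

    def uninstall():
        setattr(owner, name, original)

    return uninstall
```

The approach only captures text that actually passes through the hooked
function while the hook is installed, which is why it depends on forcing a
redraw.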
>
> Which functions/technologies are drawing the text?
>          Is drawing performed by "normal" Windows APIs, like DrawTextEx or 
> ExtTextOut, or is this no longer the case?
No. If I'm reading our bugs correctly, we're currently using a combination of 
HarfBuzz (http://freedesktop.org/wiki/Software/HarfBuzz/)
and Uniscribe/DirectWrite. We use Uniscribe only for Hangul, Mongolian, Indic, 
and Thai text, and intend to eventually use HarfBuzz for all text rendering.
>          Does it delegate drawing to another process?
Not yet, but we're working on having content processes similar to the way many 
other web browsers do.
> Does Ff cache drawn text, say, in memory device contexts, so that in case 
> the window or a region needs to be repainted, text does not need to be 
> redrawn and the window device context is updated through functions like BitBlt?
Yes.
>    If so, can such caching be disabled programmatically or through 
> configuration?
No, I don't think it's possible to disable the layer system any more.
> Does Ff patch Windows DLLs?
In a few specific cases, yes, but primarily to enforce a DLL blocklist for 
stability issues and to ensure that our crash reporting system isn't tampered 
with. In plugin processes we also hook a few event-system functions. I don't 
think that any of our hooking should affect the graphics/text subsystems.


>
> Are there other approaches to capturing text from Ff you can suggest?
Ultimately, I don't think that trying to capture text by hooking drawing 
functions is going to be successful in Firefox. You probably need to look at 
some combination of accessibility and DOM APIs, depending on your actual use 
case.
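
As a rough sketch of how periodic monitoring could sit on top of a DOM or
accessibility based capture, the Python function below polls a capture
callable and records what changed between snapshots. The name
poll_for_changes and its parameters are assumptions made for this example;
"capture" is any function you supply that returns the current page text as a
list of strings.

```python
import time


def poll_for_changes(capture, rounds, interval_s=1.0, sleep=time.sleep):
    """Call capture() `rounds` times and report differences between snapshots.

    Returns a list of (round_index, added, removed) tuples, one for each
    round whose snapshot differs from the previous one.
    """
    previous = None
    changes = []
    for i in range(rounds):
        snapshot = capture()
        if previous is not None and snapshot != previous:
            added = [t for t in snapshot if t not in previous]
            removed = [t for t in previous if t not in snapshot]
            changes.append((i, added, removed))
        previous = snapshot
        if i + 1 < rounds:
            sleep(interval_s)  # wait before taking the next snapshot
    return changes
```

The sleep function is a parameter only so that the loop can be exercised
without real delays; in normal use the default is fine.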

--BDS



