Re: Capturing text from Firefox

Benjamin Smedberg Fri, 18 Oct 2013 05:56:43 -0700

On 10/17/2013 7:30 PM, Look, Yuriy wrote:

> I am working on the GUI automation component of a performance monitoring
> product. One of the common approaches to monitoring an application is to
> periodically capture text from the control where changes are expected (the
> content area of the browser for Web applications). Text capturing ideally
> captures all text, including non-selectable text and user input.

Have you looked into reusing the existing Selenium browser automation? http://docs.seleniumhq.org/

It's not clear exactly what kinds of problems you are trying to solve, but the Mozilla content layer already has ways to expose pretty much all of the DOM text, including nonselectable text, via various APIs including the DOM itself, accessibility API, and some other lower-level functions we use for find-in-page.
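
As a rough illustration of what reading text from the document structure (rather than from drawing calls) looks like, here is a minimal Python sketch. It is not a Firefox or Selenium API: it uses only the standard library's html.parser on a static HTML string, and the names TextCollector and capture_text are invented for this example. In a live browser the equivalent would be walking the DOM and reading each field's current value; the static value attribute used here only reflects a field's initial content.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects text nodes plus the value attributes of <input> elements.

    Hypothetical helper for illustration; CSS such as user-select is
    ignored, so non-selectable text is collected like any other text.
    """

    # Elements whose character data is code or styling, not page text.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "input":
            # Static markup only carries the initial value of the field.
            value = dict(attrs).get("value")
            if value:
                self.chunks.append(value)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def capture_text(html):
    """Return the text chunks found in an HTML string, in document order."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

Calling capture_text on a page snapshot returns a list of strings in document order, which can then be compared between captures.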

> In the product I work on, this is achieved by (1) forcing the application to
> redraw the text in the window or part of the window of interest and (2)
> hooking the functions that are responsible for text drawing during the time
> interval in which the capture is performed. Hooking is performed by modifying
> the first bytes of the binary code of the hooked functions to jump to the
> hooking functions, which process the same parameters and then jump back to
> allow the hooked functions to do their job.

This will certainly not work in the future; see below.


> Which functions/technologies are drawing the text?
> Is drawing performed by "normal" Windows APIs, like DrawTextEx or
> ExtTextOut, or is this no longer the case?

No. If I'm reading our bugs correctly, we're currently using a combination of harfbuzz (http://freedesktop.org/wiki/Software/HarfBuzz/) and uniscribe/directwrite. We use uniscribe only for Hangul, Mongolian, Indic, and Thai text, and intend to eventually use harfbuzz for all text rendering.

> Does it delegate drawing to another process?

Not yet, but we're working on having content processes similar to the way many other web browsers do.

> Does Ff cache drawn text, say, in memory device contexts, so that in case the
> window or a region needs to be repainted, text does not need to be redrawn and
> the window device context is updated through functions like BitBlt?

Yes.

> If so, can such caching be disabled programmatically or through
> configuration?

No, I don't think it's possible to disable the layer system any more.

> Does Ff patch Windows DLLs?

In a few specific cases, yes, but primarily to enforce a DLL blocklist for stability issues and to ensure that our crash reporting system isn't tampered with. In plugin processes we also hook a few event-system functions. I don't think that any of our hooking should affect the graphics/text subsystems.


> Are there other approaches to capturing text from Ff you can suggest?

Ultimately, I don't think that trying to capture text by hooking drawing functions is going to be successful in Firefox. You probably need to look at some combination of accessibility and DOM APIs, depending on your actual use case.
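
For the periodic-monitoring use case described in the question, the capture source can be kept separate from the change detection. The Python sketch below is only one possible shape for that, under stated assumptions: TextChangeMonitor and its capture argument are hypothetical names, and capture stands for whatever function returns the current page text (for example, one built on the DOM or accessibility APIs).

```python
import hashlib


class TextChangeMonitor:
    """Reports captured text only when it differs from the previous capture."""

    def __init__(self, capture):
        # capture: hypothetical zero-argument callable returning the
        # current page text as a str.
        self._capture = capture
        self._last_digest = None

    def poll(self):
        """Capture once; return the text if it changed, otherwise None.

        The first poll always counts as a change.
        """
        text = self._capture()
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest == self._last_digest:
            return None
        self._last_digest = digest
        return text
```

A monitoring loop would call poll() on a timer and record a timestamp whenever the result is not None. Storing a digest instead of the full text keeps memory use constant for large pages.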


--BDS

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
