A few years ago we had the idea of putting the OCR text on separate
URLs, one per page (or similar), to improve search accessibility, and
we may yet get there.  We're working on making the OCR text available
for reading and correction, though it may not immediately be integrated
with the BookReader.

For the BookReader I might go with the new #! URL fragments, which are
designed to let web apps update the URL dynamically while still being
accessible to search engines:
http://code.google.com/web/ajaxcrawling/docs/specification.html
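
As a rough sketch of how that scheme works: a crawler that sees a
"#!" URL requests an equivalent URL in which the fragment is moved into
an _escaped_fragment_ query parameter, and the server answers that
request with a plain HTML snapshot (here, it could be the OCR text of
the page). The Python below only illustrates the URL mapping. The
function name, the example.org host and the page/n5/mode/2up fragment
are made up for the example, and the set of characters left unescaped
is an approximation of the specification, not a copy of it.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escaped_fragment_url(pretty_url: str) -> str:
    """Map a '#!' URL to the form a crawler would request.

    Hypothetical helper, not part of the BookReader.
    """
    parts = urlsplit(pretty_url)

    # Only fragments that start with '!' opt in to the scheme.
    if not parts.fragment.startswith("!"):
        return pretty_url

    state = parts.fragment[1:]

    # Percent-encode characters such as '&', '#', '%', '+' and spaces.
    # The 'safe' set here is an assumption chosen for readability.
    escaped = quote(state, safe="/=:,;@!$'()*")

    param = "_escaped_fragment_=" + escaped
    query = parts.query + "&" + param if parts.query else param

    # The fragment is dropped; the state now travels in the query string.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(escaped_fragment_url(
        "http://example.org/stream/somebook#!page/n5/mode/2up"))
```

With something like this, each page state the reader can show would
have a crawlable counterpart, so the per-page text could be indexed
under its own URL while humans keep the dynamic interface.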

   - mang

On 6/11/11 7:49 PM, Lars Aronsson wrote:
> Reading my own question again, I understand I didn't phrase it
> very well:
>> Can this be combined with making the text searchable
>> by web search engines, like plain web pages?
> Here's what I envision, and my question is if you have
> any plans going in this direction:
>
> In the bookreader, one should not only be able to zoom
> in and out or to activate the sound playback, but also to
> view the OCR text and proofread the OCR text (like a
> wiki page). To a search engine spider, only the view text
> option should be available, and the buttons for previous
> and next page should be plain links, so the text of each
> page gets indexed under the right page URL.
>
> The way I would want the bookreader to appear to a
> search spider is the way my existing website looks,
> this example being the first page of Hamlet, in the
> Swedish translation of 1861,
> http://runeberg.org/hagberg/a/0183.html
> Here is the scanned book page, and you can scroll
> down to the OCR text below.
>
> If you google the role names "Voltimand, Cornelius,
> Rosenkranz, Gyldenstern", you will see that it
> is indexed by Google at this very URL. (English and
> German editions spell the names a little differently.)
>
> I'd like to use the bookreader with its soft scrolling
> and book page flipping for humans, but I don't
> want to give up the direct per page indexing by
> Google and other search engines. So, can the
> two be combined? Did anybody try this?
>
>

_______________________________________________
Ol-discuss mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
