This is probably peripheral to the discussion on Emdash / Endash and Legacy 
FT's clean-up abilities.

One source of information that I use a lot is the digitized (Australian) 
newspapers at Trove <http://trove.nla.gov.au/newspaper/>. The OCR text usually 
requires correction (human recognition), and that includes changing Emdash to 
hyphen, inserting spaces, concatenating hyphenated words, etc.

But a simple Copy then Paste into a Notes field in Legacy isn't an option for 
me. Stripping the HTML doesn't improve the situation - but that's fine by me.

Routinely, I paste into a plain-text editor set up in a simple Encoding (for PC 
of course) - using an Encoding like Unicode BOM is asking for trouble, I have 
found. Then I correct the text, sometimes leaving it in newspaper-column 
format, or else making it more readable as continuous-text paragraphs.
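For anyone who would rather script the mechanical part of that clean-up, here is a minimal sketch in Python. It is not what I or Legacy actually do; the function name clean_ocr_text and its reflow option are made up for illustration, and the rules (dashes become plain hyphens, words split across a line break are rejoined, column lines are optionally reflowed into paragraphs) are assumptions based on the steps described above. The result still needs a human read-through.

```python
import re

# Dash characters to replace with a plain hyphen (an assumed choice).
DASHES = ("\u2014", "\u2013")  # em dash, en dash


def clean_ocr_text(raw: str, reflow: bool = True) -> str:
    """Hypothetical helper: tidy pasted OCR text before it goes into a note."""
    text = raw.replace("\r\n", "\n")

    # Rejoin words hyphenated across a line break, e.g. "news-\npaper".
    # Caveat: this also joins genuine compounds such as "common-\nplace",
    # so the output needs checking against the page image.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)

    # Replace em and en dashes with a plain hyphen.
    for dash in DASHES:
        text = text.replace(dash, "-")

    if reflow:
        # Blank lines separate paragraphs; within a paragraph, collapse the
        # newspaper-column line breaks and runs of spaces to single spaces.
        paragraphs = re.split(r"\n\s*\n", text)
        text = "\n\n".join(" ".join(p.split()) for p in paragraphs)

    return text
```

Calling clean_ocr_text(pasted, reflow=False) keeps the newspaper-column line breaks and only fixes the dashes and split words.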

This process is much lengthier than just changing a few single symbols such as 
an Emdash. But I have seen the 'conversion' by Legacy to '97', which is why I 
adopted the method described.
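One possible explanation for the '97' (my guess only, not confirmed for Legacy): in the Windows-1252 code page the Emdash is stored as the single byte 97 in hexadecimal, and the Endash as 96. The short Python check below shows this; the helper name cp1252_hex is made up for illustration.

```python
def cp1252_hex(ch: str) -> str:
    """Hypothetical helper: hex value of a character's Windows-1252 byte(s)."""
    return ch.encode("cp1252").hex()


print(cp1252_hex("\u2014"))  # em dash -> 97
print(cp1252_hex("\u2013"))  # en dash -> 96
```

If that is indeed what is happening, the '97' would be the byte value of the Emdash being shown instead of the character itself.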

I find it has a benefit - I read more attentively, and understand better. Some 
of the articles - and even common-place advertisements - are fascinating.

From Trove, an image or PDF file at a selectable range of resolutions can be 
saved to disk, so the scanned source imagery is available and can be attached 
to Legacy as a media file, if warranted.



Ian Thomas

Albert Park, Victoria 3206 Australia