Thanks for the article, very helpful. And yes, I too remember Ithaca, though most of it was spent downing bottles of Diet Coke late at night in Philips Hall.
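
A minimal sketch of the Python post-processing approach discussed in the quoted thread below, for anyone who wants a starting point. It assumes the OCR output is already decoded to a Python `str`, and it only handles the seven Latin ligatures that have their own Unicode code points (U+FB00 to U+FB06). Ligatures such as fb, fy, and ft are, as far as I know, font-level glyphs with no dedicated code point, so they should not appear as single characters in the output. The names `LIGATURES` and `expand_ligatures` are made up for illustration and are not part of Tesseract.

```python
# Hypothetical helper: names and mapping are illustrative, not from Tesseract.
# Maps each Latin ligature code point (U+FB00..U+FB06) to plain ASCII letters.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\ufb05": "st",  # long s + t ligature
    "\ufb06": "st",
}

# str.maketrans accepts a dict of single characters mapped to replacement strings.
_LIGATURE_TABLE = str.maketrans(LIGATURES)


def expand_ligatures(text: str) -> str:
    """Replace ligature characters with their ASCII letters; leave the rest untouched."""
    return text.translate(_LIGATURE_TABLE)


if __name__ == "__main__":
    sample = "The o\ufb03ce \ufb02oor was \ufb01nished."
    print(expand_ligatures(sample))
```

Unlike a blanket `unicodedata.normalize("NFKC", text)`, an explicit table only changes the listed characters, so accented letters and other non-ASCII text pass through unchanged.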
Michael Sander
michael.san...@gmail.com
607-227-9859


On Mon, Apr 29, 2013 at 10:54 PM, Sven Pedersen <sven.peder...@gmail.com> wrote:

> You appear to be a fellow Ithacan! (I no longer live there, but remember
> it fondly.)
>
> Anyway, other common ligatures include ff, ffi, ffl, fb, fy, ft
> http://ilovetypography.com/2007/09/09/decline-and-fall-of-the-ligature/
> Sven
>
> On Monday, April 29, 2013, Michael Sander wrote:
>
>> Yes, I'm doing something similar in Python. Do you know of a list of
>> ligatures so I can convert them to ASCII? I know fi and fl are the most
>> popular, but there are probably many more.
>>
>> Michael Sander
>> michael.san...@gmail.com
>> 607-227-9859
>>
>> On Mon, Apr 29, 2013 at 7:48 PM, Greg Dunkel <drdunk.g...@gmail.com> wrote:
>>
>>> I couldn't get the config to work on Ubuntu, so I wrote a post-processing
>>> sed script to convert the ligatures to two characters.
>>>
>>> On Mon, Apr 29, 2013 at 3:45 AM, Michael Sander
>>> <michael.san...@gmail.com> wrote:
>>>
>>>> How did you format your config file? I tried adding the following line
>>>> and it doesn't seem to work:
>>>>
>>>> tessedit_char_blacklist fi
>>>>
>>>> On Sunday, April 1, 2012 5:16:59 AM UTC-4, klo wrote:
>>>>>
>>>>> Thanks. I added it to my tesseract configuration file and it works
>>>>> great.
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Saturday, March 31, 2012 10:12:50 PM UTC+2, zdpo wrote:
>>>>>>
>>>>>> On 31.03.2012 16:17, klo wrote:
>>>>>>
>>>>>> In my simple testing, I find this to be the most common problem. Is
>>>>>> there a way to instruct tesseract not to use those glyphs without
>>>>>> limiting it to ASCII?
>>>>>>
>>>>>> I use tesseract 3.01 BTW
>>>>>>
>>>>>> Put them on the blacklist with the variable tessedit_char_blacklist
>>>>>> (search the forum if you do not know how).
>>>>>>
>>>>>> Zdenko
>>>>>>
>>>>>> --
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To post to this group, send email to tesseract-ocr@googlegroups.com
>>>> To unsubscribe from this group, send email to
>>>> tesseract-ocr+unsubscr...@googlegroups.com
>>>> For more options, visit this group at
>>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>>
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to tesseract-ocr+unsubscr...@googlegroups.com.
>>>>
>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>> --
>>> /greg
>>>
>>> ---
>>> You received this message because you are subscribed to a topic in the
>>> Google Groups "tesseract-ocr" group.
>>> To unsubscribe from this topic, visit
>>> https://groups.google.com/d/topic/tesseract-ocr/jO_4ZMMK9xw/unsubscribe?hl=en.
>>> To unsubscribe from this group and all its topics, send an email to
>>> tesseract-ocr+unsubscr...@googlegroups.com.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>
> --
> “All that is gold does not glitter,
> not all those who wander are lost;
> the old that is strong does not wither,
> deep roots are not reached by the frost.
> From the ashes a fire shall be woken,
> a light from the shadows shall spring;
> renewed shall be blade that was broken,
> the crownless again shall be king.”