Kevin Atkinson said:
> To spellcheck these special words you need to keep Aspell from trying to tokenize a string into words (i.e., "hello world!" gets split into "hello" and "world"). I think you need to use the C API for this.

Thanks. I understand now that my questions have to do with the tokenization, rather than with the spellchecking itself.
In applications like the Aspell demo for Windows in Delphi, aspell can leave the tokenization to the calling application, which can pass the words, one at a time, to aspell_speller_check. Is that correct? So if I were writing my own application to use aspell, I would not have a problem: I would just do the tokenization myself.

But I see that there are procedures in aspell (in the C API?), for instance aspell_document_checker_next_misspelling, which seem to accept a LINE of text and tokenize it before testing the words. I suppose this may be how the "aspell" command-line program does its tokenization, and applications like UltraEdit (which I use a lot) probably avail themselves of it too.

I'm not a C programmer, but if I knew where to look in the aspell source, I could try to see how difficult it would be to modify the tokenization there to treat apostrophe and hyphen as I want, either in response to a command-line option or even automatically, by looking at the special status of these characters in the relevant dictionary.

For the apostrophe, there must already be code to keep a word-internal apostrophe while removing a word-marginal one. The modification would be to keep the apostrophe in any position. For the hyphen, the modification would be to check the whole word first and, only if it is not found, to fall back to the present default of checking the parts.

Does it sound feasible? Any hints?

Ciarán Ó Duibhín
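
To make the proposed behaviour concrete, here is a rough sketch in Python of a tokenizer that keeps apostrophes and hyphens in any position, checks a hyphenated word whole first, and only then falls back to checking its parts. It is not taken from the aspell source and does not call aspell at all: the WORDS set is a hypothetical stand-in for a dictionary lookup such as aspell_speller_check, and the function names are invented for illustration. It also assumes the plain ASCII apostrophe and hyphen only.

```python
import re

# Hypothetical stand-in for the dictionary; a real application would ask
# the speller (e.g. via aspell_speller_check) instead of consulting a set.
WORDS = {"hello", "world", "an", "t-athair", "'s", "well", "known"}

# A token is any run of letters, apostrophes and hyphens.
# [^\W\d_] matches a Unicode letter (a word character that is not a digit
# or an underscore), so accented letters are kept.
TOKEN = re.compile(r"(?:[^\W\d_]|['-])+")


def tokenize(line):
    """Split a line into tokens, keeping ' and - in any position."""
    # Drop runs that contain no letter at all, such as a bare "--".
    return [t for t in TOKEN.findall(line) if any(c.isalpha() for c in t)]


def is_known(word, known=WORDS):
    """Case-insensitive lookup in the stand-in dictionary."""
    return word.lower() in known


def check_token(token, known=WORDS):
    """Return the misspelled pieces of one token (empty list if accepted)."""
    # Step 1: try the whole token, apostrophes and hyphens included.
    if is_known(token, known):
        return []
    # Step 2: for a hyphenated token, fall back to checking each part.
    if "-" in token:
        parts = [p for p in token.split("-") if p]
        return [p for p in parts if not is_known(p, known)]
    return [token]


def misspellings(line, known=WORDS):
    """Return every misspelled token or token part found in a line."""
    bad = []
    for token in tokenize(line):
        bad.extend(check_token(token, known))
    return bad
```

With this stand-in dictionary, "t-athair" is accepted as a whole word, "well-known" is accepted because both parts are present, and "'s" is accepted only with its leading apostrophe, since a bare "s" is not in the set.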