I just sent this message to LyX-Devel. Names of other projects I should contact about my intentions would be most appreciated.

-------- Original Message --------
Subject: Aspell and LyX
Date: Wed, 02 Feb 2000 04:59:44 -0500
From: Kevin Atkinson <[EMAIL PROTECTED]>
To: LyX-Developers <[EMAIL PROTECTED]>

Back in February of 1999 I posted a proposal to integrate Aspell into LyX. I have attached the relevant parts of the conversation as a text file for quick review by those who were here and to bring those new to this list up to speed.

However, since then I have realized that all too many programs are now implementing their own spell checker, which has suggestion intelligence about the same as ispell's but does not actually use ispell. This means that there is no way to use the better suggestion intelligence of Aspell, since there is no way to change the spell checker used.

Unfortunately my spell checker has two barriers against it being adopted by mainstream Open Source programs.

1) It is written in C++, and all too many Open Source projects are still in pure C.
2) It is written in very modern C++, which means it is not the most portable thing in the world.

So what I would like to do now, instead of coming up with an interface for just LyX, is come up with a pure C interface/library which will use Aspell if it is available and, if not, use ispell. Are you up to working with me on designing such an interface? I will handle the Aspell interface, and I will let someone else handle the ispell interface. I will also need lots of help, because I have no clue how to dynamically load code at run time. The code you write for this library will need to be under the LGPL, as I also want commercial programs to be able to use it.

My eventual goal is to have ALL programs use this library instead of either using ispell directly through a pipe or writing a spell checker of their own.
Other places where I should post my intentions would be appreciated, as I want this project to get as much exposure as it can.

--
Kevin Atkinson
[EMAIL PROTECTED]
http://metalab.unc.edu/kevina/
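[Editorial sketch, not part of the original message.] The "use Aspell if it is available and if not use ispell" idea can be illustrated with a table of function pointers, which is also a layout a pure C program could consume. Everything here is an assumption made for illustration: the `SpellBackend` struct, the `select_backend` function, and the stand-in backends are hypothetical, and a real library would probe for the backends at run time (for example by dynamic loading) rather than hard-coding the answers.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical backend table. It uses only a name and plain function
// pointers, so the same layout could be declared in a C header.
struct SpellBackend {
    const char *name;
    int (*available)(void);          // nonzero if the backend can be used
    int (*check)(const char *word);  // nonzero if the word looks correct
};

// Stand-in backends for illustration only. Real ones would wrap the Aspell
// library and an ispell pipe; here Aspell is simply reported as missing.
static int aspell_available(void) { return 0; }
static int aspell_check(const char *) { return 1; }
static int ispell_available(void) { return 1; }
static int ispell_check(const char *word) {
    return std::strcmp(word, "exampe") != 0;  // toy dictionary: one bad word
}

// Preference order: Aspell first, ispell as the fallback.
static const SpellBackend kBackends[] = {
    {"aspell", aspell_available, aspell_check},
    {"ispell", ispell_available, ispell_check},
};

// Return the first usable backend in preference order, or nullptr if none.
const SpellBackend *select_backend() {
    for (const SpellBackend &b : kBackends)
        if (b.available()) return &b;
    return nullptr;
}

// Small demonstration: with the stand-ins above, ispell is selected.
const char *demo_selected_name() {
    const SpellBackend *b = select_backend();
    assert(b != nullptr);
    return b->name;
}
```

The calling program only ever sees the selected `SpellBackend`, so swapping the preference order or adding a third backend would not change application code.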
Date: Wed, 11 Nov 1998 12:05:02 +0100
From: Asger Alstrup Nielsen <[EMAIL PROTECTED]>
To: Kevin Atkinson <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: Re: LyX 1.0 and Aspell

Hi!

I'm forwarding this to the LyX list in the hope that somebody will implement this small and useful feature request.

> Would it be possible to incorporate an option in the (pre-)Release
> version of LyX that will allow the user to choose the spell checker
> command somewhere in the spell checker options dialog box? This way
> people can use my spell checker without having to rename ispell or some
> sort of other fancy trick.

The easy solution is to provide a lyxrc command which can specify the spelling command. Could somebody do that? It should take half an hour if you have the sources compiled and set up for work.

Date: Mon, 28 Dec 1998 15:05:17 +0100 (MET)
From: Jean-Marc Lasgouttes <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Patch for LyX to work better with Aspell

>>>>> "Kevin" == Kevin Atkinson <[EMAIL PROTECTED]> writes:

Kevin> Here is a patch that will allow Aspell to learn from users'
Kevin> mistakes when used with LyX. All it does is store the
Kevin> replacement pairs when it detects that aspell is being used
Kevin> instead of Ispell.

Kevin> I would appreciate it if you could apply the patch to the 1.0
Kevin> branch because the change is minor. However I will understand
Kevin> if you think it is too major of a change.

Hello,

I added your patch to the 1.0 since it looks simple enough. Thanks.

Date: Tue, 16 Feb 1999 04:00:35 +0000
From: Kevin Atkinson <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: Aspell and LyX 1.1

Hi there. I was wondering if you are still interested in using Aspell (http://metalab.unc.edu/kevina/aspell) as the new LyX 1.1 spell checker. I would be willing to help you out if you would point me in the right direction.
The reason I ask is that I would like Aspell to be incorporated in at least one large project before I am comfortable with releasing it as version 1.0. The interface is still in a state of flux; however, it should stabilize soon. Early feedback on what sort of things you are looking for would be more than appreciated.

From: Jean-Marc Lasgouttes <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Aspell and LyX 1.1

It could be nice as long as it is optional (IMO). I guess we should have a generic spellchecker interface that is plugged at compile time to either ispell, aspell (library version, I guess) or KSpell (for klyx). However, I think we have to keep the support for plain old ispell.

From: Kevin Atkinson <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Aspell and LyX 1.1

Some of the fancy things you will soon be able to do in aspell that you can't do (or that will be very difficult to do) in ispell:

1) Have a different "ignore all" word list for each document so that you won't keep having to press ignore for special words you are not willing to insert into your personal dictionaries.
2) Skip over URLs, host names, and email addresses.
3) Intelligently spell check code and mathematics (aspell will figure out which words are variable names and skip over them. See the mailing list archive for how I plan on doing this).
4) Learn from users' misspellings.
5) Finally, a much better suggestion strategy.

Your current code will allow aspell to do #5 correctly. I have submitted code for #4; however, it has a few problems. Aspell can't do #2 because you insist on sending things one word at a time, which breaks up URLs and the like. Being able to do #3 will require a prescan of the document with all the symbols intact and without any sort of artificial breaking up of the text like you currently do. And #1 will require that you store word lists with the document.
So basically, in order to support Aspell to the fullest, your current spell checker code will require a major rewrite.

From: Asger K. Alstrup Nielsen <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: Aspell and LyX 1.1

> Kevin> So basically in order to support Aspell in the fullest your
> Kevin> current spell checker code will require a major rewrite.
>
> Anyway, the spellchecker code needs a major rewrite. By `support', do
> you mean a pipe-based interface as we have now, or the use of aspell
> as a library?

Kevin, you are more than welcome to rewrite the spell checking interface in LyX. The requirements are simple to present: all of what the current spell checker can do, and a few other additions:

1) Local words.
2) Easier support for different spell checkers. (On other platforms, such as Windows, there is probably a system API for this.)
3) Hide the spell checker communication.

Ideally, I'd like to have an interface where we pass a string const & of words that we want to spell check, and get a vector<pair<string const &, vector<string const &> > > back, where each misspelled word in the string has been mapped to a list of potential replacement words. (The current restriction that we only spell check one word at a time should be lifted, because this is unnecessarily restrictive. For instance, the spell checking interface should also be flexible enough for grammar checking.)

All the behind-the-scenes communication with the spell checker should be hidden from the user.

If you feel up to it, present a design here, and we can comment on it before you implement it.

Date: 18 Feb 1999 20:50:10 +0100
From: Lars Gullik Bjønnes <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: Re: Aspell and LyX 1.1

>> Kevin Atkinson writes:

KA> 1) Have a different "ignore all" word list for each document so

This is planned, and is also fairly easy to do for ispell.

KA> 2) Skip over URLs, host names, and email addresses.
When we have character styles, this will be easier.

KA> And #1 will require that you store word lists with the document.

Will be there in 1.1.x

KA> So basically in order to support Aspell in the fullest your
KA> current spell checker code will require a major rewrite.

Yes, and this is planned. (help is good)

From: Kevin Atkinson <[EMAIL PROTECTED]>
To: "Lars Gullik Bjønnes" <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: Re: Aspell and LyX 1.1

"Lars Gullik Bjønnes" wrote:
>
> >> Kevin Atkinson writes:
> KA> 1) Have a different "ignore all" word list for each document so
>
> This is planned, and is also fairly easy to do for ispell.

OK. But how do you manage multiple documents? Do you have a separate ispell process for each open document, or do you load and unload the word list before spell checking a specific document? Unloading and loading will work fine if you don't plan to do spell checking while you type. If you plan to do spell checking while you type, you would almost certainly need a separate process for each open document. Well, I guess you could load and unload the word list each time you change documents. But then how will you handle having multiple documents visible at once?

Aspell avoids this problem by having detachable dictionaries. Thus you can have multiple Aspell classes which share the main word list. Each of these Aspell classes can also have a separate "ignore all" dictionary. In fact, with aspell you can have as many dictionaries as you like, all of them being completely detachable.

From: Lars Gullik Bjønnes <[EMAIL PROTECTED]>
To: Kevin Atkinson <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: Re: Aspell and LyX 1.1

>> Kevin Atkinson writes:

KA> Ok. But how do you manage multiple documents. Do you have a

For Ispell we will most likely have to use multiple processes.
Note that I said "easy for ispell". I did not say elegant :-)

KA> Aspell avoids this problem by having detachable dictionaries.

This sounds really nice.

KA> So basically in order to support Aspell in the fullest your
KA> current spell checker code will require a major rewrite.

Until aspell is the de facto speller in the Unix world we will need to have an interface to ispell too, and hopefully we will be able to have an abstract interface to the different spelling processes.

From: Jean-Marc Lasgouttes <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Aspell and LyX 1.1

Kevin> Ok. But how do you manage multiple documents. Do you have a

We had plans for multiple ispell processes, but the reason was rather multi-language documents support. Somewhat related, I guess.

JMarc

From: Kevin Atkinson <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Aspell and LyX 1.1

Jean-Marc Lasgouttes wrote:
> We had plans for multiple ispell processes, but the reason was rather

Having multiple ispell processes with the same language will cause problems with their personal dictionaries, because when ispell saves its personal dictionary it simply writes the information to disk. If the personal dictionary has changed since the process started, it will overwrite the changes. This means that if you have two ispell processes, and in both processes the personal dictionary was changed, only one of the two modified personal dictionaries will be saved to disk, because the two ispell processes are unaware of each other and will blindly overwrite the changes the other one made.

How do you plan on dealing with this?

From: Lars Gullik Bjønnes <[EMAIL PROTECTED]>
To: Kevin Atkinson <[EMAIL PROTECTED]>
Cc: Garst R. Reese <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: Aspell and LyX 1.1

With some hacking we can make it _almost_ foolproof. But for ispell processes spawned from outside LyX there will still be a problem.
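[Editorial sketch, not part of the thread.] One way to picture the "blind overwrite" problem described above, and one possible kind of "hacking" around it, is to merge at save time: re-read what is on disk, then add only the words this session contributed. The thread does not say how LyX intended to solve this, so the approach and every name below (`WordSet`, `added_since`, `merge_for_save`) are assumptions. The sketch works on in-memory sets and ignores file locking, which a real implementation would still need.

```cpp
#include <cassert>
#include <set>
#include <string>

using WordSet = std::set<std::string>;

// Words this session added since it loaded its snapshot of the dictionary.
WordSet added_since(const WordSet &snapshot, const WordSet &current) {
    WordSet added;
    for (const std::string &w : current)
        if (snapshot.count(w) == 0) added.insert(w);
    return added;
}

// Merge-on-save: start from what is on disk *now* (which may contain words
// saved by another process), then add this session's new words on top.
WordSet merge_for_save(const WordSet &on_disk_now,
                       const WordSet &snapshot,
                       const WordSet &current) {
    WordSet merged = on_disk_now;
    const WordSet added = added_since(snapshot, current);
    merged.insert(added.begin(), added.end());
    return merged;
}

// Two sessions load the same file, each adds one word, and both save.
// With a blind overwrite the second save would drop "aspell".
WordSet demo_two_sessions() {
    WordSet disk = {"lyx"};
    WordSet a_snapshot = disk, a = disk;
    WordSet b_snapshot = disk, b = disk;
    a.insert("aspell");
    b.insert("ispell");
    disk = merge_for_save(disk, a_snapshot, a);  // session A saves first
    disk = merge_for_save(disk, b_snapshot, b);  // session B saves second
    assert(disk.size() == 3);
    return disk;
}
```

This only helps for processes that cooperate. An ispell process started outside the application would still write its own copy, which matches the remaining problem noted in the reply above.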
From: Jean-Marc Lasgouttes <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Aspell and LyX 1.1

Kevin> Having multiple ispell processes with the same language will

Are we absolutely forced to offer the possibility of spellchecking two documents at the same time? It does not look like a feature I'd be killing for.

From: Kevin Atkinson <[EMAIL PROTECTED]>
To: "Lars Gullik Bjønnes" <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: Re: Aspell LyX 1.1 and Exceptions

"Lars Gullik Bjønnes" wrote:
>
> KA> Now do you want the LyX spell checker interface to also throw
> KA> exceptions (with all of them stemming from a common base class
> KA> such as lyx_spell_error) or do you rather it catch all thrown
> KA> exceptions and toggle an error flag or something similar.
>
> KA> It doesn't really make a difference. If you wish for it to throw
> KA> exceptions it will also throw them when an ispell process
> KA> returns an error code.
>
> If we allow it to use exceptions we limit the range of usable
> compilers a great deal. Then we can throw out support for gcc 2.7.x at
> once.
>

OK, then I take it you don't want to use exceptions, as you still wish to support gcc 2.7.x? That's fine. In that case aspell support will only be compiled in if a compatible compiler is used, such as egcs... Otherwise it will use ispell....

Date: Sat, 13 Mar 1999 02:04:54 -0500
From: Kevin Atkinson <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: LyX New Spell Checker Interface

Here is a rough outline of what I had in mind for LyX's new Spell Checker. Let me know what you think. I will write the interface to aspell and rely on someone else to write the interface for Ispell. Sorry I took so long to write something out.

class SpellChecker;

class DictManager {
public:
  void add_sc(SpellChecker *);
  void remove_sc(SpellChecker *);
  SC_Error save_wls(); // save all word lists
  // a bunch of other dictionary management methods
};

class SpellChecker {
  typedef Itr ...
  // A forward (but preferably bidirectional) iterator
  typedef EndItr ...
  // An iterator such that if i is an Itr and e is an EndItr,
  // i == e if and only if i is at the end of the iterator
  // range. This can be the same as Itr and in most cases
  // it probably is.
public:
  SpellChecker(DictManager *);

  void set_language(const string &lang);
  string language() const;

  void restart(Itr c, EndItr e);
  // starts or restarts the process with a new iterator pair. Will stop
  // when it encounters a misspelling or reaches the end.
  // If spell checking has already started, c must be within one word of
  // where the spell checker stopped. If you need to skip over an area
  // use the scan_ahead method.
  void skip_word();
  // skip past the current misspelled word
  void continue();
  // continues the process. Will stop when it encounters a misspelling

  string word() const;
  // returns the misspelled word.
  Itr word_begin() const;
  // returns an iterator pointing to the beginning of the misspelled
  // word
  Itr word_end() const;
  // returns an iterator pointing to the end of the misspelled word
  bool at_end() const;
  // returns true if the spell checker reached the end of the iterator
  // range.

  void scan_ahead(Itr stop);
  // skips to position stop, gathering any necessary state information.
  void reset();
  // Resets all state information.
  void scan(Itr begin, Itr stop, EndItr end);
  // Scans from begin until end, gathering any necessary state
  // information. This has the potential of being much more efficient
  // if Itr is bidirectional.
  // Note: In order to properly support some of Aspell's advanced spell
  // checking modes it is important that you use the above three
  // methods to move around the document.
  Itr cur() const;
  // returns the location where the spell checker will resume checking
  EndItr end() const;
  // returns the end iterator

  WordList suggestions();
  // returns a list of suggestions for the current word

  void add_personal(const string &word);
  // add a word to the personal word list
  void add_session(const string &word);
  // add a word to the session or "ignore all" word list
  void save_all_wls();
  // save all relevant word lists
  void clear_session();
  // clear the session word list

  bool ignore_replacements();
  bool ignore_replacements(bool);
  void store_replacement(const string &cor, bool memory = true);
  // if ignore_replacements is not set, store the replacement pair
  // the memory parameter should be ignored
};

To give you an idea of how the spell checker works, let's assume we are spell checking this paragraph:

  This is a stupid exampe as I, Kevin Atkinson, cant think of intelligent to say. Anothr stupid. The closing sentence.

Let sc be the spell checker object, i be an iterator pointing to the beginning of the paragraph, and e be the end.

We first need to reset the spell checker

  sc.reset();

Now we need to start it at the beginning.

  sc.restart(i, e);

Now we need to find where we are, so

  i = sc.word_begin();

Ah, we are at "exampe". So let's get some suggestions.

  sugs = sc.suggestions();

OK, we want to replace it with "example", so we educate the spell checker of our choice

  sc.store_replacement("example");

And then we make the replacement in the document. However, doing this replacement invalidated the iterator, so we need to restart the scan with

  sc.restart(i, e);

where i is the location right before "example".

Now the next misspelling is "Atkinson". However, we want to ignore this word, so we simply skip over it

  sc.skip_word();

and continue on

  sc.continue();

Now it stops at "Anothr". However, the user sees that this sentence was a fragment and just wants to skip over it.
So we set i to the end of the sentence and use

  sc.scan_ahead(i);

Even though you could just restart at i, you shouldn't, as Aspell might need to gather some state information along the way (this is a poor example; a better example would be restarting in the middle of a sentence or such). So in order to skip over a region always use scan_ahead.

Now we continue on.

  sc.continue();

We don't use restart because, after the scan_ahead call, cur() is already where we want to start.

And we discover that sc.at_end() is true, so we stop.

Date: Sun, 14 Mar 1999 14:31:14 +0100
From: Asger Alstrup Nielsen <[EMAIL PROTECTED]>
To: Kevin Atkinson <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: LyX New Spell Checker Interface

> Here is a rough outline of what I had in mind for LyX new Spell
> Checker.

Thank you for taking the time to do this!

> Let me know what you think. I will write the interface to
> aspell and rely on someone else to write the interface for Ispell.

Ok, that sounds fair to me.

I have some comments on the code, and then some comments on the design as such.

[Note: comments on stylistic things, such as naming, removed]

What is this DictionaryManager used for?

> typedef Itr ... // A forward (but preferably bidirectional) iterator
> typedef EndItr ... // An iterator such that if i is an Itr and e is an EndItr

I don't think you should distinguish between the two kinds of iterators. An iterator can point to any element in the list, and at the end of the list.

The spellchecker should work on strings. The LyX data structure will be small strings, so you should simply use LString::const_iterator to make life simplest for us.

> void set_language(const string &lang);
> string language() const;

I think we need methods to define the encoding of the data. LyX can provide MIME-type encoding flags for the spell checker. This might be needed in the future, so add:

  void set_encoding(string const & encoding);
  string encoding() const;

> void continue();
> // continues the process.
Will stop when it encounters a misspelling
>
> string word() const;
> // returns the misspelled word.
> Itr word_begin() const;
> // returns an iterator pointing to the beginning of the misspelled word
> Itr word_end() const;
> // returns an iterator pointing to the end of the misspelled word

Maybe you could use a pair instead:

  /// Returns a range that surrounds the misspelled word
  pair<LString::const_iterator, LString::const_iterator> word_boundary() const;

> bool at_end() const;
> // returns true if the spell checker reached the end of the iterator range.

Since later you present the "cur" and "end" methods, there is no need for this one. The user can just do "if (spellchecker.current() == spellchecker.end()) { .. }"

> void reset();

What is this used for?

> void scan(Itr begin, Itr stop, EndItr end);

What is the purpose of this method "scan"?

> bool ignore_replacements();

What is this ignore_replacement state?

The design presents an interface on words. I think the design is fairly complete. However, the stuff about scanning, skipping and all that is a bit complicated. Why is this needed?

In general, I prefer to have a minimal interface. The one you present has many methods that overlap. We should try to cut these to a minimum.

Also, the language and encoding of a spellchecker is probably fixed. Can a spell checker change dictionary? I don't think ispell can, so we can't assume that. So I think we can get away with just passing these in the constructor of the spellchecker:

  SpellChecker(string const & language, string const & encoding);

We need to standardize the language strings. Maybe we should use the two letter ISO codes (us, de, dk, au, en)?

We don't need access methods for asking the spellchecker which language and encoding it is.

So based on your design, here is my proposal:

class SpellChecker {
  /** Create a spellchecker with given language and encoding.
      Also, a bunch of spell checker specific parameters can
      be specified.
   */
  SpellChecker(string const & language, string const & encoding,
               string const & parameters);

  /** What is the status of the spell checker?
      Before you use a spell checker, you have to make sure that it
      is ok. The user might not have any spell checker installed, so
      we have to return an error string in this case.
      If the spell checker is ok, we return an empty string here. */
  string errorStatus() const;

  /** Define the string the spell checker should work on.
      If it already is started, "start" must be within one word of
      where the spell checker stopped last time. If you need to skip
      over an area of the string, use the moveTo method. */
  void set_string(LString::const_iterator start,
                  LString::const_iterator end);

  /** Starts or restarts the spell checking.
      The spell checker will stop when it encounters a misspelling
      or reaches the end of the string. */
  void spellcheck();

  /// Skip the current word
  void skipWord();

  /** Return the current misspelled word.
      If we reached the end of the string, the returned string is
      empty. */
  LString word() const;

  /// Returns a range that surrounds the misspelled word
  pair<LString::const_iterator, LString::const_iterator>
  word_boundary() const;

  /** Skips to given position.
      This method is necessary because the spell checker might
      gather state information. */
  void moveTo(LString::const_iterator position);

  /// Returns the location where the spell checker is in the string
  LString::const_iterator current() const;

  /// Returns the location where the spell checker will end
  LString::const_iterator end() const;

  /// Returns a list of suggestions for the current word
  vector<LString const &> suggestions();

  /// Add a word to the personal word list
  void addPersonal(LString const & word);

  /// Add a word to the session or "ignore all" word list
  void addSession(LString const & word);

  /// Add a global replacement
  void addReplacement(LString const & word, LString const & replacement);

  /// Save all relevant word lists, replacements, etc.
  void save();
};

Date: Sun, 14 Mar 1999 16:31:31 -0500
From: Kevin Atkinson <[EMAIL PROTECTED]>
To: Asger Alstrup Nielsen <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: Re: LyX New Spell Checker Interface

Asger Alstrup Nielsen wrote:

> What is this DictionaryManager used for?

To avoid having to have separate master word lists in memory when using multiple documents, among other things.

> The spellchecker should work on strings. The LyX data structure will be small

Fine, then make the typedef for Itr and EndItr an LString::const_iterator. Also, my spell checker doesn't like it very much when you break things up. How about making an iterator class that will automatically treat multiple strings as if they were one unit? It will not be that difficult and it may actually make things simpler for you. I have my reasons for EndItr; however, they are not that important.

> I think we need methods to define the encoding of the data.

Yes, I forgot about that.

> Maybe you could use a pair instead:

OK, if you like it better.

> > bool at_end() const;
>
> Since later you present the "cur" and "end" methods, there is no need for this

That is long and ugly in my view.

> > void reset();
>
> What is this used for?

Needed when you restart the spell checker at the beginning. Naturally all state information should be reset.

> > void scan(Itr begin, Itr stop, EndItr end);
>
> What is the purpose of this method "scan"?

When you want to restart the spell checker in the middle of the document.

> What is this ignore_replacement state?

It is not really needed. It is used when you for some reason don't want to store replacement pairs.

> complete. However, the stuff about scanning, skipping and all that is a bit

OK. Suppose you have a spell checker mode where you spell check all comments of your code:

/* This is a sample block to spell check. And here is another
   sentence. etc...
*/
int main() {
  cout << "Hellow Word\n";
}

Now if the spell checker starts at, say, "another" in this document, how does it know whether it is in a comment? It doesn't. It has to scan from the beginning for the /* string.

> In general, I prefer to have a minimal interface. The one you present has many

Which ones? Other than at_end? Some of those methods are there for speed and flexibility.

> Also, the language and encoding of a spellchecker is probably fixed. Can a

That is what the dictionary manager class is for.

> So I think we can get away with just passing these in the constructor of the

Fine, if you don't want to one day support multilingual documents. It should be fairly easy to pull this off with both ispell and aspell with the help of the dictionary management class.

> We need to standardize the language strings. Maybe we should use the two

Could you give me a reference?

> We don't need access methods for asking the spellchecker which language and

They don't do any harm. And also I may want this interface to work with other projects.

> So based on your design, here is my proposal:

Use typedefs instead of hardcoding LString everywhere! I will budge on other things but NOT this!

> class SpellChecker {
> /** Create a spellchecker with given language and encoding.
> Also, a bunch of spell checker specific parameters can
> be specified. */
> SpellChecker(string const & language, string const & encoding,
> string const & parameters);

See my note above. Putting all this in the constructor loses flexibility.

The rest has some problems. But most of them are covered above.

Date: Sun, 14 Mar 1999 20:12:30 -0500
From: Kevin Atkinson <[EMAIL PROTECTED]>
To: Asger Alstrup Nielsen <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: Re: LyX New Spell Checker Interface

Asger Alstrup Nielsen wrote:

> The spellchecker should work on strings.
The LyX data structure will be small

Only giving the Spell Checker small strings at a time will work for a simple-minded spell checker that only wants to look at a document a word at a time; however, it will not work for spell checkers that want to be able to see the entire document at once. Aspell is going to eventually want to see the entire document at once in order to support some advanced skipping and suggestion algorithms. Two such algorithms are word skipping by context (see http://franklin.oit.unc.edu/cgi-bin/lyris.pl?visit=aspell&id=79941057) and suggesting close matches that exist elsewhere in the document before looking for matches in the dictionary.

Now, in order for these algorithms to work, Aspell will need to first prescan the document to build a database of the words in the document. In order to get this prescan, Aspell will need to iterate to the end of the document before it returns anything. If you only give it small segments of the document at once, there is no way aspell can do this. Thus Aspell needs a continuous iterator that will represent the entire document.

Writing such an iterator is not that difficult, provided you have some sort of container where all the individual little strings are held:

class doc_iterator {
private:
  typedef ... StringCollectionItr;
  typedef StringCollectionItr::iterator StringItr;
  // StringCollectionItr is a bidirectional iterator of pointers to strings

  StringCollectionItr strs_begin_;
  StringCollectionItr strs_cur_;
  StringCollectionItr strs_end_;
  StringItr begin_;
  StringItr cur_;
  StringItr end_;
public:
  doc_iterator(StringCollectionItr bbegin, StringCollectionItr eend);
  void jump_to(StringCollectionItr strs_current, StringItr current);

  doc_iterator& operator++() {
    ++cur_;
    if (cur_ == end_ && strs_cur_ != strs_end_) {
      ++strs_cur_;
      begin_ = strs_cur_->begin();
      cur_ = begin_;
      end_ = strs_cur_->end();
    }
    return *this;
  }
  doc_iterator operator++(int) {
    doc_iterator temp = *this;
    operator++();
    return temp;
  }
  doc_iterator& operator--() {
    if (cur_ == begin_ && strs_cur_ != strs_begin_) {
      --strs_cur_;
      begin_ = strs_cur_->begin();
      end_ = strs_cur_->end();
      cur_ = end_;
      --cur_;
    } else {
      --cur_;
    }
    return *this;
  }
  doc_iterator operator--(int) {
    doc_iterator temp = *this;
    operator--();
    return temp;
  }
  const char & operator*() const {return *cur_;}

  StringCollectionItr string_collection_iterator() const {return strs_cur_;}
  StringItr string_iterator() const {return cur_;}
};

And whenever you need to find the actual location, all you need to do is call the string_collection_iterator() or string_iterator() methods.

However, this does not take into consideration the fact that you may need to provide space between the strings if your strings are like this:

  "This is a"
  "dog jumping over a fence."

However, that is also quite possible to do. It just won't be as simple and you will have to return "char" instead of "const char &" when the iterator is dereferenced.

From: Asger K. Alstrup Nielsen <[EMAIL PROTECTED]>
Date: Mon, 15 Mar 1999 10:30:53 +0100 (MET)
To: Kevin Atkinson <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: LyX New Spell Checker Interface

> Only giving the Spell Checker small strings at a time will work for a

I see and acknowledge the relevance of this.

> Writing such an iterator is not that difficult providing you have

Yes, this is a good solution, and very useful in other situations as well. Consider the LyX document one giant string from now on.

Date: 15 Mar 1999 18:37:31 +0100
From: Lars Gullik Bjønnes <[EMAIL PROTECTED]>
To: Kevin Atkinson <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: Re: LyX New Spell Checker Interface

KA> "Lars Gullik Bjønnes" wrote:
>> Is it Aspell that needs it, or your interface to aspell?

KA> What Aspell Needs. Well it doesn't really need it it is just
KA> that it has the potential to do a better job with access to the
KA> entire document at once.

I am not convinced that it is possible/easy to see a complete LyX buffer/document as a long string with context; that would almost be similar to writing out the LyX file, spellchecking that, and reloading.

Wouldn't it be better to use the context the insets provide more directly?

Why see more context than a complete paragraph at a time? Sometimes I believe that a single inset provides all the context you need too.

Date: Mon, 15 Mar 1999 17:45:11 -0500
From: Kevin Atkinson <[EMAIL PROTECTED]>
To: "Lars Gullik Bjønnes" <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: Re: LyX New Spell Checker Interface

"Lars Gullik Bjønnes" wrote:

> I am not convinced that it is possible/easy to see a complete lyx

No. Pay close attention to the iterator model I gave you. It doesn't copy a single thing; it just iterates over multiple strings as IF they were one. It never ever makes a copy of anything except perhaps a single character. You can always find out where you are, as it returns iterators pointing to the real inset and the location within that inset.

> Wouldn't it be better to use the context the insets provide more

That would be more complicated. I am trying to avoid a lot of LyX-specific code.

> Why see more context than a complete paragraph at a time?
sometimes I

Not necessarily. That is the same philosophy as "two digits for the date is more than enough". Or "640K of memory is more than anyone needs". Or "why would anyone need a color monitor?"

Date: Tue, 16 Mar 1999 00:36:23 +0100 (MET)
From: Asger K. Alstrup Nielsen <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: LyX New Spell Checker Interface

> I am not convinced that it is possible/easy to see a complete lyx

The difference is that the iterator can perform the reverse lookup: given a position in the "long string", we can actually find the exact spot in the original data structure in constant time. We need this in order to highlight the misspelled words...

This is not easy if we export the lot to a file and then reload: we will not be able to map the positions in the file to the positions in the document representation data structure.

> Wouldn't it be better to use the context the insets provide more

Since I suggest that different fonts should be in different insets, the context of an inset can be very small.

I personally considered just presenting each paragraph as one string with an iterator like the other one, but the added complexity of exposing the entire document as one string is very small. Basically, Kevin presented the code that is needed.

Notice that the cost of this kind of iterator is very small: we just need to hold a BufferIterator inside it. Thus, the memory usage is minimal, the performance is optimal, and the semantics are crystal clear. Also, all the necessary code can be contained in one header file, so why not do it? It enables Aspell to use some advanced spell checking routines.

> Why see more context than a complete paragraph at a time? sometimes I

As mentioned, I imagine that a paragraph will be made up from a bunch of small insets. So we need some means to collapse all these

Also, having this interface will make search-replace trivial: we can simply use the STL find and the STL replace.
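[Editorial sketch, not part of the thread.] The two claims above, that STL algorithms work directly on a "whole document" iterator and that a hit can be mapped back to its place in the original pieces, can be tried out with a small forward iterator over a vector of strings. This is a simplified stand-in rather than LyX's BufferIterator or the doc_iterator posted earlier: it stores indices instead of iterator pairs, inserts no space between pieces, and the names `DocIterator`, `doc_begin`, `doc_end`, `segment`, and `offset` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Forward iterator that walks a vector of strings as if it were one string.
// Nothing is copied; the iterator only remembers (segment, offset).
class DocIterator {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = char;
    using difference_type = std::ptrdiff_t;
    using pointer = const char *;
    using reference = const char &;

    DocIterator() = default;
    DocIterator(const std::vector<std::string> *strs, std::size_t seg, std::size_t pos)
        : strs_(strs), seg_(seg), pos_(pos) { skip_exhausted(); }

    reference operator*() const { return (*strs_)[seg_][pos_]; }
    DocIterator &operator++() { ++pos_; skip_exhausted(); return *this; }
    DocIterator operator++(int) { DocIterator old = *this; ++*this; return old; }
    bool operator==(const DocIterator &o) const { return seg_ == o.seg_ && pos_ == o.pos_; }
    bool operator!=(const DocIterator &o) const { return !(*this == o); }

    // Reverse lookup: which piece, and where inside it, this position refers to.
    std::size_t segment() const { return seg_; }
    std::size_t offset() const { return pos_; }

private:
    // Move past the end of a piece (and past empty pieces) to the next one.
    void skip_exhausted() {
        while (seg_ < strs_->size() && pos_ >= (*strs_)[seg_].size()) {
            ++seg_;
            pos_ = 0;
        }
    }
    const std::vector<std::string> *strs_ = nullptr;
    std::size_t seg_ = 0;
    std::size_t pos_ = 0;
};

DocIterator doc_begin(const std::vector<std::string> &s) { return DocIterator(&s, 0, 0); }
DocIterator doc_end(const std::vector<std::string> &s) { return DocIterator(&s, s.size(), 0); }

// Find the first 'j' across both pieces and report where it really lives.
std::size_t demo_find_segment() {
    const std::vector<std::string> doc = {"This is a", "dog jumping over a fence."};
    DocIterator hit = std::find(doc_begin(doc), doc_end(doc), 'j');
    assert(hit != doc_end(doc));
    return hit.segment();
}
```

Because `segment()` and `offset()` are just stored members, the reverse lookup is constant time, which is the property the highlighting argument above relies on.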
One comment to Kevin:

Notice that we need to handle the issue of wide strings... In LyX 1.1, the LString is going to be compile-time variable: either char or wchar_t. We need to handle this in some way...

Date: Wed, 24 Mar 1999 04:00:17 -0500
From: Kevin Atkinson <[EMAIL PROTECTED]>
To: Asger K. Alstrup Nielsen <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: Re: LyX New Spell Checker Interface

"Asger K. Alstrup Nielsen" wrote:

> Notice that we need to handle the issue of wide strings... In LyX 1.1,

Except that, can you really rely on wchar_t being really wide? From what I have heard, it is best to use a type whose size you know and typedef it.

Anyway, being able to handle wide strings won't be a severe issue. All it will take is an iterator to translate the wide encoding into something 8-bit. The next aspell release will have code very similar to this.

What type of encoding are your wide characters going to be?

[The rest of this thread contains a bunch of communication back and forth between LyX developers about creating the iterator and changes in the inset design]
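[Editorial sketch, not part of the thread.] The "iterator to translate the wide encoding into something 8-bit" mentioned above could look roughly like the adaptor below. The thread never settles which wide encoding LyX would use, so this sketch assumes Unicode code points held in a fixed-width typedef (following the advice not to rely on wchar_t) and ISO-8859-1 as the 8-bit target. The names `WideChar`, `to_latin1`, `NarrowingItr`, and `narrow_copy` are hypothetical and are not Aspell's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Fixed-width character type, so the code does not depend on sizeof(wchar_t).
typedef std::uint32_t WideChar;  // assumed to hold one Unicode code point

// Map one code point to ISO-8859-1, or '?' when it has no Latin-1 equivalent.
inline char to_latin1(WideChar c) {
    return c <= 0xFF ? static_cast<char>(static_cast<unsigned char>(c)) : '?';
}

// Minimal input-iterator adaptor: dereferencing yields the 8-bit translation.
// It returns char by value because the translated character is a temporary.
template <typename WideItr>
class NarrowingItr {
public:
    explicit NarrowingItr(WideItr it) : it_(it) {}
    char operator*() const { return to_latin1(*it_); }
    NarrowingItr &operator++() { ++it_; return *this; }
    bool operator==(const NarrowingItr &o) const { return it_ == o.it_; }
    bool operator!=(const NarrowingItr &o) const { return it_ != o.it_; }
private:
    WideItr it_;
};

// Walk a wide buffer through the adaptor, as an 8-bit spell checker would.
std::string narrow_copy(const std::vector<WideChar> &wide) {
    typedef NarrowingItr<std::vector<WideChar>::const_iterator> Itr;
    std::string out;
    for (Itr i(wide.begin()), e(wide.end()); i != e; ++i)
        out.push_back(*i);
    return out;
}

// "LyX" followed by the euro sign, which Latin-1 cannot represent.
std::string demo_narrow() {
    const std::vector<WideChar> wide = {0x4C, 0x79, 0x58, 0x20AC};
    std::string narrow = narrow_copy(wide);
    assert(narrow.size() == wide.size());
    return narrow;
}
```

Replacing unrepresentable characters with '?' keeps positions aligned one-to-one with the wide text, so a misspelling found in the 8-bit view can still be mapped back to the same index in the document.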
