On Tue, 30 Jun 2020 at 20:04, Linas Vepstas <linasveps...@gmail.com> wrote:
>
> Hi Amirouche,
>
> There are other, far more sophisticated ways of doing spell checking. The
> best way is to do context-dependent checking... and link-grammar provides
> extremely precise context. For example, "I gave him teh hammer" -- the LG
> spelling guesser offers up "the", "then", "ten" as possible fixes. However,
> only one of these choices leads to a grammatically-correct sentence. Thus,
> the other spelling guesses can be discarded because they lead to grammatical
> nonsense.

Exactly! That is my goal.

> Because of this, spelling checkers and POS taggers are NOT used in the
> language pipeline, and, based on practical experience, they actually make
> things worse. That is, they make suggestions and provide tags that are
> misleading or wrong, and lower the quality of the results -- It turns out
> that English grammar provides tight constraints on what is possible, and
> those constraints are much tighter (and have higher
> accuracy/recall/precision) than single-word taggers.
>
> Such statistical systems might currently do as well as or better than grammar,

They will always fail on new grammar rules, and I guess retraining the
algorithm on those is painful. Maybe new grammar rules are not a use case,
since nowadays there is a grammar-police robot in every browser...

> since, it turns out, writing a complete grammar is impossible; there is
> always yet one more, extremely rare exception to every rule. Roughly
> speaking, grammar rules have a Zipfian distribution.

Probably yes, with the approach taken by link-grammar, where only a couple
of experts can edit the grammar. My goal is to also make it easy for the
user to improve the grammar, hence the single-source-of-truth database that
will include the dictionary of words, the grammar relations between words,
and test cases for each relation.

I am still nurturing the idea of letting a user extend the grammar through
the conversational interface itself, starting from an unambiguous subset of
English grammar that would be the seed of the system. I still need to
create a proof of concept of this.

> "A sat solver and an okvs" -- we've played with SAT. Basically, the SAT
> solvers are slower. They used to be sometimes-faster, but Amir's work fixed
> that. I don't know what OKVS is.

How much slower? 2 or 3 times slower? Or more like 10 or 100 times slower?

My primary use for parsing sentences is to be able to have some kind of
limited conversation with a human.
In a _narrow_ first step, I do not plan to parse essays by Kant or Goethe.

> p.s. LG is written in C, and so it can only use those spelling-guessers that
> have a C api. The current choices are aspell and hunspell, -- once upon a
> time, "hunspell" was "better" but seems to no longer be maintained. Again --
> aspell is used to provide suggestions, and parsing determines which of these
> suggestions is correct (if any).

Thanks for the enlightening conversation. I was imagining such a thing but
did not have time to look into aspell's intricacies.

I think a clean implementation of the ideas behind Link Grammar could be
useful for OpenCog, at least as a way to approach the Link Grammar C/C++
codebase. You can compare my project to MINIX vs. Linux: MINIX is
pedagogical, Linux is industrial.

It seems to me that learning grammar in an unsupervised way is not ready
yet? (A little bit unrelated to this topic: I think we already discussed
that.)

My plan is to do most of the still-fuzzy stuff in Scheme and reuse existing
off-the-shelf libraries like minisat where it makes sense. Performance is
very important, but being able to comprehend the whole system is much more
important to me. Spell checking, or candidate selection, is a good example
of a place where one can fine-tune the accuracy and speed of the algorithm.
In my case, I prefer a simple algorithm that can possibly scale and that is
easy to use over complex machinery that handles all the cases but is
difficult to work with.

I will learn patience :)

Thanks!
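
A minimal sketch of the context-dependent checking discussed in this thread,
using the "I gave him teh hammer" example: a spelling guesser proposes
candidates, and a grammar check keeps only those that yield a valid sentence.
Everything here is an assumption made for illustration. The toy `LEXICON`,
the `ACCEPTED` category patterns, and the helper names are hypothetical
stand-ins and are not the link-grammar or aspell APIs. Edit distance plays the
role of the guesser, and a lookup table of category sequences plays the role
of the parser.

```python
from itertools import product

# Toy lexicon (assumption): word -> category. A real system would use a
# full dictionary with grammar relations instead of coarse categories.
LEXICON = {
    "i": "PRON", "him": "PRON", "gave": "VERB",
    "the": "DET", "ten": "NUM", "then": "ADV",
    "hammer": "NOUN_SG", "hammers": "NOUN_PL",
}

# Toy grammar (assumption): the only category sequences accepted as sentences.
ACCEPTED = {
    ("PRON", "VERB", "PRON", "DET", "NOUN_SG"),
    ("PRON", "VERB", "PRON", "DET", "NOUN_PL"),
    ("PRON", "VERB", "PRON", "NUM", "NOUN_PL"),
}


def edit_distance(a, b):
    """Optimal string alignment distance: Levenshtein plus adjacent swaps."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[len(a)][len(b)]


def guesses(word, max_distance=2):
    """Known words close to `word`, nearest first (the spelling-guesser role)."""
    if word in LEXICON:
        return [word]
    scored = sorted((edit_distance(word, known), known) for known in LEXICON)
    return [known for dist, known in scored if dist <= max_distance]


def is_grammatical(words):
    """Grammar check (the parser role): is the category sequence accepted?"""
    return tuple(LEXICON[w] for w in words) in ACCEPTED


def correct(sentence):
    """Return every respelling of `sentence` that passes the grammar check."""
    options = [guesses(w) for w in sentence.lower().split()]
    return [" ".join(c) for c in product(*options) if is_grammatical(c)]


if __name__ == "__main__":
    print(guesses("teh"))                    # ['ten', 'the', 'then']
    print(correct("I gave him teh hammer"))  # ['i gave him the hammer']
```

With this toy grammar, "ten" is rejected because it needs a plural noun and
"then" is rejected because an adverb cannot fill the determiner slot, so only
"the" survives. The guesser stays deliberately simple, since the grammar check
does the disambiguation.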