Package: wordnet
Severity: serious

Forwarded from Launchpad by Sundaram Ramaswamy
LP: #305407
---------------------------------------------------------------
Hi,

I am working with Wordnet for a particular project. I installed Wordnet in Ubuntu via Synaptic, the latest package to date. I tried searching for "automata" in the Wordnet Browser (bash command: wnb) and it returned 0 results, while Wordnet installed on Windows (installer from Wordnet's site) shows a couple of definitions for "automata". In fact, the latest version of Wordnet for Windows is just 2.1 while Linux's is 3.0.

Basically, Wordnet's function morphstr() is supposed to give the root words for a given inflected word. For example, when "knifes" is given to morphstr, it returns "knife". Likewise, for "axes" it should return "ax", "axe" and "axis". It first searches an exception list file (because of peculiar cases like "axes"); when the word has an entry there, it returns the file's results. If the word is not found in the list, it tries to predict the root. While the prediction part (e.g. "knifes") works fine in Ubuntu, the search-from-file part doesn't (e.g. "axes", "automata", etc.).

When I compared the source code of Wordnet (morph.c for Windows and Linux), it is the same for both OSs (they have just used preprocessor switches for the differences). This needs to be fixed on our side, since Wordnet's source code doesn't have any errors or diffs: the same code is present on both OSs. The Windows installer was packaged by the Wordnet developers themselves, while the deb was packaged from their source by one of the Ubuntu/Debian maintainers, I guess.

PS: When I wrote my own code and tried using morphstr(), I could spot the error with Ubuntu's packaged wordnet.lib. morphstr takes two args: 1: the inflected word, 2: the POS (Part of Speech: NOUN, VERB, etc.). E.g. morphstr("knifes", NOUN); returns "knife" using the prediction technique (works right in Ubuntu). When I call morphstr("automata", NOUN) it returns NULL, but when I call morphstr("automata", NOUN - 1); it returns "automata".
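The lookup order and the symptom described above can be modelled with a small self-contained C sketch. It does not use the WordNet library: every name in it (toy_morph, the TOY_ constants, the tiny exception tables, the single strip-a-trailing-"s" rule) is hypothetical and made up for illustration, and the direction of the index shift is only an assumption chosen to reproduce the reported behaviour, not a statement about what the packaged code actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical POS codes, 1-based like the NOUN/VERB arguments in the report. */
enum { TOY_NOUN = 1, TOY_VERB = 2, TOY_NUM_POS = 2 };

struct exc_entry { const char *inflected; const char *base; };

/* Tiny stand-ins for the per-POS exception list files. */
static const struct exc_entry noun_exc[] = {
    {"automata", "automaton"}, {"axes", "axis"}, {NULL, NULL}
};
static const struct exc_entry verb_exc[] = {
    {"went", "go"}, {NULL, NULL}
};

/* Tables are stored zero-based: slot 0 = noun, slot 1 = verb. */
static const struct exc_entry *exc_tables[TOY_NUM_POS] = { noun_exc, verb_exc };

static const char *lookup_exception(const char *word, int slot)
{
    if (slot < 0 || slot >= TOY_NUM_POS)
        return NULL;
    for (const struct exc_entry *e = exc_tables[slot]; e->inflected; e++)
        if (strcmp(e->inflected, word) == 0)
            return e->base;
    return NULL;
}

/*
 * Exception list first, suffix prediction second.
 * off_by_one != 0 models the assumed indexing bug: the 1-based POS is
 * used directly as a zero-based table slot, so NOUN reads the verb table.
 */
static const char *toy_morph(const char *word, int pos, int off_by_one,
                             char *buf, size_t buflen)
{
    assert(word != NULL && buf != NULL);

    int slot = off_by_one ? pos : pos - 1;
    const char *hit = lookup_exception(word, slot);
    if (hit)
        return hit;

    /* Toy prediction rule: strip one trailing 's'. */
    size_t n = strlen(word);
    if (n > 1 && n < buflen && word[n - 1] == 's') {
        memcpy(buf, word, n - 1);
        buf[n - 1] = '\0';
        return buf;
    }
    return NULL;
}
```

In this model, with off_by_one set, toy_morph("automata", TOY_NOUN, ...) misses the exception entry and returns NULL, while toy_morph("automata", TOY_NOUN - 1, ...) finds it; the prediction path for "knifes" is unaffected either way, which matches the pattern in the report.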
Likewise, for any word that has an entry in the exception list file, passing the actual POS value minus 1 gives the proper values. It has some array indexing issue, I believe.

The reason why Wordnet Browser doesn't show the definitions of "automata" in Linux is that morphstr(), when called with the proper POS value, returns NULL, while in Windows it returns correct values for the same set of arguments, so Wordnet Browser in Windows shows them.
-------------------------------------------------------------

I noticed that 51_overflow.patch modifies the index, but it does not handle it correctly, and the change is not needed anyway. The attachment is the new 51_overflow.patch with some hunks dropped. It works well now.

-- 
YunQiang
51_overflows.patch
Description: Binary data