On Tuesday, November 18, 2014 2:37:22 AM UTC-8, Erik Christiansen wrote:
> On 17.11.14 10:57, Graham Lawrence wrote:
> > For my test file the awk program tagged some 3500 words, with 1960 of them
> > unique, so this vim script must run within a loop to avoid the tedium and
> > 4000 odd keystrokes required to invoke it individually for each unique
> > error,
>
> Er, what script loop, and what "4000 odd keystrokes [per] error", if one
> may be so bold?

A while loop to enclose the mapping as you saw it; I never add such details until I have the rest of the code working satisfactorily. Not 4000 keystrokes per error: ~4000 for all 1960 unique errors, with a 2-keystroke code to invoke the mapping. All of which is redundant now, as I realized I could cut it to one keystroke per error by splitting it into 2 mappings, the 2nd of which ends by reinvoking the first. That eliminated the need for user input entirely.

> If the list of good words is read into an associative
> array (lets call it "list") in the BEGIN action, and membership tested
> with an "if (word in list) ..." in an unconditional action handling the
> input stream, _and_ the unrecognised words (sans @@) are printed to
> another file, then it is only necessary to open that file in vim, and
> for each word (one per line), hit ":.w >> /path/goodfile" for each word
> which we accept as good. With that aliased to a key of choice, only one
> keystroke is required to qualify each word. Both awk and vim are run
> once per session, handling thousands of words each time, if you have
> them. Four thousand keystrokes would handle 4000 errors.

In practice, it is not that straightforward. I'm not sure how awk organizes arrays internally, but I used a plain numeric index, as I figured it must use an address array to reference the words array, and with a numeric index I could use a binary search to locate the word. I think an associative array must use a linear search, because awk has no way of knowing whether the array is actually in sequence. And of course, I have a second array of word suffixes to consult if the word of interest is not in its root form.

>
> If these are e.g. ordinary English words, is it acceptable to read in
> e.g.
> /usr/share/dict/british-english into "list", to start with 98,000
> or more good words in the BEGIN action, before reading in your list of
> special words,

Project Gutenberg provides Webster's Dictionary from about 1913. I extracted all the words from the HTML, and they reduced to about 200,000 unique words. I use Arch, and they don't include such refinements as dictionaries in their distro.

>
> Erik
> (Who is doubtless glossing over some undeclared additional requirement. :)
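
For reference, a minimal sketch of the associative-array approach Erik describes: load the good words in the BEGIN action, then test each input word with "word in list" and print the ones not found. The function name check_words, its two file arguments and the helper array "seen" are hypothetical, and the sketch leaves out the suffix handling entirely. On the linear-search worry: gawk, mawk and the one true awk all implement arrays as hash tables, so the "in" test should not depend on the order in which the words were loaded, although POSIX does not require any particular implementation.

```shell
#!/bin/sh
# check_words GOODFILE TEXTFILE
# Print every word in TEXTFILE that is not listed in GOODFILE,
# once each, one per line. Both file names are supplied by the caller.
check_words() {
    awk -v goodfile="$1" '
        BEGIN {
            # Load the accepted words as keys of the array "list".
            # Referencing list[...] is enough to create the key.
            while ((getline word < goodfile) > 0)
                list[tolower(word)]
            close(goodfile)
        }
        {
            # Split the line on runs of non-letters, then test membership.
            n = split(tolower($0), w, /[^a-z]+/)
            for (i = 1; i <= n; i++)
                if (w[i] != "" && !(w[i] in list) && !(w[i] in seen)) {
                    seen[w[i]]      # remember it so it is printed only once
                    print w[i]
                }
        }
    ' "$2"
}
```

Something like "check_words goodwords.txt chapter.txt > unknown.txt" (file names made up here) would produce the one-word-per-line file that the ":.w >> /path/goodfile" mapping is meant to work through.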