Re: Count and write the number of occurrences

Tim Chase Mon, 12 Oct 2009 14:39:14 -0700

> I want to check several news articles for the occurrences of words.
> Therefore the two questions:
> 
> 1) How to let vim place a carriage return after each word in a news article?


You can use

   :%s/\>/\r/g

though any non-word characters (those not covered by the 'iskeyword' 
setting, such as spaces and punctuation) will end up at the start of 
the following lines.
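
To see the effect on a small sample, here is a sketch you can source 
in a scratch buffer. The sample sentence is made up for illustration 
and assumes the default 'iskeyword' setting:

```vim
" Open a scratch window and put one made-up sentence in it
new
call setline(1, 'The cat sat.')
" Break the line after every word
%s/\>/\r/g
" The buffer now holds four lines; note the leading spaces and the lone "."
echo getline(1, '$')
" → ['The', ' cat', ' sat', '.']
```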

> And 2) If you have a text like this
> 
> AAA
> AAA
> AAA
> BBB
> CCC
> CCC
> CCC
> CCC
> 
> can you let vim count the occurrences of these words and add a number
> before them and delete all but the first one?

Have you already normalized the case of the words?

> So that the text above becomes:
> 
> 3 AAA
> 1 BBB
> 4 CCC

You might do a multi-step process such as:

   :%s/\>/\r/g    " break after all words
   :%s/\W//g      " remove all non-word characters
   :v/\w/d        " delete any lines that end up blank
   :%s/.*/\U&/    " uppercase the entire document to normalize
   :%sort         " sort the results as required by "uniq"
   " use the *nix "uniq" tool to prepend counts (a comment cannot trail ":!")
   :%!uniq -c
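
As a rough end-to-end check, the steps can be run on a made-up 
sentence in a scratch buffer. This is only a sketch: it assumes a 
"uniq" binary is on your path, and the final substitution is an extra 
step of mine to strip the padding that "uniq -c" usually puts in 
front of the counts, so the result looks like the "3 AAA" layout:

```vim
" Scratch buffer with one made-up sentence (not from the original question)
new
call setline(1, 'The cat saw the other cat.')
" One word per line, punctuation and blanks removed, case normalized
%s/\>/\r/g
%s/\W//g
v/\w/d
%s/.*/\U&/
" Sort, then let the external uniq tool prepend the counts
%sort
%!uniq -c
" Extra step (assumption): drop the leading padding uniq adds to the counts
%s/^\s\+//e
echo getline(1, '$')
" → ['2 CAT', '1 OTHER', '1 SAW', '2 THE']
```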

It gets a bit hairier if you need to do this on a non-*nix box 
where "uniq" isn't available.  You might be able to replace the 
"uniq" command with something like the following in pure Vim 
(requires Vim 7, which introduced the Dictionary type):

  :let d={} |g/^/let d[getline('.')]=get(d, getline('.'), 0)+1
  :%s/^\(.*\)\(\n\1$\)*/\=d[submatch(1)]."\t".submatch(1)
  :unlet d

which builds a dictionary ("d") mapping word->count (the ":g" 
command) and then collapses multiple repeats into a single item 
prefixed by the count for that item (the ":%s" command), then 
ditches ("unlet"s) the dictionary to free up memory.
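
If you would rather have this as something reusable, the same 
dictionary idea could be wrapped in a function along these lines. 
Treat it as a sketch: the name CountLines() is made up here, it 
rewrites the whole buffer, and unlike the commands above it does not 
need the lines to be sorted first:

```vim
" Hypothetical helper (not part of Vim): replace the buffer with
" "count<Tab>line" entries, one per distinct line, sorted by line text.
function! CountLines() abort
  let counts = {}
  " Tally every non-empty line in the buffer
  for text in getline(1, '$')
    if text !=# ''
      let counts[text] = get(counts, text, 0) + 1
    endif
  endfor
  " Build the output lines in sorted key order
  let result = []
  for word in sort(keys(counts))
    call add(result, counts[word] . "\t" . word)
  endfor
  " Empty the buffer (without touching registers) and write the result
  silent %delete _
  call setline(1, result)
endfunction
```

After sourcing it, ":call CountLines()" on a buffer with one word per 
line should leave one tab-separated "count word" line per distinct 
word.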

Hope this helps,

-tim



--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_use" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---
