Hi all,

I'm revising the function f_readfile in eval.c to speed it up when processing very long lines. At present it grows the string 200 bytes at a time: it allocates a new string 200 bytes longer, copies the old one into it, and frees the old one. For example, a 1 MB line, such as the YankRing plugin may use, takes about 5000 allocations and deallocations and roughly 2.5 GB of copying in total (200 bytes x (1 + 2 + ... + 5000)).
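To illustrate the kind of change meant here, below is a minimal sketch of geometric buffer growth. It is not the code in eval.c: the type linebuf_T and the function linebuf_append are hypothetical names made up for this example, and it uses plain realloc rather than Vim's own allocation wrappers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer for this sketch; not a Vim type. */
typedef struct {
    char   *data;   /* allocated storage, or NULL when empty */
    size_t  len;    /* bytes in use */
    size_t  cap;    /* bytes allocated */
} linebuf_T;

/*
 * Append n bytes from src, doubling the capacity whenever it runs out.
 * Returns 0 on success, -1 on allocation failure or size overflow.
 */
static int linebuf_append(linebuf_T *lb, const char *src, size_t n)
{
    if (n == 0)
        return 0;

    if (n > lb->cap - lb->len) {
        size_t newcap = lb->cap ? lb->cap : 200;  /* start at 200 bytes */
        char  *p;

        while (newcap - lb->len < n) {
            if (newcap > (size_t)-1 / 2)
                return -1;                        /* doubling would overflow */
            newcap *= 2;
        }
        p = realloc(lb->data, newcap);
        if (p == NULL)
            return -1;                            /* old buffer is still valid */
        lb->data = p;
        lb->cap = newcap;
    }
    memcpy(lb->data + lb->len, src, n);
    lb->len += n;
    return 0;
}
```

With doubling from 200 bytes, a 1 MB line needs about 14 reallocations instead of 5000, and the total amount copied stays within a small multiple of the final length.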
I've also noticed that its handling of CR and BOM removal currently fails when the bytes involved are read by different calls to fread, so I'm fixing that too. One can only decide that the UTF-8 BOM sequence EF BB BF is present once all three bytes have been read, so I was about to code a check that runs when a BF is encountered. It then occurred to me that if BF is common in UTF-8 text, there would be a lot of checking of the previous bytes.

So, how common is the byte BF in UTF-8 text? How common are EF and BB? I have little idea; perhaps someone on vim_dev has a better one.

Regards,
John
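P.S. For concreteness, here is a rough sketch of one way the EF BB BF check could carry its state across fread calls, so that a sequence split between two reads is still recognised. The names bomscan_T and bomscan_feed are hypothetical and exist only for this example. The sketch only detects and counts the sequences; actually removing one would also mean dropping the one or two bytes already stored from the previous read.

```c
#include <stddef.h>

/*
 * Hypothetical scanner state for this sketch; not a Vim type.
 * "matched" is how many leading bytes of EF BB BF have been seen
 * so far (0, 1 or 2). It is kept between calls.
 */
typedef struct {
    int matched;
} bomscan_T;

/*
 * Scan one chunk of n bytes and return the number of complete
 * EF BB BF sequences that END in this chunk, continuing from the
 * state left by the previous chunk.
 */
static size_t bomscan_feed(bomscan_T *st, const unsigned char *buf, size_t n)
{
    size_t i;
    size_t found = 0;

    for (i = 0; i < n; i++) {
        unsigned char c = buf[i];

        if (st->matched == 2 && c == 0xBF) {
            found++;                 /* full sequence seen */
            st->matched = 0;
        } else if (st->matched == 1 && c == 0xBB) {
            st->matched = 2;
        } else {
            /* restart; the current byte may itself begin a sequence */
            st->matched = (c == 0xEF) ? 1 : 0;
        }
    }
    return found;
}
```

Done this way, a BF never triggers a look back at earlier bytes; the cost is one small state update per byte, however common BF turns out to be.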
