Re: A performance question (patch included)
Charles E Campbell Jr wrote:
> I've attached a patch to vim 7.1 which extends getfsize(); with the
> patch, getfsize() takes an optional second parameter which gives one
> the ability to specify a "unitsize".

Things move quickly here ... I'm still polishing my code.

However, I don't think it's going to be quite so easy to patch Vim to
handle 64-bit file sizes on 32-bit architectures.

On a 32-bit machine, the 'st.st_size' in your code will be a 32-bit
integer and so will have the wrong file size. Using gcc, and if
supported by the underlying system, it is possible to define an option
so st.st_size is 64 bits, but a lot of thought would be needed to take
care that the rest of Vim is not adversely affected.

A minor point concerns the line:

    if(stsize*unitsize < st.st_size) ++stsize;

On a 32-bit machine, stsize and unitsize will be 32 bits, so the
multiply will give a 32-bit answer. If st.st_size were 64 bits, you
would need a cast to a 64-bit integer on the left.

I will put what I have in a new post.

John
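
To illustrate the point about the 32-bit multiply, here is a minimal sketch of the same round-up division done entirely in 64-bit arithmetic. The function name units_rounded_up is hypothetical and is not part of Vim or of the patch; it also tests the remainder instead of multiplying back, which sidesteps the overflow question altogether.

```c
#include <stdint.h>

/* Hypothetical helper, not part of Vim or of the posted patch.
 * Returns how many unitsize-byte units are needed to hold size bytes,
 * rounded up. All arithmetic is done in 64 bits. */
int64_t units_rounded_up(int64_t size, int64_t unitsize)
{
    int64_t units;

    if (size < 0 || unitsize <= 0)
        return -1;                  /* reject nonsensical input */

    units = size / unitsize;        /* truncating division */
    if (size % unitsize != 0)       /* a partial unit is left over */
        ++units;
    return units;
}
```

With the numbers from the patch description, units_rounded_up(478347, 1000) gives 479. Whether Vim could safely carry a 64-bit value through the rest of its code is a separate question that this sketch does not address.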
Re: A performance question (patch included)
Charles E Campbell Jr wrote:
> I'm also under the impression that "ls" itself uses fstat(), so it's
> not likely to be any more informative.

That's likely on some systems, but 'ls -l' gives correct results for
files over 4GB on Fedora Core 6 using x86-32.

John
Re: A performance question (patch included)
Yakov Lerner wrote:
[...]
> stat() on Linux has 32-bit st_size field (off_t is 32-bit).
> There is stat64() syscall which uses 'struct stat64' structure where
> st_size is 64-bit. By defining __USE_LARGEFILE64 at compile-time,
> stat() is redirected to stat64(). I don't know whether default Linux
> vim build defines __USE_LARGEFILE64 or not.
>
> Yakov

":version" says:

[...]
Compilation: gcc -c -I. -Iproto -DHAVE_CONFIG_H -DFEAT_GUI_GTK -I/usr/include/cairo -I/usr/include/freetype2 -I/usr/include/libpng12 -I/opt/gnome/include/gtk-2.0 -I/opt/gnome/lib/gtk-2.0/include -I/opt/gnome/include/atk-1.0 -I/opt/gnome/include/pango-1.0 -I/opt/gnome/include/glib-2.0 -I/opt/gnome/lib/glib-2.0/include -DORBIT2=1 -pthread -I/usr/include/libart-2.0 -I/usr/include/cairo -I/usr/include/freetype2 -I/usr/include/libpng12 -I/usr/include/libxml2 -I/opt/gnome/include/libgnomeui-2.0 -I/opt/gnome/include/libgnome-2.0 -I/opt/gnome/include/libgnomecanvas-2.0 -I/opt/gnome/include/gtk-2.0 -I/opt/gnome/include/gconf/2 -I/opt/gnome/include/libbonoboui-2.0 -I/opt/gnome/include/gnome-vfs-2.0 -I/opt/gnome/lib/gnome-vfs-2.0/include -I/opt/gnome/include/gnome-keyring-1 -I/opt/gnome/include/glib-2.0 -I/opt/gnome/lib/glib-2.0/include -I/opt/gnome/include/orbit-2.0 -I/opt/gnome/include/libbonobo-2.0 -I/opt/gnome/include/bonobo-activation-2.0 -I/opt/gnome/include/pango-1.0 -I/opt/gnome/lib/gtk-2.0/include -I/opt/gnome/include/atk-1.0 -O2 -fno-strength-reduce -Wall -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/lib/perl5/5.8.8/i586-linux-thread-multi/CORE -I/usr/include/python2.5 -pthread -I/usr/include -D_LARGEFILE64_SOURCE=1 -I/usr/lib/ruby/1.8/i586-linux
Linking: [...]

so, maybe we'll have to check what happens when _LARGEFILE64_SOURCE is
defined? I don't find a match in src/ or src/auto/.

Best regards,
Tony.
-- 
If all these sweet young things were laid end-to-end, I wouldn't be a
bit surprised.
		-- Dorothy Parker
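
The compile line above already carries -D_FILE_OFFSET_BITS=64. As a rough sketch of what that macro is meant to do with glibc (an assumption about the build environment, not something verified against Vim's own configure output), the following standalone C asks for a 64-bit off_t and then reports both the width it actually got and a file's size. The names file_size_64 and off_t_bits are hypothetical.

```c
/* Must be defined before any system header is included; normally it is
 * passed on the command line as -D_FILE_OFFSET_BITS=64 instead. */
#define _FILE_OFFSET_BITS 64

#include <limits.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: width of off_t, in bits, in this translation unit. */
int off_t_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}

/* Hypothetical helper: size of the file at path in bytes, or -1 if
 * stat() fails. The result is only trustworthy for large files when
 * off_t_bits() reports 64. */
int64_t file_size_64(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (int64_t)st.st_size;
}
```

If off_t_bits() reports 64 on a 32-bit build, st.st_size can hold sizes beyond 4GB; the value would still have to fit in whatever type Vim stores it in afterwards.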
Re: A performance question (patch included)
On 5/25/07, A.J.Mechelynck <[EMAIL PROTECTED]> wrote:
> Charles E Campbell Jr wrote:
>> A.J.Mechelynck wrote:
>>> I'm not sure what varnumber_T means: will st.stsize (the dividend)
>>> be wide enough to avoid losing bits on the left?
>>
>> varnumber_T is int (long if sizeof(int) <= 3).
>>
>> st.stsize's size depends on whether 32bit or 64bit integers are
>> available.
>>
>> So, it's possible to lose bits: pick a small enough unitsize and a
>> large enough file, and st.stsize will end up not being able to fit
>> into a varnumber_T. After all, unitsize could be 1, and getfsize()
>> will behave no differently than it does now. However, unitsize
>> could be 100. My patch divides st.stsize by the unitsize first;
>> presumably in whatever arithmetic is appropriate for working with
>> st.stsize.
>>
>> Regards,
>> Chip Campbell
>
> Yes, yes, but before the division, will it be able to hold the file
> size? (sorry, I meant st.st_size) Will mch_stat (at line 10134, one
> line before the context of your patch) be able to return "huge" file
> sizes?

stat() on Linux has 32-bit st_size field (off_t is 32-bit).
There is stat64() syscall which uses 'struct stat64' structure where
st_size is 64-bit. By defining __USE_LARGEFILE64 at compile-time,
stat() is redirected to stat64(). I don't know whether default Linux
vim build defines __USE_LARGEFILE64 or not.

Yakov
Re: A performance question (patch included)
A.J.Mechelynck wrote:
> Yes, yes, but before the division, will it be able to hold the file
> size? (sorry, I meant st.st_size) Will mch_stat (at line 10134, one
> line before the context of your patch) be able to return "huge" file
> sizes?

mch_stat is variously defined, depending on o/s. Under unix, that's the
fstat function. This function fills in a struct stat; the member in
question is st_size:

    off_t st_size;    /* total size, in bytes */

So, st_size is an "off_t". Under linux, an "off_t" is

    typedef __kernel_off_t off_t;

So, I suspect that st_size will be sized by the o/s to handle whatever
size files it can handle. Someone with a 64-bit machine, perhaps, could
examine this further?

BTW, I'm also under the impression that "ls" itself uses fstat(), so
it's not likely to be any more informative.

Regards,
Chip Campbell
Re: A performance question (patch included)
Charles E Campbell Jr wrote:
> A.J.Mechelynck wrote:
>> I'm not sure what varnumber_T means: will st.stsize (the dividend)
>> be wide enough to avoid losing bits on the left?
>
> varnumber_T is int (long if sizeof(int) <= 3).
>
> st.stsize's size depends on whether 32bit or 64bit integers are
> available.
>
> So, it's possible to lose bits: pick a small enough unitsize and a
> large enough file, and st.stsize will end up not being able to fit
> into a varnumber_T. After all, unitsize could be 1, and getfsize()
> will behave no differently than it does now. However, unitsize could
> be 100. My patch divides st.stsize by the unitsize first; presumably
> in whatever arithmetic is appropriate for working with st.stsize.
>
> Regards,
> Chip Campbell

Yes, yes, but before the division, will it be able to hold the file
size? (sorry, I meant st.st_size) Will mch_stat (at line 10134, one
line before the context of your patch) be able to return "huge" file
sizes?

Best regards,
Tony.
-- 
Real Programmers don't play tennis, or any other sport that requires
you to change clothes. Mountain climbing is OK, and real programmers
wear their climbing boots to work in case a mountain should suddenly
spring up in the middle of the machine room.
Re: A performance question (patch included)
A.J.Mechelynck wrote:
> I'm not sure what varnumber_T means: will st.stsize (the dividend) be
> wide enough to avoid losing bits on the left?

varnumber_T is int (long if sizeof(int) <= 3).

st.stsize's size depends on whether 32bit or 64bit integers are
available.

So, it's possible to lose bits: pick a small enough unitsize and a large
enough file, and st.stsize will end up not being able to fit into a
varnumber_T. After all, unitsize could be 1, and getfsize() will behave
no differently than it does now. However, unitsize could be 100. My
patch divides st.stsize by the unitsize first; presumably in whatever
arithmetic is appropriate for working with st.stsize.

Regards,
Chip Campbell
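
The "divide first, then store" argument can be checked with a small sketch. It assumes the result type is a 32-bit signed integer (the thread says varnumber_T is int, so this is an assumption about typical 32-bit builds) and that the file size itself is already available as a 64-bit value. The name quotient_fits_int32 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical check, not part of Vim: returns 1 if size / unitsize can
 * be stored in a 32-bit signed integer without losing bits, else 0. */
int quotient_fits_int32(int64_t size, int64_t unitsize)
{
    if (size < 0 || unitsize <= 0)
        return 0;                           /* treat bad input as "no" */
    return (size / unitsize) <= (int64_t)INT32_MAX;
}
```

For a 3GB file (3221225472 bytes), a unitsize of 1 does not fit, while a unitsize of 1000 does. This only helps if the dividend was read correctly in the first place.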
Re: A performance question (patch included)
Nikolai Weibull wrote:
> On 5/25/07, Charles E Campbell Jr <[EMAIL PROTECTED]> wrote:
>>     getfsize("eval.c") -> 478347
>>     (after the patch) getfsize("eval.c",1000) -> 479 (truncated upwards)
>
> Why can't this be done in VimScript?

Consider a 3GB file; getfsize("ThreeGBfile") will return a negative
number (not -1, either).

Regards,
Chip Campbell
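
The negative result can be reproduced with a small model of what happens when only the low 32 bits of a size survive and are read back as a signed number. This is a sketch under the assumption of a 32-bit two's-complement Number; the function name seen_through_int32 is hypothetical, and the arithmetic is written out with unsigned types so that it does not rely on implementation-defined conversions.

```c
#include <stdint.h>

/* Hypothetical model, not Vim code: the value a 32-bit signed number
 * would show if handed the low 32 bits of a non-negative file size. */
int64_t seen_through_int32(int64_t size)
{
    uint32_t low = (uint32_t)size;          /* keep only the low 32 bits */

    if (low >= UINT32_C(0x80000000))        /* sign bit set: negative */
        return (int64_t)low - INT64_C(4294967296);
    return (int64_t)low;
}
```

Under this model a 3GB file (3 * 1024^3 = 3221225472 bytes) shows up as -1073741824, which is negative but not the -1 that getfsize() uses for "file not found". A script that only ever sees this number has no way to recover the real size.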
Re: A performance question (patch included)
Charles E Campbell Jr wrote:
> John Beckett wrote:
>> A.J.Mechelynck wrote:
>>> What about a different function to return, say, the number of
>>> 1K blocks (or the number of times 2^n bytes, with a parameter
>>> passed to the function) that a file uses?
>>
>> Yes, that's a much more general and better idea.
>>
>> Since there's probably not much need for this, I think that
>> simplicity would be good. That is, have the function work in a
>> fixed way with no options.
>>
>> Re Dr.Chip's LargeFile script: It occurs to me that another
>> workaround would be to use system() to capture the output of
>> 'ls -l file' or 'dir file' (need an option for which).
>>
>> Then do some funky editing to calculate the number of digits in
>> the file length. If more than 9, treat file as large.
>>
>> I'm playing with a tiny utility to help the LargeFile script.
>> Bluesky: Its code (64-bit file size) could potentially be
>> incorporated in Vim. I'll post results in vim-dev.
>
> (I've moved this over to vim-dev)
>
> I've attached a patch to vim 7.1 which extends getfsize(); with the
> patch, getfsize() takes an optional second parameter which gives one
> the ability to specify a "unitsize". In other words,
>
>     getfsize("eval.c") -> 478347
>     (after the patch) getfsize("eval.c",1000) -> 479 (truncated upwards)
>
> I'll be awaiting Bram's input before making use of this in
> LargeFile.vim !
>
> Regards,
> Chip Campbell

I'm not sure what varnumber_T means: will st.stsize (the dividend) be
wide enough to avoid losing bits on the left?

Best regards,
Tony.
-- 
"I generally avoid temptation unless I can't resist it."
		-- Mae West
Re: A performance question (patch included)
On 5/25/07, Charles E Campbell Jr <[EMAIL PROTECTED]> wrote:
>     getfsize("eval.c") -> 478347
>     (after the patch) getfsize("eval.c",1000) -> 479 (truncated upwards)

Why can't this be done in VimScript?

nikolai
Re: A performance question (patch included)
John Beckett wrote:
> A.J.Mechelynck wrote:
>> What about a different function to return, say, the number of
>> 1K blocks (or the number of times 2^n bytes, with a parameter
>> passed to the function) that a file uses?
>
> Yes, that's a much more general and better idea.
>
> Since there's probably not much need for this, I think that
> simplicity would be good. That is, have the function work in a
> fixed way with no options.
>
> Re Dr.Chip's LargeFile script: It occurs to me that another
> workaround would be to use system() to capture the output of
> 'ls -l file' or 'dir file' (need an option for which).
>
> Then do some funky editing to calculate the number of digits in
> the file length. If more than 9, treat file as large.
>
> I'm playing with a tiny utility to help the LargeFile script.
> Bluesky: Its code (64-bit file size) could potentially be
> incorporated in Vim. I'll post results in vim-dev.

(I've moved this over to vim-dev)

I've attached a patch to vim 7.1 which extends getfsize(); with the
patch, getfsize() takes an optional second parameter which gives one
the ability to specify a "unitsize". In other words,

    getfsize("eval.c") -> 478347
    (after the patch) getfsize("eval.c",1000) -> 479 (truncated upwards)

I'll be awaiting Bram's input before making use of this in
LargeFile.vim !

Regards,
Chip Campbell

*** src/o_eval.c	2007-05-25 08:52:12.0 -0400
--- src/eval.c	2007-05-25 09:04:43.0 -0400
***************
*** 7094,7100 ****
      {"getcwd",		0, 0, f_getcwd},
      {"getfontname",	0, 1, f_getfontname},
      {"getfperm",	1, 1, f_getfperm},
!     {"getfsize",	1, 1, f_getfsize},
      {"getftime",	1, 1, f_getftime},
      {"getftype",	1, 1, f_getftype},
      {"getline",		1, 2, f_getline},
--- 7094,7100 ----
      {"getcwd",		0, 0, f_getcwd},
      {"getfontname",	0, 1, f_getfontname},
      {"getfperm",	1, 1, f_getfperm},
!     {"getfsize",	1, 2, f_getfsize},
      {"getftime",	1, 1, f_getftime},
      {"getftype",	1, 1, f_getftype},
      {"getline",		1, 2, f_getline},
***************
*** 10135,10142 ****
      {
  	if (mch_isdir(fname))
  	    rettv->vval.v_number = 0;
! 	else
  	    rettv->vval.v_number = (varnumber_T)st.st_size;
      }
      else
  	  rettv->vval.v_number = -1;
--- 10135,10151 ----
      {
  	if (mch_isdir(fname))
  	    rettv->vval.v_number = 0;
! 	else if (argvars[1].v_type == VAR_UNKNOWN)
  	    rettv->vval.v_number = (varnumber_T)st.st_size;
+ 	else
+ 	    {
+ 	    unsigned long unitsize;
+ 	    unsigned long stsize;
+ 	    unitsize= get_tv_number(&argvars[1]);
+ 	    stsize= st.st_size/unitsize;
+ 	    if(stsize*unitsize < st.st_size) ++stsize;
+ 	    rettv->vval.v_number = (varnumber_T) stsize;
+ 	    }
      }
      else
  	  rettv->vval.v_number = -1;
*** runtime/doc/o_eval.txt	2007-05-25 09:00:08.0 -0400
--- runtime/doc/eval.txt	2007-05-25 09:06:19.0 -0400
***************
*** 1615,1621 ****
  getcmdtype()			String	return the current command-line type
  getcwd()			String	the current working directory
  getfperm( {fname})		String	file permissions of file {fname}
! getfsize( {fname})		Number	size in bytes of file {fname}
  getfontname( [{name}])		String	name of font being used
  getftime( {fname})		Number	last modification time of file
  getftype( {fname})		String	description of type of file {fname}
--- 1615,1621 ----
  getcmdtype()			String	return the current command-line type
  getcwd()			String	the current working directory
  getfperm( {fname})		String	file permissions of file {fname}
! getfsize( {fname} [,unitsize])	Number	size in bytes of file {fname}
  getfontname( [{name}])		String	name of font being used
  getftime( {fname})		Number	last modification time of file
  getftype( {fname})		String	description of type of file {fname}
***************
*** 2819,2827 ****
  getcwd()	The result is a String, which is the name of the current
  		working directory.

! getfsize({fname})					*getfsize()*
  		The result is a Number, which is the size in bytes of
  		the given file {fname}.
  		If {fname} is a directory, 0 is returned.
  		If the file {fname} can't be found, -1 is returned.
--- 2819,2829 ----
  getcwd()	The result is a String, which is the name of the current
  		working directory.

! getfsize({fname} [,unitsize])				*getfsize()*
  		The result is a Number, which is the size in bytes of
  		the given file {fname}.
+ 		If unitsize is given, then the file {fname}'s size will be
+