Re: A performance question (patch included)

2007-05-26 Thread Yakov Lerner

On 5/25/07, Charles E Campbell Jr <[EMAIL PROTECTED]> wrote:

John Beckett wrote:

> A.J.Mechelynck wrote:
>
>> What about a different function to return, say, the number of
>> 1K blocks (or the number of times 2^n bytes, with a parameter
>> passed to the function) that a file uses?
>
>
> Yes, that's a much more general and better idea.
>
> Since there's probably not much need for this, I think that
> simplicity would be good. That is, have the function work in a
> fixed way with no options.
>
> Re Dr.Chip's LargeFile script: It occurs to me that another
> workaround would be to use system() to capture the output of
> 'ls -l file' or 'dir file' (need an option for which).
>
> Then do some funky editing to calculate the number of digits in
> the file length. If more than 9, treat file as large.
>
> I'm playing with a tiny utility to help the LargeFile script.
> Bluesky: Its code (64-bit file size) could potentially be
> incorporated in Vim. I'll post results in vim-dev.


(I've moved this over to vim-dev)

I've attached a patch to vim 7.1 which extends getfsize(); with the
patch, getfsize() takes an optional
second parameter which gives one the ability to specify a "unitsize".
In other words,

getfsize("eval.c")  -> 478347 (after the patch)

getfsize("eval.c",1000)  -> 479   (truncated upwards)

I'll be awaiting Bram's input before making use of this in LargeFile.vim !

Regards,
Chip Campbell





Re: A performance question (patch included)

2007-05-25 Thread John Beckett

Charles E Campbell Jr wrote:

> I've attached a patch to vim 7.1 which extends getfsize();
> with the patch, getfsize() takes an optional second parameter
> which gives one the ability to specify a "unitsize".


Things move quickly here ... I'm still polishing my code.

However, I don't think it's going to be quite so easy to patch
Vim to handle 64-bit file sizes on 32-bit architectures.
On a 32-bit machine, the 'st.st_size' in your code will be a
32-bit integer and so will have the wrong file size.

Using gcc, and if supported by the underlying system, it is
possible to define an option so st.st_size is 64 bits, but a
lot of thought would be needed to take care that the rest of Vim
is not adversely affected.

A minor point concerns the line:
   if(stsize*unitsize < st.st_size) ++stsize;

On a 32-bit machine, stsize and unitsize will be 32 bits, so the
multiply will give a 32-bit answer. If st.st_size were 64 bits,
you would need a cast to a 64-bit integer on the left.
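
To illustrate the point about the multiply, here is a minimal C sketch. The
function names units_narrow and units_wide are hypothetical and not part of
the patch; uint32_t stands in for a 32-bit unsigned long and int64_t for a
64-bit st_size.

```c
#include <assert.h>
#include <stdint.h>

/* Round-up check with 32-bit intermediates, as the patch's check would
 * behave where unsigned long is 32 bits wide (assumption for this sketch). */
static uint32_t units_narrow(int64_t size, uint32_t unitsize)
{
    uint32_t n;

    assert(size >= 0 && unitsize > 0);
    n = (uint32_t)(size / unitsize);
    if (n * unitsize < size)            /* product wraps modulo 2^32 */
        ++n;
    return n;
}

/* Same check, but the product is widened to 64 bits before comparing. */
static uint32_t units_wide(int64_t size, uint32_t unitsize)
{
    uint32_t n;

    assert(size >= 0 && unitsize > 0);
    n = (uint32_t)(size / unitsize);
    if ((int64_t)n * unitsize < size)   /* cast first, then multiply */
        ++n;
    return n;
}
```

With size = 5000000000 and unitsize = 1000, units_narrow() yields 5000001
because the 32-bit product wraps to 705032704, while units_wide() yields the
expected 5000000.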

I will put what I have in a new post.

John



Re: A performance question (patch included)

2007-05-25 Thread John Beckett

Charles E Campbell Jr wrote:

> I'm also under the impression that "ls" itself uses fstat(),
> so it's not likely to be any more informative.


That's likely on some systems, but 'ls -l' gives correct results
for files over 4GB on Fedora Core 6 using x86-32.

John



Re: A performance question (patch included)

2007-05-25 Thread A.J.Mechelynck

Yakov Lerner wrote:
[...]
stat() on Linux has 32-bit st_size field (off_t is 32-bit). There is stat64()
syscall which uses 'struct stat64' structure where st_size is 64-bit. By
defining __USE_LARGEFILE64 at compile-time, stat() is redirected to
stat64(). I don't know whether default Linux vim build defines
__USE_LARGEFILE64 or not.

Yakov



":version" says:

[...]
Compilation: gcc -c -I. -Iproto -DHAVE_CONFIG_H -DFEAT_GUI_GTK 
-I/usr/include/cairo -I/usr/include/freetype2 -I/usr/include/libpng12 
-I/opt/gnome/include/gtk-2.0 -I/opt/gnome/lib/gtk-2.0/include 
-I/opt/gnome/include/atk-1.0 -I/opt/gnome/include/pango-1.0 
-I/opt/gnome/include/glib-2.0 -I/opt/gnome/lib/glib-2.0/include   -DORBIT2=1 
-pthread -I/usr/include/libart-2.0 -I/usr/include/cairo 
-I/usr/include/freetype2 -I/usr/include/libpng12 -I/usr/include/libxml2 
-I/opt/gnome/include/libgnomeui-2.0 -I/opt/gnome/include/libgnome-2.0 
-I/opt/gnome/include/libgnomecanvas-2.0 -I/opt/gnome/include/gtk-2.0 
-I/opt/gnome/include/gconf/2 -I/opt/gnome/include/libbonoboui-2.0 
-I/opt/gnome/include/gnome-vfs-2.0 -I/opt/gnome/lib/gnome-vfs-2.0/include 
-I/opt/gnome/include/gnome-keyring-1 -I/opt/gnome/include/glib-2.0 
-I/opt/gnome/lib/glib-2.0/include -I/opt/gnome/include/orbit-2.0 
-I/opt/gnome/include/libbonobo-2.0 -I/opt/gnome/include/bonobo-activation-2.0 
-I/opt/gnome/include/pango-1.0 -I/opt/gnome/lib/gtk-2.0/include 
-I/opt/gnome/include/atk-1.0 -O2 -fno-strength-reduce -Wall 
-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING 
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 
-I/usr/lib/perl5/5.8.8/i586-linux-thread-multi/CORE  -I/usr/include/python2.5 
-pthread -I/usr/include  -D_LARGEFILE64_SOURCE=1  -I/usr/lib/ruby/1.8/i586-linux

Linking: [...]

so, maybe we'll have to check what happens when _LARGEFILE64_SOURCE is 
defined? I don't find a match in src/ or src/auto/.



Best regards,
Tony.
--
If all these sweet young things were laid end-to-end, I wouldn't be a
bit surprised.
-- Dorothy Parker


Re: A performance question (patch included)

2007-05-25 Thread Yakov Lerner

On 5/25/07, A.J.Mechelynck <[EMAIL PROTECTED]> wrote:

Charles E Campbell Jr wrote:
> A.J.Mechelynck wrote:
>
>> I'm not sure what varnumber_T means: will st.stsize (the dividend) be
>> wide enough to avoid losing bits on the left?
>
> varnumber_T is int (long if sizeof(int) <= 3).
>
> The size of st.stsize depends on whether 32-bit or 64-bit integers are
> available.
>
> So, it's possible to lose bits: pick a small enough unitsize and a large
> enough file, and st.stsize will not fit into a varnumber_T.  After all,
> unitsize could be 1, and getfsize() will behave no differently than it
> does now.  However, unitsize could be 100.  My patch divides st.stsize by
> the unitsize first, presumably in whatever arithmetic is appropriate for
> working with st.stsize.
>
> Regards,
> Chip Campbell
>

Yes, yes, but before the division, will it be able to hold the file size?
(sorry, I meant st.st_size) Will mch_stat (at line 10134, one line before the
context of your patch) be able to return "huge" file sizes?


stat() on Linux has 32-bit st_size field (off_t is 32-bit). There is stat64()
syscall which uses 'struct stat64' structure where st_size is 64-bit. By
defining __USE_LARGEFILE64 at compile-time, stat() is redirected to
stat64(). I don't know whether default Linux vim build defines
__USE_LARGEFILE64 or not.
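
A quick way to check what a given build sees is a small C sketch like the
one below. It assumes glibc, where defining _FILE_OFFSET_BITS to 64 before
any system header is the documented way to get a 64-bit off_t and have
stat() use the 64-bit interface on 32-bit systems; __USE_LARGEFILE64 is an
internal glibc macro that is normally set via _LARGEFILE64_SOURCE. The
helper names off_t_bits and file_size_bytes are made up for this example.

```c
#define _FILE_OFFSET_BITS 64    /* must precede every system header */

#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Width of off_t in bits as seen by this translation unit. */
static int off_t_bits(void)
{
    return (int)(sizeof(off_t) * 8);
}

/* Size of the file at path in bytes, or -1 if stat() fails. */
static long long file_size_bytes(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}
```

If off_t_bits() reports 32 in a build without the define, st_size cannot
hold sizes of 2GB or more, whatever is done with the value afterwards.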

Yakov


Re: A performance question (patch included)

2007-05-25 Thread Charles E Campbell Jr

A.J.Mechelynck wrote:

Yes, yes, but before the division, will it be able to hold the file 
size? (sorry, I meant st.st_size) Will mch_stat (at line 10134, one 
line before the context of your patch) be able to return "huge" file 
sizes?


mch_stat is variously defined, depending on the o/s.
Under unix, that's the fstat function.
This function fills in a struct stat; the member in question is st_size:

(off_t st_size;    /* total size, in bytes */)

So, st_size is an "off_t".

Under linux, an "off_t" is  typedef __kernel_off_t off_t;

So, I suspect that st_size will be sized by the o/s to handle whatever
size files it can handle.

Someone with a 64-bit machine, perhaps, could examine this further?

BTW, I'm also under the impression that "ls" itself uses fstat(), so it's
not likely to be any more informative.

Regards,
Chip Campbell



Re: A performance question (patch included)

2007-05-25 Thread A.J.Mechelynck

Charles E Campbell Jr wrote:

A.J.Mechelynck wrote:

I'm not sure what varnumber_T means: will st.stsize (the dividend) be 
wide enough to avoid losing bits on the left?


varnumber_T is int (long if sizeof(int) <= 3).

The size of st.stsize depends on whether 32-bit or 64-bit integers are
available.

So, it's possible to lose bits: pick a small enough unitsize and a large
enough file, and st.stsize will not fit into a varnumber_T.  After all,
unitsize could be 1, and getfsize() will behave no differently than it
does now.  However, unitsize could be 100.  My patch divides st.stsize by
the unitsize first, presumably in whatever arithmetic is appropriate for
working with st.stsize.

Regards,
Chip Campbell



Yes, yes, but before the division, will it be able to hold the file size? 
(sorry, I meant st.st_size) Will mch_stat (at line 10134, one line before the 
context of your patch) be able to return "huge" file sizes?



Best regards,
Tony.
--
Real Programmers don't play tennis, or any other sport that requires
you to change clothes.  Mountain climbing is OK, and real programmers
wear their climbing boots to work in case a mountain should suddenly
spring up in the middle of the machine room.


Re: A performance question (patch included)

2007-05-25 Thread Charles E Campbell Jr

A.J.Mechelynck wrote:

I'm not sure what varnumber_T means: will st.stsize (the dividend) be 
wide enough to avoid losing bits on the left?


varnumber_T is int (long if sizeof(int) <= 3).

The size of st.stsize depends on whether 32-bit or 64-bit integers are
available.

So, it's possible to lose bits: pick a small enough unitsize and a large
enough file, and st.stsize will not fit into a varnumber_T.  After all,
unitsize could be 1, and getfsize() will behave no differently than it
does now.  However, unitsize could be 100.  My patch divides st.stsize by
the unitsize first, presumably in whatever arithmetic is appropriate for
working with st.stsize.

Regards,
Chip Campbell



Re: A performance question (patch included)

2007-05-25 Thread Charles E Campbell Jr

Nikolai Weibull wrote:

> On 5/25/07, Charles E Campbell Jr <[EMAIL PROTECTED]> wrote:
>
>> getfsize("eval.c")  -> 478347 (after the patch)
>>
>> getfsize("eval.c",1000)  -> 479   (truncated upwards)
>
> Why can't this be done in VimScript?



Consider a 3GB file; getfsize("ThreeGBfile") will return a negative 
number (not -1, either).
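
The negative result can be reproduced outside Vim. The C sketch below
assumes a 32-bit Number (int32_t stands in for varnumber_T) and a 64-bit
file size; the function names are hypothetical. Converting an out-of-range
value to a signed type is implementation-defined in C; gcc and clang wrap
modulo 2^32, which is what the comments assume.

```c
#include <assert.h>
#include <stdint.h>

/* What a script sees when the byte count is stored in a 32-bit Number. */
static int32_t size_as_number(int64_t size)
{
    return (int32_t)size;   /* implementation-defined when out of range */
}

/* Divide in 64-bit arithmetic first, rounding up, and only then narrow. */
static int32_t units_as_number(int64_t size, int64_t unitsize)
{
    int64_t units;

    assert(size >= 0 && unitsize > 0);
    units = size / unitsize + (size % unitsize != 0);
    return (int32_t)units;
}
```

For a file of 3221225472 bytes, size_as_number() gives -1073741824 with
those compilers, so a script has nothing left to divide, whereas
units_as_number(size, 1000) gives 3221226.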


Regards,
Chip Campbell



Re: A performance question (patch included)

2007-05-25 Thread A.J.Mechelynck

Charles E Campbell Jr wrote:

John Beckett wrote:


A.J.Mechelynck wrote:


What about a different function to return, say, the number of
1K blocks (or the number of times 2^n bytes, with a parameter
passed to the function) that a file uses?



Yes, that's a much more general and better idea.

Since there's probably not much need for this, I think that
simplicity would be good. That is, have the function work in a
fixed way with no options.

Re Dr.Chip's LargeFile script: It occurs to me that another
workaround would be to use system() to capture the output of
'ls -l file' or 'dir file' (need an option for which).

Then do some funky editing to calculate the number of digits in
the file length. If more than 9, treat file as large.

I'm playing with a tiny utility to help the LargeFile script.
Bluesky: Its code (64-bit file size) could potentially be
incorporated in Vim. I'll post results in vim-dev.



(I've moved this over to vim-dev)

I've attached a patch to vim 7.1 which extends getfsize(); with the 
patch, getfsize() takes an optional
second parameter which gives one the ability to specify a "unitsize".  
In other words,


getfsize("eval.c")  -> 478347 (after the patch)

getfsize("eval.c",1000)  -> 479   (truncated upwards)

I'll be awaiting Bram's input before making use of this in LargeFile.vim !

Regards,
Chip Campbell





I'm not sure what varnumber_T means: will st.stsize (the dividend) be wide 
enough to avoid losing bits on the left?



Best regards,
Tony.
--
"I generally avoid temptation unless I can't resist it."
-- Mae West


Re: A performance question (patch included)

2007-05-25 Thread Nikolai Weibull

On 5/25/07, Charles E Campbell Jr <[EMAIL PROTECTED]> wrote:


getfsize("eval.c")  -> 478347 (after the patch)



getfsize("eval.c",1000)  -> 479   (truncated upwards)


Why can't this be done in VimScript?

 nikolai


Re: A performance question (patch included)

2007-05-25 Thread Charles E Campbell Jr

John Beckett wrote:

> A.J.Mechelynck wrote:
>
>> What about a different function to return, say, the number of
>> 1K blocks (or the number of times 2^n bytes, with a parameter
>> passed to the function) that a file uses?
>
> Yes, that's a much more general and better idea.
>
> Since there's probably not much need for this, I think that
> simplicity would be good. That is, have the function work in a
> fixed way with no options.
>
> Re Dr.Chip's LargeFile script: It occurs to me that another
> workaround would be to use system() to capture the output of
> 'ls -l file' or 'dir file' (need an option for which).
>
> Then do some funky editing to calculate the number of digits in
> the file length. If more than 9, treat file as large.
>
> I'm playing with a tiny utility to help the LargeFile script.
> Bluesky: Its code (64-bit file size) could potentially be
> incorporated in Vim. I'll post results in vim-dev.
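
The digit-count workaround quoted above could look roughly like this in C.
It is only a sketch: it assumes the size field has already been extracted
from the 'ls -l' or 'dir' output as a plain decimal string without
separators or leading zeros, and the names digit_count and looks_large are
invented for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Number of leading decimal digits in a size field such as "478347". */
static size_t digit_count(const char *field)
{
    size_t n = 0;

    assert(field != NULL);
    while (isdigit((unsigned char)field[n]))
        ++n;
    return n;
}

/* More than 9 digits means at least 1000000000 bytes: treat as large. */
static int looks_large(const char *field)
{
    return digit_count(field) > 9;
}
```

Because only the string length is examined, no integer type ever has to
hold the full size, so the check does not depend on the width of off_t.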



(I've moved this over to vim-dev)

I've attached a patch to vim 7.1 which extends getfsize(); with the
patch, getfsize() takes an optional second parameter which lets one
specify a "unitsize".  In other words,

getfsize("eval.c")  -> 478347 (after the patch)

getfsize("eval.c",1000)  -> 479   (rounded up)

I'll be awaiting Bram's input before making use of this in LargeFile.vim !

Regards,
Chip Campbell



*** src/o_eval.c  2007-05-25 08:52:12.0 -0400
--- src/eval.c  2007-05-25 09:04:43.0 -0400
***************
*** 7094,7100 ****
      {"getcwd",        0, 0, f_getcwd},
      {"getfontname",   0, 1, f_getfontname},
      {"getfperm",      1, 1, f_getfperm},
!     {"getfsize",      1, 1, f_getfsize},
      {"getftime",      1, 1, f_getftime},
      {"getftype",      1, 1, f_getftype},
      {"getline",       1, 2, f_getline},
--- 7094,7100 ----
      {"getcwd",        0, 0, f_getcwd},
      {"getfontname",   0, 1, f_getfontname},
      {"getfperm",      1, 1, f_getfperm},
!     {"getfsize",      1, 2, f_getfsize},
      {"getftime",      1, 1, f_getftime},
      {"getftype",      1, 1, f_getftype},
      {"getline",       1, 2, f_getline},
***************
*** 10135,10142 ****
      {
          if (mch_isdir(fname))
              rettv->vval.v_number = 0;
!         else
              rettv->vval.v_number = (varnumber_T)st.st_size;
      }
      else
          rettv->vval.v_number = -1;
--- 10135,10151 ----
      {
          if (mch_isdir(fname))
              rettv->vval.v_number = 0;
!         else if (argvars[1].v_type == VAR_UNKNOWN)
              rettv->vval.v_number = (varnumber_T)st.st_size;
+         else
+         {
+             unsigned long unitsize;
+             unsigned long stsize;
+             unitsize= get_tv_number(&argvars[1]);
+             stsize= st.st_size/unitsize;
+             if(stsize*unitsize < st.st_size) ++stsize;
+             rettv->vval.v_number = (varnumber_T) stsize;
+         }
      }
      else
          rettv->vval.v_number = -1;
*** runtime/doc/o_eval.txt  2007-05-25 09:00:08.0 -0400
--- runtime/doc/eval.txt  2007-05-25 09:06:19.0 -0400
***************
*** 1615,1621 ****
  getcmdtype()                    String  return the current command-line type
  getcwd()                        String  the current working directory
  getfperm( {fname})              String  file permissions of file {fname}
! getfsize( {fname})              Number  size in bytes of file {fname}
  getfontname( [{name}])          String  name of font being used
  getftime( {fname})              Number  last modification time of file
  getftype( {fname})              String  description of type of file {fname}
--- 1615,1621 ----
  getcmdtype()                    String  return the current command-line type
  getcwd()                        String  the current working directory
  getfperm( {fname})              String  file permissions of file {fname}
! getfsize( {fname} [,unitsize])  Number  size in bytes of file {fname}
  getfontname( [{name}])          String  name of font being used
  getftime( {fname})              Number  last modification time of file
  getftype( {fname})              String  description of type of file {fname}
***************
*** 2819,2827 ****
  getcwd()        The result is a String, which is the name of the current
                  working directory.
  
! getfsize({fname})                                       *getfsize()*
                  The result is a Number, which is the size in bytes of the
                  given file {fname}.
                  If {fname} is a directory, 0 is returned.
                  If the file {fname} can't be found, -1 is returned.
  
--- 2819,2829 ----
  getcwd()        The result is a String, which is the name of the current
                  working directory.
  
! getfsize({fname} [,unitsize])                           *getfsize()*
                  The result is a Number, which is the size in bytes of the
                  given file {fname}.
+                 If unitsize is given, then the file {fname}'s size will be
+