Re: Method needed for skipping lines

2007-11-01 Thread Gustaf
Bruno Desthuilliers wrote:

> Here's a straightforward solution:



Thank you. I learned several things from that. :-)

Gustaf
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Method needed for skipping lines

2007-11-01 Thread Gustaf
Yu-Xi Lim wrote:

> David Mertz's Text Processing in Python might give you some more 
> efficient (and interesting) ways of approaching the problem.
> 
> http://gnosis.cx/TPiP/

Thank you for the link. Looks like a great resource.

Gustaf


Re: Method needed for skipping lines

2007-10-31 Thread Anand
On Nov 1, 5:04 am, Paul Hankin <[EMAIL PROTECTED]> wrote:
> On Oct 31, 5:02 pm, Gustaf <[EMAIL PROTECTED]> wrote:
>
> > Hi all,
>
> > Just for fun, I'm working on a script to count the number of lines in 
> > source files. Some lines are auto-generated (by the IDE) and shouldn't be 
> > counted. The auto-generated part of files start with "Begin VB.Form" and 
> > end with "End" (first thing on the line). The "End" keyword may appear 
> > inside the auto-generated part, but not at the beginning of the line.

I think regular expressions can help here.

import re

# Non-greedy, so the match stops at the first line consisting of "End"
rx = re.compile(r'^Begin VB\.Form.*?^End\n', re.DOTALL|re.MULTILINE)

def count(filename):
    text = open(filename).read()
    return rx.sub('', text).count('\n')
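A quick way to check the regex idea without touching the disk is to run it on an in-memory string. The sketch below assumes a non-greedy match and an escaped dot; `count_text` and the sample text are made up for illustration and are not taken from a real .frm file.

```python
import re

# Non-greedy ".*?" stops at the first line that is exactly "End";
# the dot in "VB.Form" is escaped so it only matches a literal dot.
GENERATED = re.compile(r'^Begin VB\.Form.*?^End\n', re.DOTALL | re.MULTILINE)

def count_text(text):
    """Count the lines left in text once the generated block is removed."""
    return GENERATED.sub('', text).count('\n')

# Made-up sample: the indented "End" belongs to a nested control
sample = (
    "VERSION 5.00\n"
    "Begin VB.Form Form1\n"
    "   Begin VB.CommandButton Command1\n"
    "   End\n"
    "End\n"
    "Private Sub Form_Load()\n"
    "End Sub\n"
)
print(count_text(sample))  # 3
```

The indented "End" and the "End Sub" line are both left alone, because the pattern only accepts "End" at the start of a line and immediately followed by a newline.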



Re: Method needed for skipping lines

2007-10-31 Thread Paul Hankin
On Oct 31, 5:02 pm, Gustaf <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> Just for fun, I'm working on a script to count the number of lines in source 
> files. Some lines are auto-generated (by the IDE) and shouldn't be counted. 
> The auto-generated part of files start with "Begin VB.Form" and end with 
> "End" (first thing on the line). The "End" keyword may appear inside the 
> auto-generated part, but not at the beginning of the line.
>
> I imagine having a flag variable to tell whether you're inside the 
> auto-generated part, but I wasn't able to figure out exactly how. Here's the 
> function, without the ability to skip auto-generated code:
>
> # Count the lines of source code in the file
> def count_lines(f):
>   file = open(f, 'r')
>   rows = 0
>   for line in file:
>     rows = rows + 1
>   return rows
>
> How would you modify this to exclude lines between "Begin VB.Form" and "End" 
> as described above?

First, your function can be written much more compactly:
def count_lines(f):
    return sum(1 for line in open(f, 'r'))


Anyway, to answer your question, write a function that omits the lines
you want excluded:

def omit_generated_lines(lines):
    in_generated = False
    for line in lines:
        # rstrip only: an indented "End" must not close the block
        line = line.rstrip()
        in_generated = in_generated or line.startswith('Begin VB.Form')
        if not in_generated:
            yield line
        in_generated = in_generated and not line.startswith('End')

And count the remaining ones...

def count_lines(filename):
    return sum(1 for line in omit_generated_lines(open(filename, 'r')))

--
Paul Hankin



Re: Method needed for skipping lines

2007-10-31 Thread Bruno Desthuilliers
Gustaf wrote:
> Hi all,
> 
> Just for fun, I'm working on a script to count the number of lines in 
> source files. Some lines are auto-generated (by the IDE) and shouldn't 
> be counted. The auto-generated part of files start with "Begin VB.Form" 
> and end with "End" (first thing on the line). The "End" keyword may 
> appear inside the auto-generated part, but not at the beginning of the 
> line.
> 
> I imagine having a flag variable to tell whether you're inside the 
> auto-generated part, but I wasn't able to figure out exactly how. Here's 
> the function, without the ability to skip auto-generated code:
> 
> # Count the lines of source code in the file
> def count_lines(f):
>  file = open(f, 'r')

1/ The param name is not very explicit.
2/ You're shadowing the builtin file type.
3/ It might be better to pass an opened file object instead - this would 
make your function more generic (ok, perhaps a bit overkill here, but 
still a better practice IMHO).
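To illustrate point 3, here is a minimal sketch of a counter that takes any iterable of lines instead of a path. The signature is an assumption made for illustration, not something from the original post.

```python
import io

def count_lines(lines):
    """Count the items yielded by any iterable of lines
    (an open file, a list of strings, an io.StringIO, ...)."""
    return sum(1 for _ in lines)

# Works on an in-memory buffer, no file on disk needed
print(count_lines(io.StringIO("one\ntwo\nthree\n")))  # 3
```

The caller then decides where the lines come from, which also makes the function easy to test.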

>  rows = 0

Shouldn't that be something like 'line_count' ?

>  for line in file:
>    rows = rows + 1

Use augmented assignment instead:
  rows += 1

>  return rows

You forgot to close the file.
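One way to make sure the file gets closed is the `with` statement (on Python 2.5 it needs `from __future__ import with_statement`). A minimal sketch, keeping the original function name:

```python
def count_lines(path):
    # "with" closes the file when the block is left, even on an exception
    with open(path) as source:
        return sum(1 for _ in source)
```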

> How would you modify this to exclude lines between "Begin VB.Form" and 
> "End" as described above?

Here's a straightforward solution:

def count_loc(path):
    loc_count = 0
    in_form = False
    opened_file = open(path)
    try:
        # stripping lines, and skipping blank lines
        for raw_line in opened_file:
            line = raw_line.strip()
            # skipping blank lines
            if not line:
                continue
            # skipping VB comments
            # XXX: comment mark should not be hardcoded
            if line.startswith("'"):
                continue
            # skipping autogenerated code
            if raw_line.startswith("Begin VB.Form"):
                in_form = True
                continue
            elif in_form:
                # only an unindented "End" closes the form block
                if raw_line.startswith("End"):
                    in_form = False
                continue
            # Still here ? ok, we count this one
            loc_count += 1
    finally:
        opened_file.close()
    return loc_count

HTH

PS : If you prefer a more functional approach
(warning: the following code may permanently damage innocent minds):

def chain(*predicates):
    def _chained(arg):
        for p in predicates:
            if not p(arg):
                return False
        return True
    return _chained

def not_(predicate):
    def _not_(arg):
        return not predicate(arg)
    return _not_

class InGroupPredicate(object):
    def __init__(self, begin_group, end_group):
        self.in_group = False
        self.begin_group = begin_group
        self.end_group = end_group

    def __call__(self, line):
        if self.begin_group(line):
            self.in_group = True
            return True
        elif self.in_group and self.end_group(line):
            self.in_group = False
            return True # this one too is part of the group
        return self.in_group

def count_locs(lines, count_line):
    # rstrip only: the group predicates need the leading whitespace
    # list() because filter() returns an iterator on Python 3
    return len(list(filter(
        chain(lambda line: bool(line), count_line),
        map(str.rstrip, lines)
        )))

def count_vb_locs(lines):
    return count_locs(lines, chain(
        not_(InGroupPredicate(
            lambda line: line.startswith('Begin VB.Form'),
            lambda line: line.startswith('End')
            )),
        lambda line: not line.lstrip().startswith("'")
        ))

# and finally our count_lines function, greatly simplified !-)
def count_lines(path):
    f = open(path)
    try:
        return count_vb_locs(f)
    finally:
        f.close()

(anyone up for doing it with itertools ?-)
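For what it's worth, here is one possible itertools take, offered as a sketch rather than a definitive answer. It assumes at most one "Begin VB.Form" block per file and that VB comments start with an apostrophe; `without_form_block` is a name made up for illustration.

```python
import itertools

def without_form_block(lines):
    """Yield the lines outside the first Begin VB.Form ... End block."""
    it = iter(lines)
    # Lines before the block; takewhile also consumes the "Begin" line
    head = itertools.takewhile(
        lambda line: not line.startswith('Begin VB.Form'), it)
    # Drop everything up to the first unindented "End", then skip that
    # "End" line itself with islice
    tail = itertools.islice(
        itertools.dropwhile(lambda line: not line.startswith('End'), it),
        1, None)
    return itertools.chain(head, tail)

def count_vb_locs(lines):
    kept = (line.strip() for line in without_form_block(lines))
    # Blank lines and comment lines are not counted
    return sum(1 for line in kept if line and not line.startswith("'"))

# Made-up sample, not a real .frm file
sample = [
    "VERSION 5.00\n",
    "Begin VB.Form Form1\n",
    "   Begin VB.CommandButton Command1\n",
    "   End\n",
    "End\n",
    "\n",
    "' a comment\n",
    "Private Sub Form_Load()\n",
    "End Sub\n",
]
print(count_vb_locs(sample))  # 3
```

Both `head` and `tail` pull from the same iterator, so this relies on `itertools.chain` exhausting `head` before it touches `tail`.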


Re: Method needed for skipping lines

2007-10-31 Thread Yu-Xi Lim
Gustaf wrote:
> Hi all,
> 
> Just for fun, I'm working on a script to count the number of lines in 
> source files. Some lines are auto-generated (by the IDE) and shouldn't 
> be counted. The auto-generated part of files start with "Begin VB.Form" 
> and end with "End" (first thing on the line). The "End" keyword may 
> appear inside the auto-generated part, but not at the beginning of the 
> line.
> 
> I imagine having a flag variable to tell whether you're inside the 
> auto-generated part, but I wasn't able to figure out exactly how. Here's 
> the function, without the ability to skip auto-generated code:
> 
> # Count the lines of source code in the file
> def count_lines(f):
>  file = open(f, 'r')
>  rows = 0
>  for line in file:
>    rows = rows + 1
>  return rows
> 
> How would you modify this to exclude lines between "Begin VB.Form" and 
> "End" as described above?
> Gustaf

David Mertz's Text Processing in Python might give you some more 
efficient (and interesting) ways of approaching the problem.

http://gnosis.cx/TPiP/


Re: Method needed for skipping lines

2007-10-31 Thread Marc 'BlackJack' Rintsch
On Wed, 31 Oct 2007 18:02:26 +0100, Gustaf wrote:

> Just for fun, I'm working on a script to count the number of lines in
> source files. Some lines are auto-generated (by the IDE) and shouldn't be
> counted. The auto-generated part of files start with "Begin VB.Form" and
> end with "End" (first thing on the line). The "End" keyword may appear
> inside the auto-generated part, but not at the beginning of the line.
> 
> I imagine having a flag variable to tell whether you're inside the
> auto-generated part, but I wasn't able to figure out exactly how. Here's
> the function, without the ability to skip auto-generated code:
> 
> # Count the lines of source code in the file
> def count_lines(f):
>   file = open(f, 'r')
>   rows = 0
>   for line in file:
>     rows = rows + 1
>   return rows
>   return rows
> 
> How would you modify this to exclude lines between "Begin VB.Form" and
> "End" as described above? 

Introduce the flag and look up the docs for the `startswith()` method on
strings.
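A minimal sketch of that idea, assuming the function receives an iterable of lines and that only an "End" at the very start of a line closes the block (the flag name `in_generated` is made up for illustration):

```python
def count_lines(lines):
    """Count lines, skipping a Begin VB.Form ... End block."""
    rows = 0
    in_generated = False  # the flag: are we inside the generated part?
    for line in lines:
        if in_generated:
            # Only an "End" at the start of the line closes the block
            if line.startswith('End'):
                in_generated = False
        elif line.startswith('Begin VB.Form'):
            in_generated = True
        else:
            rows += 1
    return rows

# Made-up sample, not a real .frm file
sample = [
    "VERSION 5.00\n",
    "Begin VB.Form Form1\n",
    "   Begin VB.CommandButton Command1\n",
    "   End\n",
    "End\n",
    "Private Sub Form_Load()\n",
    "End Sub\n",
]
print(count_lines(sample))  # 3
```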

Ciao,
Marc 'BlackJack' Rintsch