Re: Is there a better/simpler way to filter blank lines?
On Wed, 05 Nov 2008 14:39:36 +1100, Ben Finney wrote:

> Marc 'BlackJack' Rintsch [EMAIL PROTECTED] writes:
>
>> On Wed, 05 Nov 2008 13:18:27 +1100, Ben Finney wrote:
>>
>>> Marc 'BlackJack' Rintsch [EMAIL PROTECTED] writes:
>>> Your example shows only that they're important for grouping the
>>> expression from surrounding syntax. As I said.
>>>
>>> They are *not* important for making the expression be a generator
>>> expression in the first place. Parentheses are irrelevant for the
>>> generator expression syntax.
>>
>> Okay, technically correct, but parentheses belong to generator
>> expressions because they have to be there to separate them from
>> surrounding syntax, except when there are already enclosing
>> parentheses. So parentheses are tied to generator expression syntax.
>
> No, I think that's factually wrong *and* confusing.
>
>     >>> list(i + 7 for i in range(10))
>     [7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
>
> Does this demonstrate that parentheses are “tied to” integer literal
> syntax? No.

You can use integer literals without parentheses, like the 7 above, but
you can't use generator expressions without them. They are always
there. In that way parentheses are tied to generator expressions.

If I see the pattern ``f(x) for x in obj if c(x)`` I check whether it is
enclosed in parentheses or brackets to decide whether it is a generator
expression or a list comprehension. That may not reflect the formal
grammar, but it is IMHO the easiest and most pragmatic way to look at
this as a human programmer.

Ciao,
    Marc 'BlackJack' Rintsch
--
http://mail.python.org/mailman/listinfo/python-list
Re: Is there a better/simpler way to filter blank lines?
On Tue, 04 Nov 2008 15:36:23 -0600, Larry Bates [EMAIL PROTECTED] wrote:

> [EMAIL PROTECTED] wrote:
>> tmallen:
>>> I'm parsing some text files, and I want to strip blank lines in the
>>> process. Is there a simpler way to do this than what I have here?
>>>     lines = filter(lambda line: len(line.strip()) > 0, lines)
...
> Or if you want to filter/loop at the same time, or if you don't want
> all the lines in memory at the same time:

Or if you want to support potentially infinite input streams, such as a
pipe or socket. There are many reasons this is my preferred way of
going through a text file.

>     fp = open(filename, 'r')
>     for line in fp:
>         if not line.strip():
>             continue
>         #
>         # Do something with the non-blank line:
>         #
>     fp.close()

Often, you want to at least rstrip() all lines anyway, for other
reasons, and then the extra cost is even less:

    line = line.rstrip()
    if not line: continue
    # do something with the rstripped, nonblank lines

/Jorgen

--
  // Jorgen Grahn <grahn@        Ph'nglui mglw'nafh Cthulhu
\X/     snipabacken.se>          R'lyeh wgah'nagl fhtagn!
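The loop-and-rstrip pattern above can also be wrapped in a generator function, so that callers get the stripped, non-blank lines one at a time. This is only a sketch written for Python 3: the name `nonblank_lines` is hypothetical and does not come from the thread, and `io.StringIO` merely stands in for a real file, pipe, or socket file object.

```python
import io


def nonblank_lines(stream):
    """Yield each line of `stream` with trailing whitespace removed,
    skipping lines that are empty or whitespace-only."""
    for line in stream:
        line = line.rstrip()
        if not line:
            continue
        yield line


# io.StringIO is a stand-in for an open text file or other line stream.
sample = io.StringIO("alpha\n\n  \nbeta  \n")
result = list(nonblank_lines(sample))
print(result)  # ['alpha', 'beta']
```

Because the function only ever holds one line at a time, it should behave the same on an input stream that never ends, as long as the caller stops iterating at some point.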
Re: Is there a better/simpler way to filter blank lines?
Why do I feel like the coding style in Lutz' Programming Python is very
far from idiomatic Python? The content feels dated, and I find that
most answers that I get for Python questions use a different style from
the sort of code I see in this book.

Thomas

On Nov 5, 7:15 am, Jorgen Grahn [EMAIL PROTECTED] wrote:
> On Tue, 04 Nov 2008 15:36:23 -0600, Larry Bates [EMAIL PROTECTED] wrote:
>> [EMAIL PROTECTED] wrote:
>>> tmallen:
>>>> I'm parsing some text files, and I want to strip blank lines in the
>>>> process. Is there a simpler way to do this than what I have here?
>>>>     lines = filter(lambda line: len(line.strip()) > 0, lines)
> ...
>> Or if you want to filter/loop at the same time, or if you don't want
>> all the lines in memory at the same time:
>
> Or if you want to support potentially infinite input streams, such as
> a pipe or socket. There are many reasons this is my preferred way of
> going through a text file.
>
>>     fp = open(filename, 'r')
>>     for line in fp:
>>         if not line.strip():
>>             continue
>>         #
>>         # Do something with the non-blank line:
>>         #
>>     fp.close()
>
> Often, you want to at least rstrip() all lines anyway, for other
> reasons, and then the extra cost is even less:
>
>     line = line.rstrip()
>     if not line: continue
>     # do something with the rstripped, nonblank lines
>
> /Jorgen
Re: Is there a better/simpler way to filter blank lines?
On Nov 5, 4:56 pm, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote:
> On Wed, 05 Nov 2008 14:39:36 +1100, Ben Finney wrote:
>> Marc 'BlackJack' Rintsch [EMAIL PROTECTED] writes:
>>> On Wed, 05 Nov 2008 13:18:27 +1100, Ben Finney wrote:
>>>> Marc 'BlackJack' Rintsch [EMAIL PROTECTED] writes:
>>>> Your example shows only that they're important for grouping the
>>>> expression from surrounding syntax. As I said.
>>>>
>>>> They are *not* important for making the expression be a generator
>>>> expression in the first place. Parentheses are irrelevant for the
>>>> generator expression syntax.
>>>
>>> Okay, technically correct, but parentheses belong to generator
>>> expressions because they have to be there to separate them from
>>> surrounding syntax, except when there are already enclosing
>>> parentheses. So parentheses are tied to generator expression syntax.
>>
>> No, I think that's factually wrong *and* confusing.
>>
>>     >>> list(i + 7 for i in range(10))
>>     [7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
>>
>> Does this demonstrate that parentheses are “tied to” integer literal
>> syntax? No.
>
> You can use integer literals without parentheses, like the 7 above,
> but you can't use generator expressions without them. They are always
> there. In that way parentheses are tied to generator expressions.
>
> If I see the pattern ``f(x) for x in obj if c(x)`` I check whether it
> is enclosed in parentheses or brackets to decide whether it is a
> generator expression or a list comprehension. That may not reflect the
> formal grammar, but it is IMHO the easiest and most pragmatic way to
> look at this as a human programmer.
>
> Ciao,
>     Marc 'BlackJack' Rintsch

The situation is similar to tuples. What makes a tuple is the commas,
not the parens. What makes a generator expression is
``exp for var-or-tuple in exp``. Parentheses are generally required
because without them it is almost impossible to distinguish the
expression from the surrounding syntax. But they are not part of the
formally required syntax.
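The tuple comparison above can be tried directly. The following is a small Python 3 sketch with made-up variable names; it shows only what the interpreter accepts and does not settle what the formal grammar says.

```python
# Commas build the tuple; the parentheses are optional here.
pair = 1, 2
single = 7,

# Parentheses on their own only group: this is the int 7, not a tuple.
grouped = (7)

# In a call with a single argument, the call's own parentheses are
# enough to enclose the generator expression.
total = sum(i + 7 for i in range(10))

print(pair, single, grouped, total)  # (1, 2) (7,) 7 115
```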
Re: Is there a better/simpler way to filter blank lines?
Lie [EMAIL PROTECTED] writes:

> What makes a generator expression is ``exp for var-or-tuple in exp``.
> Parentheses are generally required because without them it is almost
> impossible to distinguish the expression from the surrounding syntax.
> But they are not part of the formally required syntax.

...

But *every* generator expression is surrounded by parentheses, isn't it?

--
Arnaud
Re: Is there a better/simpler way to filter blank lines?
On Wed, 05 Nov 2008 21:23:57 +, Arnaud Delobelle wrote:

> Lie [EMAIL PROTECTED] writes:
>
>> What makes a generator expression is ``exp for var-or-tuple in exp``.
>> Parentheses are generally required because without them it is almost
>> impossible to distinguish the expression from the surrounding syntax.
>> But they are not part of the formally required syntax.
>
> ...
>
> But *every* generator expression is surrounded by parentheses, isn't
> it?

Yes, but sometimes they are there in order to call a function, not to
form the generator expression.

I'm surprised that nobody yet has RTFM:

http://docs.python.org/reference/expressions.html

[quote]
A generator expression is a compact generator notation in parentheses:

    generator_expression ::=  "(" expression genexpr_for ")"
    genexpr_for          ::=  "for" target_list "in" or_test [genexpr_iter]
    genexpr_iter         ::=  genexpr_for | genexpr_if
    genexpr_if           ::=  "if" old_expression [genexpr_iter]

...
The parentheses can be omitted on calls with only one argument.
[end quote]

It seems to me that the FM says that the parentheses *are* part of the
syntax for a generator expression, but if some other syntactic
construct (e.g. a function call) provides the parentheses, then you
don't need to supply a second, redundant, pair.

I believe that this is the definitive answer, short of somebody reading
the source code and claiming the documentation is wrong.

--
Steven
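The "calls with only one argument" rule quoted above can be checked with a short sketch. It assumes Python 3, and `is_valid_expression` is a hypothetical helper introduced only for this illustration.

```python
def is_valid_expression(source):
    """Return True if `source` compiles as a single Python expression."""
    try:
        compile(source, "<demo>", "eval")
    except SyntaxError:
        return False
    return True


# Sole argument: the call's parentheses also serve the generator expression.
sole_argument = sum(x * x for x in range(4))

# With a second argument, the generator expression needs its own pair.
with_start = sum((x * x for x in range(4)), 10)

# Leaving that pair out is rejected by the compiler.
bare_with_second_arg = is_valid_expression("sum(x * x for x in range(4), 10)")

print(sole_argument, with_start, bare_with_second_arg)  # 14 24 False
```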
Re: Is there a better/simpler way to filter blank lines?
Arnaud Delobelle wrote:
> Lie [EMAIL PROTECTED] writes:
>
>> What makes a generator expression is ``exp for var-or-tuple in exp``.
>> Parentheses are generally required because without them it is almost
>> impossible to distinguish the expression from the surrounding syntax.
>> But they are not part of the formally required syntax.
>
> ...
>
> But *every* generator expression is surrounded by parentheses, isn't
> it?

Indeed, the syntax production is:

    generator_expression ::=  "(" expression genexpr_for ")"

albeit with the note:

    The parentheses can be omitted on calls with only one argument.
    See section 5.3.4 for the detail.

but that only means you don't need a second set of parentheses. A
generator expression is always enclosed in parentheses; the same is NOT
true of a tuple.
Re: Is there a better/simpler way to filter blank lines?
Ben Finney wrote:
> Falcolas writes:
>
>> Using the surrounding parentheses creates a generator object
>
> No. Using the generator expression syntax creates a generator object.
>
> Parentheses are irrelevant to whether the expression is a generator
> expression. The parentheses merely group the expression from
> surrounding syntax.

As others have pointed out, the parentheses are part of the generator
syntax. If not for the parentheses, a list comprehension would be
indistinguishable from a list literal with a single element, a
generator object. It's also worth remembering that list comprehensions
are distinct from generator expressions and don't require the creation
of a generator object.

-Miles
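The distinction drawn above, between a list comprehension and a list literal whose single element is a generator, can be seen in a few lines. This is an illustrative Python 3 sketch with made-up names.

```python
import types

# List comprehension: the brackets belong to the comprehension syntax.
squares = [x * x for x in range(3)]

# List literal containing one parenthesised generator expression.
wrapped = [(x * x for x in range(3))]

print(squares)                                       # [0, 1, 4]
print(len(wrapped))                                  # 1
print(isinstance(wrapped[0], types.GeneratorType))   # True
```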
Re: Is there a better/simpler way to filter blank lines?
Steven D'Aprano [EMAIL PROTECTED] writes:

> I'm surprised that nobody yet has RTFM:
>
> http://docs.python.org/reference/expressions.html
>
>     generator_expression ::=  "(" expression genexpr_for ")"
> ...
> The parentheses can be omitted on calls with only one argument.

It's a fair cop. Thanks for setting me straight.

--
 \       “We can't depend for the long run on distinguishing one |
  `\   bitstream from another in order to figure out which rules |
_o__)       apply.” —Eben Moglen, _Anarchism Triumphant_, 1999 |
Ben Finney
Re: Is there a better/simpler way to filter blank lines?
tmallen:
> I'm parsing some text files, and I want to strip blank lines in the
> process. Is there a simpler way to do this than what I have here?
>     lines = filter(lambda line: len(line.strip()) > 0, lines)

    xlines = (line for line in open(filename) if line.strip())

Bye,
bearophile
Re: Is there a better/simpler way to filter blank lines?
[EMAIL PROTECTED] wrote:
> tmallen:
>> I'm parsing some text files, and I want to strip blank lines in the
>> process. Is there a simpler way to do this than what I have here?
>>     lines = filter(lambda line: len(line.strip()) > 0, lines)
>
>     xlines = (line for line in open(filename) if line.strip())
>
> Bye,
> bearophile

Or if you want to filter/loop at the same time, or if you don't want
all the lines in memory at the same time:

    fp = open(filename, 'r')
    for line in fp:
        if not line.strip():
            continue
        #
        # Do something with the non-blank line:
        #
    fp.close()

-Larry
Re: Is there a better/simpler way to filter blank lines?
On Nov 4, 4:30 pm, [EMAIL PROTECTED] wrote:
> tmallen:
>> I'm parsing some text files, and I want to strip blank lines in the
>> process. Is there a simpler way to do this than what I have here?
>>     lines = filter(lambda line: len(line.strip()) > 0, lines)
>
>     xlines = (line for line in open(filename) if line.strip())
>
> Bye,
> bearophile

I must be missing something:

    >>> xlines = (line for line in open("new.data") if line.strip())
    >>> xlines
    <generator object at 0x6b648>
    >>> xlines.sort()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'generator' object has no attribute 'sort'

What do you think?

Thomas
Re: Is there a better/simpler way to filter blank lines?
On Tue, 04 Nov 2008 13:27:00 -0800, tmallen wrote:

> I'm parsing some text files, and I want to strip blank lines in the
> process. Is there a simpler way to do this than what I have here?
>
>     lines = filter(lambda line: len(line.strip()) > 0, lines)
>
> Thomas

    lines = filter(lambda line: line.strip(), lines)

--
Steven
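The shorter `filter` call works because a string that is empty after `strip()` is falsy. The sketch below assumes Python 3, where `filter` returns a lazy iterator rather than the list that the Python 2 of this thread returned, hence the explicit `list(...)`. The sample data is made up.

```python
lines = ["first\n", "\n", "   \n", "second\n"]

# An empty string is falsy, so line.strip() alone serves as the test.
kept = list(filter(lambda line: line.strip(), lines))

# The same filter written as a list comprehension.
same = [line for line in lines if line.strip()]

print(kept)          # ['first\n', 'second\n']
print(kept == same)  # True
```

Since `str.strip` is itself a one-argument callable, `filter(str.strip, lines)` would select the same items without the lambda.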
Re: Is there a better/simpler way to filter blank lines?
On Tue, Nov 4, 2008 at 2:30 PM, tmallen [EMAIL PROTECTED] wrote:
> On Nov 4, 4:30 pm, [EMAIL PROTECTED] wrote:
>> tmallen:
>>> I'm parsing some text files, and I want to strip blank lines in the
>>> process. Is there a simpler way to do this than what I have here?
>>>     lines = filter(lambda line: len(line.strip()) > 0, lines)
>>
>>     xlines = (line for line in open(filename) if line.strip())
>>
>> Bye,
>> bearophile
>
> I must be missing something:
>
>     >>> xlines = (line for line in open("new.data") if line.strip())
>     >>> xlines
>     <generator object at 0x6b648>
>     >>> xlines.sort()
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     AttributeError: 'generator' object has no attribute 'sort'
>
> What do you think?

xlines is a generator, not a list. If you don't know what a generator
is, see the relevant parts of the Python tutorial/manual (Google is
your friend).
To sort the generator, you can use 'sorted(xlines)'.
If you need it to actually be a list, you can do 'list(xlines)'.

Cheers,
Chris
--
Follow the path of the Iguana...
http://rebertia.com

> Thomas
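Both suggestions, `sorted(xlines)` and `list(xlines)`, can be tried on a small in-memory stand-in for the file. This sketch assumes Python 3; the sample text is invented and `io.StringIO` replaces `open("new.data")`.

```python
import io

# Stand-in for the text file from the thread.
source = io.StringIO("pear\n\napple\n\nfig\n")

xlines = (line for line in source if line.strip())

# sorted() accepts any iterable and returns a new list.
ordered = sorted(xlines)
print(ordered)   # ['apple\n', 'fig\n', 'pear\n']

# The generator is now exhausted, so a second pass yields nothing.
leftover = list(xlines)
print(leftover)  # []
```

The second result is worth noting: a generator can be consumed only once, so if both the sorted and the original order are needed, it is safer to call `list(xlines)` first and sort a copy.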
Re: Is there a better/simpler way to filter blank lines?
Larry Bates [EMAIL PROTECTED] writes:

> [EMAIL PROTECTED] wrote:
>>     xlines = (line for line in open(filename) if line.strip())
>
> Or if you want to filter/loop at the same time, or if you don't want
> all the lines in memory at the same time

The above implementation creates a generator; so it, too, won't need to
load all the lines in memory at the same time.

--
 \     “Program testing can be a very effective way to show the |
  `\    presence of bugs, but is hopelessly inadequate for showing |
_o__)                    their absence.” —Edsger W. Dijkstra |
Ben Finney
Re: Is there a better/simpler way to filter blank lines?
tmallen [EMAIL PROTECTED] writes:

> On Nov 4, 4:30 pm, [EMAIL PROTECTED] wrote:
>>     xlines = (line for line in open(filename) if line.strip())
>
> I must be missing something:
>
>     >>> xlines = (line for line in open("new.data") if line.strip())
>     >>> xlines
>     <generator object at 0x6b648>

A generator <URL:http://www.python.org/dev/peps/pep-0255> produces a
sequence of items, but is not a collection. It will generate each item
on request, rather than having them all in memory at once.

    for line in xlines:
        do_something_knowing_the_line_is_not_blank(line)

If you later *want* a collection containing all the items from the
generator, you can feed the generator (or any iterable) to a type that
can turn it into a collection. For example, to get all the filtered
lines as a list:

    all_lines = list(xlines)

Note that some generators (not this one, which will end because the
file is of finite size) never end, so feeding them to a constructor this
way will never return.

--
 \     “It is far better to grasp the universe as it really is than to |
  `\     persist in delusion, however satisfying and reassuring.” —Carl |
_o__)                                                            Sagan |
Ben Finney
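The warning about generators that never end can be illustrated with a deliberately infinite one. This is a sketch only: `naturals` is a hypothetical example generator, and `itertools.islice` is one standard way to take a bounded slice without calling `list()` on the whole thing.

```python
import itertools


def naturals():
    """Yield 0, 1, 2, ... without ever finishing."""
    n = 0
    while True:
        yield n
        n += 1


# list(naturals()) would never return; islice stops after five items.
first_five = list(itertools.islice(naturals(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```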
Re: Is there a better/simpler way to filter blank lines?
tmallen:
> I must be missing something:
>
>     >>> xlines = (line for line in open("new.data") if line.strip())
>     >>> xlines
>     <generator object at 0x6b648>
>     >>> xlines.sort()
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     AttributeError: 'generator' object has no attribute 'sort'
>
> What do you think?

Congratulations, you have just met your first lazy construct ^_^
That's a generator; it yields nonblank lines one after the other. This
can be really useful.

If you want a real array of items, then you can do this:

    lines = list(xlines)

Or use a list comp.:

    lines = [line for line in open("new.data") if line.strip()]

Bye,
bearophile
Re: Is there a better/simpler way to filter blank lines?
On Nov 4, 3:30 pm, tmallen [EMAIL PROTECTED] wrote:
> On Nov 4, 4:30 pm, [EMAIL PROTECTED] wrote:
>> tmallen:
>>> I'm parsing some text files, and I want to strip blank lines in the
>>> process. Is there a simpler way to do this than what I have here?
>>>     lines = filter(lambda line: len(line.strip()) > 0, lines)
>>
>>     xlines = (line for line in open(filename) if line.strip())
>>
>> Bye,
>> bearophile
>
> I must be missing something:
>
>     >>> xlines = (line for line in open("new.data") if line.strip())
>     >>> xlines
>     <generator object at 0x6b648>
>     >>> xlines.sort()
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     AttributeError: 'generator' object has no attribute 'sort'
>
> What do you think?
>
> Thomas

Using the surrounding parentheses creates a generator object, whereas
using square brackets would create a list. So, if you want to run list
operations on the resulting object, you'll want to use the list
comprehension instead.

i.e.

    list_o_lines = [line for line in open(filename) if line.strip()]

The downside is the increased memory usage and processing time, since
you load the entire file into memory; if you only plan to do a
``for line in xlines:`` operation, it would be faster to use the
generator.
Re: Is there a better/simpler way to filter blank lines?
Between this info and
http://www.python.org/doc/2.5.2/tut/node11.html#SECTION0011100 ,
I'm starting to understand how I'll use generators (I've seen them
mentioned before, but never used them knowingly).

> list_o_lines = [line for line in open(filename) if line.strip()]

+1 for list_o_lines

Thanks for the help!

Thomas

On Nov 4, 6:36 pm, Falcolas [EMAIL PROTECTED] wrote:
> On Nov 4, 3:30 pm, tmallen [EMAIL PROTECTED] wrote:
>> On Nov 4, 4:30 pm, [EMAIL PROTECTED] wrote:
>>> tmallen:
>>>> I'm parsing some text files, and I want to strip blank lines in the
>>>> process. Is there a simpler way to do this than what I have here?
>>>>     lines = filter(lambda line: len(line.strip()) > 0, lines)
>>>
>>>     xlines = (line for line in open(filename) if line.strip())
>>>
>>> Bye,
>>> bearophile
>>
>> I must be missing something:
>>
>>     >>> xlines = (line for line in open("new.data") if line.strip())
>>     >>> xlines
>>     <generator object at 0x6b648>
>>     >>> xlines.sort()
>>     Traceback (most recent call last):
>>       File "<stdin>", line 1, in <module>
>>     AttributeError: 'generator' object has no attribute 'sort'
>>
>> What do you think?
>>
>> Thomas
>
> Using the surrounding parentheses creates a generator object, whereas
> using square brackets would create a list. So, if you want to run list
> operations on the resulting object, you'll want to use the list
> comprehension instead.
>
> i.e.
>
>     list_o_lines = [line for line in open(filename) if line.strip()]
>
> Downside is the increased memory usage and processing time as you dump
> the entire file into memory, whereas if you plan to do a
> "for line in xlines:" operation, it would be faster to use the
> generator.
Re: Is there a better/simpler way to filter blank lines?
Falcolas [EMAIL PROTECTED] writes:

> Using the surrounding parentheses creates a generator object

No. Using the generator expression syntax creates a generator object.

Parentheses are irrelevant to whether the expression is a generator
expression. The parentheses merely group the expression from
surrounding syntax.

--
 \     “bash awk grep perl sed, df du, du-du du-du, vi troff su fsck |
  `\                 rm * halt LART LART LART!” —The Swedish BOFH, |
_o__)                                         alt.sysadmin.recovery |
Ben Finney
Re: Is there a better/simpler way to filter blank lines?
tmallen wrote:
> On Nov 4, 4:30 pm, [EMAIL PROTECTED] wrote:
>> tmallen:
>>> I'm parsing some text files, and I want to strip blank lines in the
>>> process. Is there a simpler way to do this than what I have here?
>>>     lines = filter(lambda line: len(line.strip()) > 0, lines)
>>
>>     xlines = (line for line in open(filename) if line.strip())
>>
>> Bye,
>> bearophile
>
> I must be missing something:
>
>     >>> xlines = (line for line in open("new.data") if line.strip())
>     >>> xlines
>     <generator object at 0x6b648>
>     >>> xlines.sort()
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     AttributeError: 'generator' object has no attribute 'sort'
>
> What do you think?

I think there'd be no advantage to a sort method on a generator, since
theoretically the last item could be the first required in the sorted
sequence, so it's necessary to hold all items in memory to ensure the
sort is correct. So there's no point using a generator in the first
place.

regards
 Steve
--
Steve Holden
Holden Web LLC              http://www.holdenweb.com/
Re: Is there a better/simpler way to filter blank lines?
On Wed, 05 Nov 2008 12:06:42 +1100, Ben Finney wrote:

> Falcolas [EMAIL PROTECTED] writes:
>
>> Using the surrounding parentheses creates a generator object
>
> No. Using the generator expression syntax creates a generator object.
>
> Parentheses are irrelevant to whether the expression is a generator
> expression. The parentheses merely group the expression from
> surrounding syntax.

No, they are important:

    In [270]: a = x for x in xrange(10)
       File "<ipython console>", line 1
         a = x for x in xrange(10)
               ^
    <type 'exceptions.SyntaxError'>: invalid syntax

    In [271]: a = (x for x in xrange(10))

Ciao,
    Marc 'BlackJack' Rintsch
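The IPython session above can be reproduced without an interactive prompt by asking the compiler directly. This sketch assumes Python 3, so it uses `range` where the session used `xrange`; `compiles` is a hypothetical helper name.

```python
def compiles(source):
    """Return True if `source` is accepted by the Python compiler."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError:
        return False
    return True


bare = compiles("a = x for x in range(10)")
parenthesised = compiles("a = (x for x in range(10))")

print(bare, parenthesised)  # False True
```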
Re: Is there a better/simpler way to filter blank lines?
Marc 'BlackJack' Rintsch [EMAIL PROTECTED] writes:

> On Wed, 05 Nov 2008 12:06:42 +1100, Ben Finney wrote:
>
>> Falcolas [EMAIL PROTECTED] writes:
>>
>>> Using the surrounding parentheses creates a generator object
>>
>> No. Using the generator expression syntax creates a generator
>> object.
>>
>> Parentheses are irrelevant to whether the expression is a generator
>> expression. The parentheses merely group the expression from
>> surrounding syntax.
>
> No, they are important:

Your example shows only that they're important for grouping the
expression from surrounding syntax. As I said.

They are *not* important for making the expression be a generator
expression in the first place. Parentheses are irrelevant for the
generator expression syntax.

--
 \        “Today, I was — no, that wasn't me.” —Steven Wright |
  `\                                                           |
_o__)                                                          |
Ben Finney
Re: Is there a better/simpler way to filter blank lines?
Steve Holden [EMAIL PROTECTED] writes:

> I think there'd be no advantage to a sort method on a generator,
> since theoretically the last item could be the first required in the
> sorted sequence

Worse, generators don't necessarily *have* a finite set of items, and
there's no way in general of telling whether any particular generator
will have a “last item” without trying to get all the items.

So it would be actively harmful to provide such a method on generators,
IMO.

--
 \       “Whatever you do will be insignificant, but it is very |
  `\                  important that you do it.” —Mahatma Gandhi |
_o__)                                                            |
Ben Finney
Re: Is there a better/simpler way to filter blank lines?
On Tue, 04 Nov 2008 20:25:09 -0500, Steve Holden wrote:

> I think there'd be no advantage to a sort method on a generator, since
> theoretically the last item could be the first required in the sorted
> sequence, so it's necessary to hold all items in memory to ensure the
> sort is correct. So there's no point using a generator in the first
> place.

You can't sort something lazily.

Actually, that's not *quite* true: it only holds for comparison sorts.
You can sort lazily using non-comparison sorts, such as Counting Sort:

http://en.wikipedia.org/wiki/Counting_sort

Arguably, the benefit of giving generators a sort() method would be to
avoid an explicit call to list. But I think many people would argue
that was actually a disadvantage, not a benefit, and that the call to
list is a good thing. I'd agree with them.

However, sorted() should take a generator argument, and in fact I see
it does:

    >>> sorted( x+1 for x in (4, 2, 0, 3, 1) )
    [1, 2, 3, 4, 5]

--
Steven
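For readers unfamiliar with counting sort, here is a rough Python 3 sketch of one that reads from a generator. It is an illustration only, not code from the thread: `counting_sort` and its `upper` parameter are hypothetical, and it assumes every item is an int in `range(upper)`. Note that it still reads the whole input before producing its first item; what it avoids is storing the items themselves, keeping one counter per possible value instead.

```python
def counting_sort(items, upper):
    """Yield the ints from `items` in ascending order.

    Assumes every item is an int with 0 <= item < upper.
    """
    counts = [0] * upper
    for item in items:           # the entire input is consumed here
        counts[item] += 1
    for value, count in enumerate(counts):
        for _ in range(count):   # emit each value as often as it was seen
            yield value


result = list(counting_sort((x + 1 for x in (4, 2, 0, 3, 1)), upper=6))
print(result)  # [1, 2, 3, 4, 5]
```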
Re: Is there a better/simpler way to filter blank lines?
On Wed, 05 Nov 2008 13:18:27 +1100, Ben Finney wrote:

> Marc 'BlackJack' Rintsch [EMAIL PROTECTED] writes:
> Your example shows only that they're important for grouping the
> expression from surrounding syntax. As I said.
>
> They are *not* important for making the expression be a generator
> expression in the first place. Parentheses are irrelevant for the
> generator expression syntax.

Okay, technically correct, but parentheses belong to generator
expressions because they have to be there to separate them from
surrounding syntax, except when there are already enclosing
parentheses. So parentheses are tied to generator expression syntax.

Ciao,
    Marc 'BlackJack' Rintsch
Re: Is there a better/simpler way to filter blank lines?
Marc 'BlackJack' Rintsch [EMAIL PROTECTED] writes:

> On Wed, 05 Nov 2008 13:18:27 +1100, Ben Finney wrote:
>
>> Marc 'BlackJack' Rintsch [EMAIL PROTECTED] writes:
>> Your example shows only that they're important for grouping the
>> expression from surrounding syntax. As I said.
>>
>> They are *not* important for making the expression be a generator
>> expression in the first place. Parentheses are irrelevant for the
>> generator expression syntax.
>
> Okay, technically correct, but parentheses belong to generator
> expressions because they have to be there to separate them from
> surrounding syntax, except when there are already enclosing
> parentheses. So parentheses are tied to generator expression syntax.

No, I think that's factually wrong *and* confusing.

    >>> list(i + 7 for i in range(10))
    [7, 8, 9, 10, 11, 12, 13, 14, 15, 16]

Does this demonstrate that parentheses are “tied to” integer literal
syntax? No.

Here, parentheses were used because they're part of the function call
syntax. In your example, parentheses were used as a grouping operator.
In neither case are they “tied to” the generator expression syntax.

It's best to be clear what parentheses *are* used for; they don't
“create a generator” nor are they “tied to” the generator expression
syntax.

--
 \     “In any great organization it is far, far safer to be wrong |
  `\          with the majority than to be right alone.” —John Kenneth |
_o__)                                            Galbraith, 1989-07-28 |
Ben Finney