Re: string interpolation for python
>>> I can't find a way to use an argument more than once,
>>> without switching to "dictionary mode" and using keys for
>>> everything.

Even in "dictionary mode", the key is spelled more than once. The "tuple mode" below seems to save some typing. However, as the number of items to format grows, readability deteriorates quickly in "tuple mode". Using an argument more than once is, after all, probably not a common use case.

The "dictionary mode" greatly enhances readability, but you have to provide a dict object and make sure the names in your format string match the keys in the dictionary. This matching requirement comes from information redundancy in the arrangement, and information redundancy is often the root of various kinds of trouble.

Dynamic strings seem to have the benefits without the redundancy, and there is no need to build a dict for them. However, when a dict is already available, the dict format is clearly the winner.

>> Ack.
>>
>> In this case, you can use format:
>>
>>     >>> "Hello {0}, what's up with {1}? Hey, {0} I'm speaking to you!".format('George', 'Melissa')
>>     "Hello George, what's up with Melissa? Hey, George I'm speaking to you!"
>
> Or don't bother with the initial string, and simply pass everything as
> arguments:
>
>     def yingjie_format(*args):
>         it=iter(args)
>         return ''.join(s%next(it,None) for s in it)
>
>     yingjie_format("sin(%s",x,") = %0.3f", sin(x))
>

That's very nice, thanks!

> So if they're exactly like normal expressions, why not simply use
> normal expressions?

I think using dynamic strings can have these benefits:

1) less typing (fewer keystrokes).
2) better readability (less clutter).
3) you don't have to explicitly convert/format expressions into strings.
4) better performance too (adding strings together is generally slow).

>>> sprintf("UPDATE tablename SET modified=now()%{,%s=:%[0]s%} WHERE
>>> key=%d",array_of_field_names,primary_key_value)
>>> --> "UPDATE tablename SET modified=now(),foo=:foo,bar=:bar,quux=:quux
>>> WHERE key=1234"
>>>
>>> You're still paying for no complexity you aren't actually using.
>>> It's clear and readable.
>>
>> You are really good at that. Maybe not everybody is as
>> experienced as you, and I suppose the learning curve is
>> kind of hard to climb.
>
> Yes, it takes some learning to use it. But that's true of everything,
> no less of your magical string interpolation. My point is that simple
> examples should be (and are, with printf formatting) simple, such that
> you only get those more complicated format strings when you're
> actually doing a complicated job (in that case, taking each element of
> an array and using it twice - actually, it was taking the indices of a
> mapping that would end up being passed to the DB query function, thus
> providing values to the :foo :bar variables).
>

Sure. But if one thing does well in both simple and complex situations, why not use that thing?

>> Those expressions are embedded, you don't need eval()
>> to have the result though. Are we on the same page?
>
> I can see three implementation paths:
>
> 1) Language feature. It really *is* just an expression. There's no way
> that a user can provide them, so there's actually no similarity to
> eval. But this requires that Python itself handle things.
>
> 2) Precompiler. It *becomes* an expression. Again, perfectly safe,
> although I don't know how useful this really is.
>
> 3) Function. As several have suggested, you could do some magic and
> use d("this is a $dollar$ $interpolated$ string") to implement. For
> this, you *will* need eval (or something like it).
>

Sure.

For 1), things can be done most conveniently and efficiently.
For 2), yeah, not sure how useful it is.
For 3), maybe we can let the str class have a property, say dy, which does all the dirty processing. Then we may do:

    >>> x = 0
    >>> "sin($x$) = $sin(x)$".dy
    'sin(0) = 0.0'

Not much burden to use except for the CPU, I suppose.

>> You mean a translator?
>
> Yes. It translates your dollar-strings into something that's legal
> Python 3.3 syntax - either calls to a function like I provided above,
> or actual embedded expressions.
>
>> The syntax is essential for compatibility.
>> We must distinguish dynamic strings from common strings.
>> They will live peacefully together.
>> (escaping the '$' in normal strings breaks compatibility,
>> and the consequence of forgetting to escape could be
>> disastrous, so definitely not an option).
>>
>> May be d" is too tiny, $"..." is easier to pick out.
>
> I don't like the use of symbols like that; can someone, glancing at
> your code, tell whether $ is an operator, a name, or something else?
> The original d is probably better for that.
>

Yeah, the d is probably better.

Cheers,
Yingjie
--
http://mail.python.org/mailman/listinfo/python-list
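For reference, the `str.format` call quoted in this message, where one positional argument is reused by index, can be run as a small script. This is only a restatement of the quoted example; the variable names `template` and `greeting` are introduced here for illustration.

```python
# Reusing a positional argument by index with str.format.
# {0} appears twice in the template but 'George' is passed only once.
template = "Hello {0}, what's up with {1}? Hey, {0} I'm speaking to you!"
greeting = template.format("George", "Melissa")
print(greeting)
```

The index form avoids building a dict, at the cost of the reader having to match numbers to arguments.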
Re: string interpolation for python
>> In that case you should re-think the delimiters, so that you have
>> something that can be nested. An example (example only, I'm not in
>> love with it as a final form):

Haven't really thought about that. Nesting is a big issue if the embedded expression itself contains another dynamic string expression. For example (the first '%' immediately preceding the first quote denotes the start of a dynamic string expression):

    >>> first_name, family_name = "Peter", "Johns"
    >>> %"Hi $ first_name+%", $family_name$" $!"
    'Hi Peter, Johns!'

However, nesting really clutters things up, and we could forbid it without losing power. The above example can be done without nesting:

    >>> %"Hi $ first_name$, $family_name$!"
    'Hi Peter, Johns!'

Can you offer an example where nesting is better or necessary?

Cheers,
Yingjie
Re: string interpolation for python
> Right, meaning that both have the same issues of performance, need for
> str(), etc. There's absolutely no difference.

OK, performance. Here is a new solution:

Suppose we have a new string method

    str.format_join([...])

taking a list of strings and objects, with the even-indexed items being strings and the odd-indexed items being objects. Each even-indexed string *ends* with a formatting specification for the next object in the list. Then we can have

    >>> d"sin($x$) = $ sin(x):0.3f $"

translated to:

    >>> ''.format_join(["sin(%s", x, ") = %0.3f", sin(x)])

This seems to be at least as good in performance.

> And no benefit. You lose out on syntax highlighting
> in your editor and gain nothing.

Gain: readability, terseness, freedom from hassle, and possibly better performance if done right.

Syntax highlighting: can be done more creatively. For dynamic strings, string parts are like normal strings, but the embedded expressions are like normal expressions :)

> sprintf("UPDATE tablename SET modified=now()%{,%s=:%[0]s%} WHERE
> key=%d",array_of_field_names,primary_key_value)
> --> "UPDATE tablename SET modified=now(),foo=:foo,bar=:bar,quux=:quux
> WHERE key=1234"
>
> You're still paying for no complexity you aren't actually using.
> It's clear and readable.

You are really good at that. Maybe not everybody is as experienced as you, and I suppose the learning curve is kind of hard to climb.

> It's powerful only if you use eval to allow full expression syntax.
> Otherwise, what does it have above str.format()?

Those expressions are embedded; you don't need eval() to get the result, though. Are we on the same page?

> You may well be able to get past the compatibility issues. I'm not yet
> convinced that the new syntax is worth it, but it may be possible.
>
> Here's a recommendation: Write a parser for your notation that turns
> it into executable Python code (that is, executable in Python 3.3
> without any d"..." support).

You mean a translator?

The syntax is essential for compatibility. We must distinguish dynamic strings from common strings. They will live peacefully together. (Escaping the '$' in normal strings breaks compatibility, and the consequence of forgetting to escape could be disastrous, so that is definitely not an option.)

Maybe d" is too tiny; $"..." is easier to pick out.

Cheers,
Yingjie
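The `format_join` idea in this message can be approximated today as a plain function. The sketch below is an assumption-laden illustration, not an existing `str` method: `format_join` is a hypothetical free function, and it relies on `%`-style formatting for each string/object pair.

```python
from math import sin


def format_join(parts):
    """Hypothetical helper: even-indexed items are %-format strings,
    and each one formats the object that follows it in the list."""
    pieces = []
    for i in range(0, len(parts), 2):
        fmt = parts[i]
        if i + 1 < len(parts):
            # Wrap in a 1-tuple so that tuple or dict objects are
            # formatted as a single value.
            pieces.append(fmt % (parts[i + 1],))
        else:
            # Trailing string with no object: still apply % so that
            # '%%' escapes are handled consistently.
            pieces.append(fmt % ())
    return "".join(pieces)


x = 0
result = format_join(["sin(%s", x, ") = %0.3f", sin(x)])
print(result)
```

All pieces are formatted first and joined once at the end, which is the performance argument made for the proposal.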
Re: string interpolation for python
> like this one ?
>
> b = dict(name="Sue", job="SAS sharp-shooter")
> print "$b['name']$ works as $b['job']$"
>
> Is it really easier to read than the following ?
> "{0} works as {1}".format(b['name'],b['job'])
>
> In the case in which b is an object having "job" and "name"
> attribute, the dynamic string will write
>
> "$b.name$ works as $b.job$"
> instead of
> "{0.name} works as {0.job}".format(b)
>

When you already have a dict, dict-based formatting would be nice.

    >>> "%(name)s works as %(job)s"%b

If you need to create a dict just for string formatting, a dynamic string would be nice. Say your object has methods/properties that fetch things from your database.

    >>> class staff:
    ...     @property
    ...     def name(self): return 'Peter'
    ...
    >>> t = staff()
    >>> vars(t)
    {}
    >>> t.name
    'Peter'
    >>> d"Staff name: $t.name$" #note the d"..." format
    'Staff name: Peter'

Because of the d"..." format, it won't affect the old ways of doing things one bit. Allowing dynamic strings wouldn't hurt anything that is already there.
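The existing alternatives discussed in this message can be compared side by side. This is a minimal sketch: the `Staff` class and its `job` property are illustrative stand-ins (the message only defines `name`), and the hard-coded return values stand in for database lookups.

```python
class Staff:
    """Illustrative object whose data lives in properties, not in __dict__."""

    @property
    def name(self):
        return "Peter"  # stands in for a database lookup

    @property
    def job(self):
        return "SAS sharp-shooter"  # hypothetical second property


t = Staff()
b = dict(name="Sue", job="SAS sharp-shooter")

# Attribute access inside a replacement field: note {0.name}, not {0}.name
by_attribute = "{0.name} works as {0.job}".format(t)

# Dict-based %-formatting and item access inside a replacement field
by_key = "%(name)s works as %(job)s" % b
by_item = "{0[name]} works as {0[job]}".format(b)

print(by_attribute)
print(by_key)
print(by_item)
```

Since `vars(t)` is empty for an object like this, the `%(name)s` form cannot be fed from the object directly, which is the case the message argues for.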
Re: string interpolation for python
...
> That, by the way, is perhaps the biggest problem with this idea of
> dynamic strings: not that it is too powerful, but that it is TOO WEAK.
...
> and similar for both format() and Template.

It seems you misunderstood my notion of a dynamic string. Dynamic strings are expressions in disguise: the things between $...$ are plain old expressions (with optional formatting specifications). They are evaluated as if they were outside the dynamic string. We put them in there to kill two birds with one stone:

1) ease of reading;
2) place holding.

Cheers,
Yingjie
Re: string interpolation for python
> "Are you %(name)s" % locals() # or vars()

This partly solves the problem; however, you can't work with expressions inside, like:

    >>> d"sin($x$) = $sin(x)$"

Also, what if locals() or vars() does not contain the variable "x"? (x could be nonlocal or global.)

> It's more conservative than hostile. Here's some insight:
> http://www.boredomandlaziness.org/2011/02/status-quo-wins-stalemate.html
>

You are probably right... Is a harmless enhancement still possible?

> Personally, in isolation, the only part of your proposal I find
> /truly/ objectionable is the support for arbitrary expressions, since
> it would tend towards encouraging suboptimal factoring. But we also

I don't quite see that as a problem. The compiler (or translator, as you mentioned earlier) could easily turn

    d"sin($x$) = $sin(x)$"

into something like:

    ''.join(["sin(", str(x), ") = ", str(sin(x))])

which would be far more efficient than calling the format() method.

Cheers,
Yingjie
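The translation proposed in this message is ordinary Python once the `d"..."` literal is expanded by hand. A minimal sketch, with `x = 0` chosen for illustration and no claim about relative speed:

```python
from math import sin

x = 0

# Hand-written expansion of the proposed d"sin($x$) = $sin(x)$"
translated = "".join(["sin(", str(x), ") = ", str(sin(x))])

# The same text produced with %-formatting, for comparison
with_percent = "sin(%s) = %s" % (x, sin(x))

print(translated)
print(with_percent)
```

Both forms name `x` and `sin(x)` as normal expressions, so name lookup follows the usual scoping rules instead of depending on what `locals()` happens to contain.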
Re: string interpolation for python
> "Are you "+name+"?"
>
> That allows arbitrary expressions and everything.
>

To make that work for any type, you need:

    >>> "Are you " + str(name) + "?"

Another concern is performance.

You are absolutely right, they are equivalent in that both are expressions. Once people start to realize that dynamic strings are expressions, there is no magic in them any more. And allowing expressions in those dynamic strings would make sense, since they are of the same sort.

    >>> d"sin($x$) = $ sin(x):0.3f $"

is equivalent to the expression

    >>> "sin(%s"%x + ") = %0.3f"%sin(x)

Comparing the two, I would say the latter is more computer friendly while the former is more human friendly.

If the computed result is only to be used in formatting the string, it would be nice to save an assignment statement.

>> Almost as terse, but not as readable, especially
>> when there are many parts to substitute --
>> the coder and reader need to be careful
>> to make sure the sequence is correct.
>
> I quite like this notation, personally. It's convenient, and is
> supported (with variants) in quite a few C-derived languages (and, in
> spite of the massive syntactic differences, Python does have C
> heritage).

Sure, once you get used to something, the harder it was to learn, the harder it is to give up :). That's part of human nature, anyway.

>> Why is the Python community so
>> hostile to new things now?
>> Python has merits,
>> but it is far from being perfect.
>
> Hey now, no need to get defensive :) Thing is, it's up to you to
> demonstrate that your proposal justifies itself. You're proposing to
> create a massive backward-compatibility issue, so you need to prove
> that your new way of formatting strings is sufficiently awesome to be
> able to say "Well, you need Python 3.4+ to use this".
>

OK. I have put it out as is. I trust people know good things when they see them.

I would simply say: this new way is much simpler and much more powerful. And there are no security issues as long as you don't use the evil eval to evaluate expressions, which is always a security issue.

It is new, and has no compatibility issues with the old ways at all. In syntax, all you need is to allow d"...", which clearly won't affect any of the old ways of doing business.

Cheers,
Yingjie
Re: string interpolation for python
> Actually, this sounds like a job for a precompiler/preprocessor. Do
> whatever translations you want on your code, then turn it into a .py
> file for execution. But hardly necessary, as there are already two -
> err, three, I stand corrected - perfectly good ways to do it.

Agree and disagree. The other ways are not perfectly good. They stink in Python. This new way is the most natural way. Effortless, natural. That's my Python.

Cheers,
Yingjie
Re: string interpolation for python
----- Original Message -----
> From: Steven D'Aprano
> To: python-list@python.org
> Cc:
> Sent: Monday, April 2, 2012 4:26 PM
> Subject: Re: string interpolation for python
>
> On Mon, 02 Apr 2012 00:39:42 -0700, Yingjie Lan wrote:
>
>>> You can already do essentially that without adding a special-case
>>> string formatting method to the general methods we already have.
>>>
>>>     >>> balls = 5
>>>     >>> people = 3
>>>     >>> 'The {people} people have {balls} balls.'.format(**locals())
>>>     'The 3 people have 5 balls.'
>>
>> Clearly dynamic strings are much more powerful, allowing arbitrary
>> expressions inside.
>
> And so it may be a security risk, if user-input somehow ends up treated
> as a dynamic string.
>
> We already have three ways to evaluate arbitrary expressions:
>
> * Python code
> * eval
> * exec
>
> Why do we need yet another one?
>
>> It is also more terse and readable, since we need no
>> dictionary.
>
> I think you mean terse and unreadable, since we need no dictionary. That
> means that variables will be evaluated by magic from... where? Globals?
> Local scope? Non-local scope? All of the above?
>
> We already have one way of evaluating implicit variables using implicit
> rules, namely regular Python code. Why do we need a second one?
>
>> I would probably rather liken dynamic expressions as a little brother of
>> computable documents in Mathematica. It is a new kind of expression,
>> rather than formatting -- though it has formatting connections.
>
> Why do we need a new kind of expression?
>
>> Dynamic strings are mainly useful at time of writing readable code
>> before compilation.
>
> What does that mean?
>
>> The compiler can choose to convert it into a string
>> formatting expression, of course. To efficiently format strings at
>> runtime, the best choice (especially for safety reasons) is string
>> formatting, not evaluating a dynamic string.
>
> So you're suggesting that we should have dynamic strings, but not
> actually use dynamic strings. The compiler should just convert them to
> regular string formatting.
>
> Why not cut out the middle-man and just use regular string formatting?
>

I believe none of the other three alternatives is as terse and readable. We've got template-based formatting, formatting with a dict, and formatting with a tuple. They all require extra effort from the coder.

Both template-based and dict-based formatting require writing the identifier three times:

    >>> name = 'Peter'
    >>> "Are you %(name)s"%{'name':name}

If a dynamic string is used:

    >>> "Are you $name$?"

Template:

    >>> Template("Are you $name?").substitute(name=name)

It is three to one in compactness, what a magic 3!

Of course, there is the old C-style way:

    >>> "Are you %s?"%name

Almost as terse, but not as readable, especially when there are many parts to substitute -- the coder and reader need to be careful to make sure the sequence is correct.

Why is the Python community so hostile to new things now? Python has merits, but it is far from being perfect.

Cheers,
Yingjie
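The alternatives compared in this message can be run together to see how often the identifier has to be spelled. A minimal sketch; the `with_*` variable names are introduced here for illustration only.

```python
from string import Template

name = "Peter"

# Dict-based %-formatting: 'name' is spelled three times
with_dict = "Are you %(name)s?" % {"name": name}

# string.Template: also three times
with_template = Template("Are you $name?").substitute(name=name)

# Keyword form of str.format: three times as well
with_format = "Are you {name}?".format(name=name)

# Tuple-style %-formatting: once, but position-dependent
with_tuple = "Are you %s?" % name

print(with_dict, with_template, with_format, with_tuple, sep="\n")
```

All four produce the same text; they differ only in how much bookkeeping the call site carries.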
Re: string interpolation for python
> Python already has *3* different built-in string
> formatting/interpolation systems:
...
> I would surmise that your key "implicitly grab variable values from
> the enclosing scope" feature has previously been rejected for being
> too magical.

It grabs because it is an expression in disguise (not a literal). If we could make that clear in the way we write it, it seems the magic level should reduce to a tolerable (or even likable) level.

Cheers,
Yingjie
Re: string interpolation for python
> You can already do essentially that without adding a special-case string
> formatting method to the general methods we already have.
>
>     >>> balls = 5
>     >>> people = 3
>     >>> 'The {people} people have {balls} balls.'.format(**locals())
>     'The 3 people have 5 balls.'

Clearly dynamic strings are much more powerful, allowing arbitrary expressions inside. They are also more terse and readable, since we need no dictionary.

I would rather liken dynamic expressions to a little brother of computable documents in Mathematica. It is a new kind of expression, rather than formatting -- though it has formatting connections.

Dynamic strings are mainly useful when writing readable code, before compilation. The compiler can choose to convert them into a string formatting expression, of course. To efficiently format strings at runtime, the best choice (especially for safety reasons) is string formatting, not evaluating a dynamic string.

On the implementation, I would suppose new syntax is needed (though very little).

Yingjie
Re: string interpolation for python
Hi Adrian, see my comments below.

> From: Adrian Hunt
...
> It could break old code... okay you may say you shouldn't allow
> certain characters but if they're printable and used in a controlled
> environment those characters can dramatically increase the security
> of a username and password.

What you said would make a lot of sense to me if strings were interpolated *automatically*. But they won't be, and shouldn't be.

They are called "dynamic strings". Dynamic strings can achieve formatting, but the mechanism under the hood differs dramatically from that of common strings.

Many here didn't realize that this is not another formatting proposal; it is a new kind of *expression*.

To have it in Python, we will need a new kind of syntax to distinguish it from other strings, such as raw strings and the like. A raw string looks like:

    >>> r'my\\ raw str'
    'my\\\\ raw str'

A dynamic string may look like this:

    >>> name = "Peter" #common string
    >>> d"Thank you, $name$!" #dynamic string!
    'Thank you, Peter!'

The following example should make it feel a lot safer (suppose a = raw_input()):

    >>> a = 'd"Are you $name$?"'
    >>> print(a)
    d"Are you $name$?"
    >>> eval('d"Are you $name$?"')
    'Are you Peter?'
    >>> d"It contains $len(_.split())$ words!"
    'It contains 3 words!'

An interesting question might be: what if a dynamic string refers to another dynamic string, which in turn refers back to the former?

The answer is: no variable can hold a dynamic string itself, only its result, which can only be a common string.

However, an infinite recursion may occur if the eval function is used inside:

    >>> a = 'd"$eval(a)$"'
    >>> eval(a)

This is just to show that a dynamic string is really an expression in disguise. As with evaluating any expression containing function calls, there is a risk of getting into infinite recursion.

Cheers,
Yingjie
string interpolation for python
Hi all,

I'd really like to share this idea of string interpolation for formatting. Let's start with some code:

    >>> name = "Shrek"
    >>> print( "Hi, $name$!")
    Hi, Shrek!
    >>> balls = 30
    >>> print( "We have $balls$ balls.")
    We have 30 balls.
    >>> persons = 5
    >>> print ("And $persons$ persons.")
    And 5 persons.
    >>> print(" Each may get exactly $balls//persons$ balls.")
     Each may get exactly 6 balls.
    >>> fraction = 0.12345
    >>> print( "The fraction is $fraction : 0.2f$!")
    The fraction is 0.12!
    >>> print("It costs about $$3, I think.")
    It costs about $3, I think.

I think the rule is quite self-explanatory. An expression is always enclosed between two '$'s, with an optional ':' followed by an additional formatting specification. A double '$$' is like a double '%%'.

Of course, '$' can be replaced by anything else. For compatibility reasons, we might just use '%' as well.

Implementation: the compiler might need to do something, say, replace a string interpolation with an equivalent expression.

Cheers,
Yingjie
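The rule described above can be prototyped as a function to try the examples out. This is only a sketch under stated assumptions, not the compiler-level feature the message proposes: `interpolate` and its `namespace` parameter are hypothetical, it relies on `eval()` and so must never be given untrusted templates, and it guesses that the text after the last ':' is a format specification only when the whole body fails to compile as an expression.

```python
import re

# Either the '$$' escape, or an expression enclosed between two '$'.
_TOKEN = re.compile(r"\$\$|\$([^$]+)\$")


def interpolate(template, namespace):
    """Expand $expr$ and $expr : spec$ fields using names from namespace.

    Uses eval(), so it is only suitable for trusted templates.
    """
    def replace(match):
        body = match.group(1)
        if body is None:  # matched the '$$' escape
            return "$"
        expr, spec = body.strip(), ""
        try:
            code = compile(expr, "<dynamic string>", "eval")
        except SyntaxError:
            # Assumption: what follows the last ':' is a format spec.
            expr, _, spec = expr.rpartition(":")
            code = compile(expr.strip(), "<dynamic string>", "eval")
        value = eval(code, dict(namespace))
        return format(value, spec.strip())

    return _TOKEN.sub(replace, template)


names = {"name": "Shrek", "balls": 30, "persons": 5, "fraction": 0.12345}
print(interpolate("Hi, $name$!", names))
print(interpolate("Each may get exactly $balls//persons$ balls.", names))
print(interpolate("The fraction is $fraction : 0.2f$!", names))
print(interpolate("It costs about $$3, I think.", names))
```

Passing the namespace explicitly sidesteps the question of which scope the names come from, which a language-level implementation would resolve with the normal lookup rules.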
the deceptive continuous assignments
Hi,

I just figured this out with Python 3.2 IDLE:

    >>> class k: pass
    >>> x=k()
    >>> x.thing = 1
    >>> x.thing
    1
    >>> x = x.thing = 1
    Traceback (most recent call last):
      File "", line 1, in <module>
        x = x.thing = 1
    AttributeError: 'int' object has no attribute 'thing'
    >>> x
    1
    >>>

When I do x=x.thing=1, I thought it would be like in C: 1 is first assigned to x.thing, then it is further assigned to x. But what seems to be going on here is that 1 is first assigned to x, then to x.thing (which causes an error).

Is there any reason why Python deviates from C in this regard?

Thanks!

Yingjie
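The behaviour reported above follows from Python assigning the right-hand value to each target from left to right. The sketch below reproduces the error and shows the reversed target order working; the names `K`, `holder`, and `error` are introduced here for illustration.

```python
class K:
    pass


x = K()
error = None
try:
    # Targets are assigned left to right: x is rebound to 1 first,
    # then x.thing = 1 is attempted on the int.
    x = x.thing = 1
except AttributeError as exc:
    error = str(exc)

y = K()
holder = y           # keep a second reference to the instance
y.thing = y = 1      # attribute is set on the instance before y is rebound

print(x, error)
print(holder.thing, y)
```

Writing the attribute target first gives the effect the poster expected from C.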
Re: revive a generator
----- Original Message -----
> From: Chris Angelico
> To: python-list@python.org
> Cc:
> Sent: Friday, October 21, 2011 4:27 PM
> Subject: Re: revive a generator
>
> On Fri, Oct 21, 2011 at 7:02 PM, Yingjie Lan wrote:
>> What if the generator involves a variable from another scope,
>> and before re-generating, the variable changed its value.
>> Also, the generator could be passed in as an argument,
>> so that we don't know its exact expression.
>>
>
> There's actually no way to know that the generator's even
> deterministic. Try this, for instance:
>
>     >>> g=(input("Enter value %d or blank to stop: "%n) for n in range(1,11))
>     >>> for s in g:
>             if not s: break
>             print("Processing input: "+s)
>
> It may not be particularly useful, but it's certainly legal. And this
> generator cannot viably be restarted.

Depends on what you want. If you want ten more inputs from the user, reviving this generator is certainly a good thing to do.

> The only way is to cast it to
> list first, but that doesn't work when you have to stop reading
> expressions from the generator part way.
>
> What you could perhaps do is wrap the generator in something that
> saves its values:
>
>     >>> class restartable(object):
>             def __init__(self,gen):
>                 self.gen=gen
>                 self.yielded=[]
>                 self.iter=iter(self.yielded)
>             def restart(self):
>                 self.iter=iter(self.yielded)
>             def __iter__(self):
>                 return self
>             def __next__(self): # Remove the underscores for Python 2
>                 try:
>                     return self.iter.__next__()
>                 except StopIteration:
>                     pass
>                 ret=self.gen.__next__()
>                 self.yielded.append(ret)
>                 return ret
>
>     >>> h=restartable(g)
>     >>> for i in h:
>             if not i: break
>             print("Using: ",i)
>     >>> h.restart()
>     >>> for i in h:
>             if not i: break
>             print("Using: ",i)
>
> Complicated, but what this does is returns a value from its saved list
> if there is one, otherwise returns a value from the original
> generator. It can be restarted as many times as necessary, and any
> time you read "past the end" of where you've read so far, the original
> generator will be called upon.
>
> Actually, this model might be useful for a repeatable random-number
> generator. But those are more efficiently restarted by means of
> reseeding the PRNG.
>

Sure. Or you would like to have the next few random numbers from the same PRNG.

These two cases seem to be strong use cases for reviving a generator.

Yingjie
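The caching wrapper quoted in this message can also be written with an explicit replay position instead of a second iterator. This is a variant sketch in Python 3, not the quoted code itself: the class name `Restartable` and the `_seen`/`_pos` attributes are choices made here.

```python
class Restartable:
    """Iterator wrapper that records yielded values so they can be replayed."""

    def __init__(self, source):
        self._source = iter(source)  # underlying one-shot iterator
        self._seen = []              # values produced so far
        self._pos = 0                # replay position within _seen

    def restart(self):
        """Rewind to the first recorded value."""
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos < len(self._seen):
            value = self._seen[self._pos]      # replay a saved value
        else:
            value = next(self._source)         # StopIteration propagates
            self._seen.append(value)
        self._pos += 1
        return value


h = Restartable(x * x for x in range(3))
first_pass = list(h)
h.restart()
second_pass = list(h)
print(first_pass, second_pass)
```

Like the quoted version, this replays the recorded values rather than recomputing them, so it trades memory for repeatability and does not pick up changes to variables the generator depends on.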
Re: revive a generator
----- Original Message -----
> From: Paul Rudin
>
> I'm not really sure whether you intend g to yield the original values
> after your "revive" or new values based on the new value of vo. But
> still you can make a class that supports the iterator protocol and does
> whatever you want (but you can't use the generator expression syntax).
>
> If you want something along these lines you should probably read up on
> the .send() part of the generator protocol.
>
> As an aside you shouldn't really write code that uses a global in that
> way.. it'll end up biting you eventually.
>
> Anyway... we can speculate endlessly about non-existent language
> constructs, but I think we've probably done this one to death.
> --

Maybe no new language construct is needed: just define that x.send() revives a generator.

Yingjie
Re: revive a generator
----- Original Message -----
> From: Paul Rudin
> To: python-list@python.org
> Cc:
> Sent: Friday, October 21, 2011 3:27 PM
> Subject: Re: revive a generator
>
> The language has no explicit notion of a request to "revive" a
> generator. You could use the same syntax to make a new generator that
> yields the same values as the one you started with if that's what you
> want.
>
> As we've already discussed if you want to iterate several times over the
> same values then it probably makes sense to compute them and store them
> in e.g. a list (although there are always trade-offs between storage use
> and the cost of computing things again).
>

What if the generator involves a variable from another scope, and before re-generating, the variable has changed its value? Also, the generator could be passed in as an argument, so that we don't know its exact expression.

    >>> vo = 34
    >>> g = (vo*x for x in range(3))
    >>> def myfun(g):
            global vo
            for i in g: print(i)
            vo += 3
            revive(g) #best if revived automatically
            for i in g: print(i)
    >>> myfun(g)

Yingjie
Re: revive a generator
----- Original Message -----
> From: Paul Rudin
> The language has no explicit notion of a request to "revive" a
> generator. You could use the same syntax to make a new generator that
> yields the same values as the one you started with if that's what you
> want.
>
> As we've already discussed if you want to iterate several times over the
> same values then it probably makes sense to compute them and store them
> in e.g. a list (although there are always trade-offs between storage use
> and the cost of computing things again).
>

Oops, my former reply had the code indentation messed up by the mail system. Here is a reformatted one:

What if the generator involves a variable from another scope, and before re-generating, the variable has changed its value? Also, the generator could be passed in as an argument, so that we don't know its exact expression.

    >>> vo = 34
    >>> g = (vo*x for x in range(3))
    >>> def myfun(g):
            global vo
            for i in g: print(i)
            vo += 3
            revive(g) #best if revived automatically
            for i in g: print(i)
    >>> myfun(g)

Yingjie
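Since `revive()` does not exist, one way to get the effect sketched above with today's Python is to pass a callable that builds the generator, and call it again whenever a fresh pass is needed. This is an illustrative workaround, not the proposed feature: `make_gen`, `consume_twice`, and the `state` dict (used instead of a global, as the reply in this thread advises) are names assumed here.

```python
state = {"vo": 34}


def make_gen():
    # Builds a fresh generator on every call; state["vo"] is read
    # lazily, when each value is produced.
    return (state["vo"] * x for x in range(3))


def consume_twice(factory):
    """Iterate once, change the outer variable, then iterate again."""
    first = list(factory())
    state["vo"] += 3
    second = list(factory())   # "revived" by building a new generator
    return first, second


first, second = consume_twice(make_gen)
print(first, second)
```

The function receiving the factory does not need to know the generator's expression, which addresses the "passed in as an argument" concern.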
Re: revive a generator
> Here's an example of an explicit request to revive the generator:
>
>     >>> g = (x*x for x in range(3))
>     >>> for x in g: print(x)
>     0
>     1
>     4
>     >>> g = (x*x for x in range(3)) # revive the generator
>     >>> for x in g: print(x) #now this will work
>     0
>     1
>     4
>
> ChrisA

What if the generator is passed in as an argument when you are writing a function? That is, the expression is not available.

Secondly, it would be nice to revive it automatically, for example when another for-statement or other equivalent is applied to it.

Yingjie
Re: compare range objects
----- Original Message -----
> From: Westley Martínez
> To: python-list@python.org
> Cc:
> Sent: Friday, October 21, 2011 12:22 AM
> Subject: Re: compare range objects
>
> There's already a discussion about this on python-ideas. But somebody
> please tell me, why would you ever need to compare ranges?

In simulation, one can use range objects to denote a discrete domain, and domain comparison could be very useful. Not just equality, but also things like whether one domain is contained in another.
Re: revive a generator
----- Original Message -----
> From: Paul Rudin
> To: python-list@python.org
> Cc:
> Sent: Thursday, October 20, 2011 10:28 PM
> Subject: Re: revive a generator
>
> Yingjie Lan writes:
>
>> Hi,
>>
>> it seems a generator expression can be used only once:
>>
>>     >>> g = (x*x for x in range(3))
>>     >>> for x in g: print(x)
>>     0
>>     1
>>     4
>>     >>> for x in g: print(x) #nothing printed
>>     >>>
>>
>> Is there any way to revive g here?
>>
>
> Generators are like that - you consume them until they run out of
> values. You could have done [x*x for x in range(3)] and then iterated
> over that list as many times as you wanted.
>
> A generator doesn't have to remember all the values it generates so it
> can be more memory efficient than a list. Also it can, for example,
> generate an infinite sequence.
>

Thanks a lot to all who answered my question. I am still not sure why we should enforce that a generator cannot be reused after an explicit request to revive it.
compare range objects
Hi,

Is it possible to test whether two range objects contain the same sequence of integers by the following algorithm in Python 3.2?

1. Standardize the ending bound by letting it be the first excluded integer for the given step size.
2. Compare the standardized starting bound, ending bound and step size: two ranges are equal if and only if this triplet is the same.

If that's correct, it would be good to have equality comparison on two ranges.

Further, it might also be good to have a sub-sequence test on ranges without enumerating them.

Cheers,
Yingjie
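The two-step algorithm above can be sketched as follows. Two caveats are assumptions of this sketch rather than part of the message: it relies on the `start` and `step` attributes of range objects, which are available from Python 3.3 (where `==` on ranges also compares them as sequences), and it special-cases empty and one-element ranges, for which the plain triplet comparison would report false differences. `normalized`, `ranges_equal`, and `is_subrange` are hypothetical helper names.

```python
def normalized(r):
    """Return a canonical (start, stop, step) triple for a range."""
    n = len(r)
    if n == 0:
        return (0, 0, 1)                    # all empty ranges are alike
    if n == 1:
        return (r.start, r.start + 1, 1)    # step is irrelevant for one element
    # Stop becomes the first excluded integer for this step size.
    return (r.start, r.start + n * r.step, r.step)


def ranges_equal(a, b):
    """True if both ranges produce the same sequence of integers."""
    return normalized(a) == normalized(b)


def is_subrange(a, b):
    """True if every integer in a is also in b (set containment),
    decided without enumerating either range."""
    if len(a) == 0:
        return True
    if len(a) == 1:
        return a[0] in b
    return a[0] in b and a[-1] in b and a.step % b.step == 0


print(ranges_equal(range(0, 10, 3), range(0, 12, 3)))
print(is_subrange(range(2, 10, 4), range(0, 20, 2)))
```

Membership tests such as `a[0] in b` are computed arithmetically for integers, so neither helper walks the ranges.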
revive a generator
Hi,

it seems a generator expression can be used only once:

    >>> g = (x*x for x in range(3))
    >>> for x in g: print(x)
    0
    1
    4
    >>> for x in g: print(x) #nothing printed
    >>>

Is there any way to revive g here?

Yingjie
Re: How to test if object is an integer?
----- Original Message -----
> From: Noah Hall
> To: MrPink
> Cc: python-list@python.org
> Sent: Tuesday, October 18, 2011 4:44 AM
> Subject: Re: How to test if object is an integer?
>
> There's the isdigit method, for example -
>
>     >>> s = "1324325"
>     >>> s.isdigit()
>     True
>     >>> s = "1232.34"
>     >>> s.isdigit()
>     False
>     >>> s = "I am a string, not an int!"
>     >>> s.isdigit()
>     False
>

There are some corner cases to be considered with this approach:

1. negative integers: '-3'
2. strings starting with '0': '03'
3. strings starting with one '+': '+3'
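One way to cover the corner cases listed above is to let `int()` do the parsing and catch the failure. A minimal sketch; `is_integer_string` is a hypothetical helper name, and note that `int()` is more permissive than `isdigit()` in other ways too (it accepts surrounding whitespace, for example), which may or may not be what a caller wants.

```python
def is_integer_string(text):
    """Return True if int() can parse text as a base-10 integer."""
    try:
        int(text)
    except ValueError:
        return False
    return True


for candidate in ["1324325", "-3", "03", "+3", "1232.34", "not an int"]:
    print(repr(candidate), candidate.isdigit(), is_integer_string(candidate))
```

The loop prints both checks side by side so the differences on signed and zero-padded input are visible.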
strange comparison result with 'is'
Hi all,

This is quite strange when I used the Python shell with IDLE:

    >>> x = []
    >>> id(getattr(x, 'pop')) == id(x.pop)
    True
    >>> getattr(x, 'pop') is x.pop
    False
    >>>

I supposed that since the two things have the same id, the 'is' test should give a True value, but I get a False value.

Is there any particular reason for breaking this test? I am quite confused, as I showed this before a large audience only to find that the result contradicted my prediction.

The Python version is 3.2.2; I am not sure about other versions.

Regards,
Yingjie
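A likely explanation, offered here as an editorial sketch rather than as part of the original message: each `x.pop` attribute access creates a new method object, and in the `id(...) == id(...)` line the first temporary can be freed before the second is created, so CPython may reuse the same address. Keeping both objects alive shows that they are distinct; the variable names below are introduced for illustration.

```python
x = []

first = x.pop     # each attribute access builds a new method object
second = x.pop

same_object = first is second             # two distinct objects
equal_methods = first == second           # same method bound to the same list
distinct_ids = id(first) != id(second)    # both are alive, so ids must differ

print(same_object, equal_methods, distinct_ids)
```

An `id()` value is only guaranteed unique among objects that exist at the same time, so comparing ids of temporaries is not a substitute for `is`.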
Re: [Python-ideas] allow line break at operators
Thanks, I think that's the rule described in its full glory.

Currently I am not quite sure of the use case for a continuation nested in a continuation -- it seems to be still a single continuation, but it allows for some additional freedom in formatting the continued line. Do you have other use cases for that?

For the case of mis-indentation, as demonstrated in your scenario, I think it is better that the rule is not applied to a comment continued onto the next line. The effect of a '#' only carries to the end of a line; if one would like the next line to be a comment, just use another '#'. It remains to consider a mis-indentation that only involves code lines. However, that is not a new problem, so we should not worry (it is like a sunk cost).

Sorry for top posting. It seems yahoo web email is not designed with such a use case in mind.

>From: MRAB
>To: python-list@python.org
>Sent: Sunday, September 4, 2011 10:04 AM
>Subject: Re: [Python-ideas] allow line break at operators
>
>On 04/09/2011 00:22, Yingjie Lan wrote:
>> I believe the other two ways are not as good as this new way. As the
>> proposal is fully backward compatible, people may choose whatever way
>> they prefer.
>>
>I think that the rules would be:
>
>If a line ends with a colon and the next line is indented, then it's
>the start of a block, and the following lines which belong to that
>block have the same indent.
>
>If a line doesn't end with a colon but the next line is indented, then
>it's the start of a continuation, and the following lines which belong
>to that continuation have the same indent.
>
>In both cases there could be blocks nested in blocks and possibly
>continuations nested in continuations, as well as blocks nested in
>continuations and continuations nested in blocks.
>
>I'm not sure what the effect would be if you had mis-indented lines.
>For example, if a line was accidentally indented after a comment, then
>it would be treated as part of the comment. It's in cases like those
>that syntax colouring would be helpful. It would be a good idea to use
>an editor which could indicate in some way when a line is a
>continuation.
Re: [Python-ideas] allow line break at operators
> Every language with blocks needs some mechanism to indicate the beginning and
> ending of blocks and of statements within blocks. If visible fences
> ('begin/end' or '{}') and statement terminators (';') are used, then '\n' can
> be treated as merely a space, as it is in C, for instance.

> and it uses unescaped '\n' (with two escapement options) to terminate
> statements. This is fundamental to Python's design and goes along with
> significant indents.

Agreed. Currently indentation in Python starts a new block, but if you view it from the perspective of line breaking, it also functions as if the line is continued. The line of code below

    if condition: do_a(); do_b()

can be written as:

    if condition: #line breaks
        do_a(); # ';' is optional here
        do_b() # continue

That indentation can also be employed for line breaking is quite evident to me. During the open email correspondence with Stephen, it seems to be a tenable point.

> There would then be three ways to escape newline, with one doing double duty.
> And for what? Merely to avoid using either of the two methods already
> available.

I believe the other two ways are not as good as this new way. As the proposal is fully backward compatible, people may choose whatever way they prefer.

>From: Terry Reedy
>To: python-list@python.org
>Cc: python-id...@python.org
>Sent: Sunday, September 4, 2011 3:01 AM
>Subject: Re: [Python-ideas] allow line break at operators
>
>On 9/3/2011 3:51 AM, Yingjie Lan wrote:
>> I agree that long lines of code are not very common in many projects,
>> though it might be the case with some heavily involved in math. For some
>> reason, when the feature of free line breaking came about in computer
>> languages, it is welcomed and generally well accepted.
>
>Every language with blocks needs some mechanism to indicate the beginning and
>ending of blocks and of statements within blocks. If visible fences
>('begin/end' or '{}') and statement terminators (';') are used, then '\n' can
>be treated as merely a space, as it is in C, for instance.
>
>> Python uses indentation for blocks,
>
>and it uses unescaped '\n' (with two escapement options) to terminate
>statements. This is fundamental to Python's design and goes along with
>significant indents.
>
>> and by the same mechanism, line breaking can be
>> accommodated without requiring parenthesis or ending backslashes.
>
>You need proof for your claim that indentation can be used for both jobs in
>the form of a grammar that works with Python's parser. I am dubious that you
>can do that with an indent *after* the newline.
>
>Even if you could, it would be confusing for human readers. There would then
>be three ways to escape newline, with one doing double duty. And for what?
>Merely to avoid using either of the two methods already available.
>
>-- Terry Jan Reedy
Re: [Python-ideas] allow line break at operators
Oops, the generating text part of my reply is referring to your last code example. Literal texts are not governed by this proposal, nor are expressions within brackets or backslash-continued lines. In a word, this proposal is fully backward compatible.

>From: Yingjie Lan
>To: Stephen J. Turnbull
>Cc: python list ; Gabriel AHTUNE ; python-ideas ; Matt Joiner
>Sent: Saturday, September 3, 2011 6:33 PM
>Subject: Re: [Python-ideas] allow line break at operators
>
>Ambiguity: yes, when the last line of a suite is a continued line, it would
>require double dedentations to end the line and the suite. I noticed a similar
>case in current Python language as well:
>
>==
>#BEGIN CODE 1
>if condition:
>    for i in range(5):
>        triangulate(i)
>else: #double dedentations
>    for body in space:
>        triangulate(body)
>#double dedentations again
>log('triangulation done')
>#END CODE 1
>==
>
>If lines can be continued by indentation, similar situation would rise:
>
>==
>#BEGIN CODE 2
>if condition:
>    result = [sin(i) for i in range(5)]
>        + [cos(i) for i in range(5)]
>else:
>    result = [cos(i) for i in range(5)]
>        + [sin(i) for i in range(5)]
>
>log('triangulation done')
>#END CODE 2
>==
>
>Generating text example: right, this is a case that can't be handled by
>standard indentation, unless we only consider full dedentation (dedentation to
>the exact level of the initial indentation) as the signal of ending the line.
>Whether to accommodate for such a case might be an issue of debate, but at
>least we can have such 'freedom' :)
Re: [Python-ideas] allow line break at operators
Ambiguity: yes, when the last line of a suite is a continued line, it would require double dedentations to end the line and the suite. I noticed a similar case in the current Python language as well:

==
#BEGIN CODE 1
if condition:
    for i in range(5):
        triangulate(i)
else: #double dedentations
    for body in space:
        triangulate(body)
#double dedentations again
log('triangulation done')
#END CODE 1
==

If lines can be continued by indentation, a similar situation would arise:

==
#BEGIN CODE 2
if condition:
    result = [sin(i) for i in range(5)]
        + [cos(i) for i in range(5)]
else:
    result = [cos(i) for i in range(5)]
        + [sin(i) for i in range(5)]

log('triangulation done')
#END CODE 2
==

Generating text example: right, this is a case that can't be handled by standard indentation, unless we only consider full dedentation (dedentation to the exact level of the initial indentation) as the signal of ending the line. Whether to accommodate such a case might be an issue of debate, but at least we can have such 'freedom' :)

>From: Stephen J. Turnbull
>To: Yingjie Lan
>Cc: Gabriel AHTUNE ; Matt Joiner ; python-ideas
>Sent: Saturday, September 3, 2011 5:29 PM
>Subject: Re: [Python-ideas] allow line break at operators
>
>Yingjie Lan writes:
>
>> Python uses indentation for blocks, and by the same mechanism, line
>> breaking can be accommodated without requiring parenthesis or
>> ending backslashes.
>
>Possibly, but now you have a problem that a dedent has ambiguous
>meaning. It might mean that you're ending a suite, or it might mean
>you're ending a continued expression. This probably can be
>disambiguated, but I don't know how easy that will be to do perfectly,
>including in reporting ill-formed programs.
>
>> Most people seems to like an indentation on the continuing lines,
>
>Most of the time, yes, but sometimes not. For example, in generating
>text, it's often useful to dedent substantially so you can have a
>nearly normal length line in the literal strings being concatenated.
>Or you might have a pattern like this:
>
> x = long_named_variable_a
> - long_named_variable_a_base
> + long_named_variable_b
> - long_named_variable_b_base
>
>which your parser would raise an error on, I presume. That's not
>freedom!
Re: [Python-ideas] allow line break at operators
I agree that long lines of code are not very common in many projects, though it might be the case with some heavily involved in math. For some reason, when the feature of free line breaking came about in computer languages, it was welcomed and generally well accepted. Python uses indentation for blocks, and by the same mechanism, line breaking can be accommodated without requiring parentheses or ending backslashes.

For the tuning, yes, people would disagree on how to split expressions/code. Continue-by-indentation would allow people to break a line in whatever way pleases their aesthetic taste, as long as there is an indentation. Most people seem to like an indentation on the continuing lines, probably as a visual indication of a continuation line. Some general guidelines may be provided, but there is no need for other hard rules on breaking lines, except that an identifier should never be split apart.

For the implementation, I don't have much of a clue. At least the parser needs to look beyond the linefeed to determine whether a line is completed. If the indentation is defined as a single symbol, then it would only require a one-step look-ahead, and that should not be hard.

Again, my apology for top posting.

>From: Stephen J. Turnbull
>To: Yingjie Lan
>Cc: Gabriel AHTUNE ; Matt Joiner ; "python-list@python.org" ; python-ideas
>Sent: Saturday, September 3, 2011 2:10 PM
>Subject: Re: [Python-ideas] allow line break at operators
>
>Yingjie Lan writes:
>
>> Have you considered line continuation by indentation? It seems to
>> meet the design principle. I think it is the most natural way to
>> allow free line breaking in Python.
>
>Briefly, yes, and I think it would need a lot of tuning and probably
>complex rules. Unlike statements, where everybody (except the judges
>of the Obfuscated C Contest) agrees on a simple rule: "In a control
>structure, the controlled suite should be uniformly indented one
>level", line breaking and indentation of long expressions is an art,
>and people have different opinions on "readability" and "beauty."
>Achieving a compromise that is workable even for a few major styles is
>likely to be annoying and bug-prone.
>
>Pretty much every program I write seems to have a continued list of
>data or a multi-line dictionary display as data. It's not unusual for
>me to comment the formal arguments in a function definition, or the
>parent classes of a class definition. The exception for parenthesized
>objects is something I depend on for what I consider good style. Of
>course I could use explicit continuation, but in a long table that's
>ugly and error-prone.
>
>Long expressions that need to be broken across lines, on the other
>hand, are often an indication that I haven't thought carefully enough
>about that component of the program, and an extra pair of parentheses or a
>terminal backslash just isn't that "heavy" or ugly in the context of
>such long expressions. For me, they're also pretty rare; many
>programs I write have no explicit continuations in them at all.
>
>YMMV, of course, but I find the compromise that Python arrived at to
>be very useful, and I must suppose that it was substantially easier to
>implement than "fully free" line breaking (whatever that means to you).
Re: [Python-ideas] allow line break at operators
Have you considered line continuation by indentation? It seems to meet the design principle. I think it is the most natural way to allow free line breaking in Python.

(Sorry, the yahoo web email interface is so weird that I don't know how to format comments between the quoted message below.)

>From: Stephen J. Turnbull
>To: Gabriel AHTUNE
>Cc: Matt Joiner ; "python-list@python.org" ; python-ideas ; Yingjie Lan
>Sent: Friday, September 2, 2011 3:28 PM
>Subject: Re: [Python-ideas] allow line break at operators
>
>Gabriel AHTUNE writes:
>> So can be done with this syntax:
>>
>> > x = firstpart * secondpart + #line breaks here
>> > anotherpart + #continue
>> > stillanother #continue on.
>>
>> after a "+" operator the line is clearly not finished yet.
>
>Sure, but IIRC one design principle of Python is that the keyword that
>denotes the syntax should be the first thing on the line, making it
>easy to scan down the left side of the code to see the syntactic
>structure. The required indentation of the controlled suite also
>helps emphasize that keyword.
>
>Analogously, if operators are going to denote continuation, they
>should come first on the line.
>
>I just don't think this idea is going anywhere. Explicit continuation
>with backslash or implicit continuation of parenthesized expressions
>is just not that heavy a price to pay. Perhaps historically some of
>these ideas could have been implemented, but now they're just going to
>confuse a host of editors and code analysis tools.
Re: [Python-ideas] allow line break at operators
Yeah, that might be a challenge for the Python interpreter, for it has to check whether the next line is indented or not. But it might be worthwhile to take this trouble, so that the coder has more freedom, and the code is hopefully easier to read.

From: Matt Joiner
To: Yingjie Lan
Cc: "python-list@python.org" ; python-ideas
Sent: Friday, September 2, 2011 1:33 PM
Subject: Re: [Python-ideas] allow line break at operators

I guess the issue here is that you can't tell if an expression is
complete without checking the indent of the following line. This is
likely not desirable.

On Thu, Sep 1, 2011 at 11:43 PM, Yingjie Lan wrote:
> Hi Matt,
> ===
> From: Matt Joiner
>
> The "trailing \" workaround is nonobvious. Wrapping in () is noisy and
> already heavily used by other syntactical structures.
> ===
> How about only require indentation
> to freely break lines? Here is an example:
> x = firstpart * secondpart #line breaks here
>     + anotherpart #continue by indentation
>     + stillanother #continue on.
> #until here, another line starts by dedentation
> y = some_expression - another_one
> All this would be completely compatible with former code, while
> having almost free line breaking! Plus, indentation makes it pretty.
> Really hope Python can have freedom in breaking lines.
> Yingjie
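To make the look-ahead issue a little more concrete, here is a small sketch using the standard `tokenize` module to show how current Python ends a logical line before it ever sees the next line's indentation. The helper name `token_names` is made up for this illustration; the token kinds themselves come from the standard library.

```python
import io
import tokenize


def token_names(source):
    """Return the token type names the tokenizer produces for 'source'."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]


# A block: the logical line ends (NEWLINE) and only then INDENT is reported.
block = token_names("if c:\n    a()\nb()\n")

# A parenthesized expression: the inner line break is a non-logical NL token,
# and the indentation of the continuation line is ignored.
wrapped = token_names("x = (1 +\n     2)\n")
```

In `block`, the first `NEWLINE` token comes directly before `INDENT`, so a continue-by-indentation rule would have to revisit a statement that has already been terminated. In `wrapped` there is no `INDENT` token at all.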
Re: [Python-ideas] allow line break at operators
Hi Gabriel,

==
From: Gabriel AHTUNE
Subject: Re: [Python-ideas] allow line break at operators

So can be done with this syntax:

> x = firstpart * secondpart + #line breaks here
> anotherpart + #continue
> stillanother #continue on.

after a "+" operator the line is clearly not finished yet.

Gabriel AHTUNE
==

That's good to me too; it is what I proposed early in this thread. Then somebody would like to have the operator at the beginning of the next line so that it would stand out. Then still another one said that indentation is good here. So I finally proposed line continuation with indentation. Since this is Python, we will live with indentation.

Currently indentation in Python starts a new block, but if you view it from the perspective of line breaking, it also functions as if the line is continued. The line

    if condition: do_a(); do_b()

can be written as:

    if condition: #line breaks
        do_a(); # ';' is optional here
        do_b() # continue

Sounds like a pretty natural way to allow free line breaking.

Yingjie
Re: [Python-ideas] allow line break at operators
Hi Matt,

===
From: Matt Joiner

The "trailing \" workaround is nonobvious. Wrapping in () is noisy and already heavily used by other syntactical structures.
===

How about only requiring indentation to freely break lines? Here is an example:

    x = firstpart * secondpart #line breaks here
        + anotherpart #continue by indentation
        + stillanother #continue on.
    #until here, another line starts by dedentation
    y = some_expression - another_one

All this would be completely compatible with former code, while having almost free line breaking! Plus, indentation makes it pretty.

Really hope Python can have freedom in breaking lines.

Yingjie
Re: allow line break at operators
From: Vito 'ZeD' De Tullio

:umm... besides "notepad" pretty much any other serious "programmer editor"
:program try to do its best to deal with line wrap: the minimal I found is
:the wrapped line is "indented" at the same level of the flow, but I found
:editors where you can specify what to do (generally something like "indent
:the wrapped part 2 levels" or something like that)

Well, even if one editor can do perfect line wrapping, breaking the line at places perfectly pleasing to the eye (putting aside the fact that sometimes the line breaking could be based on the meaning of the code), it is unlikely to be consistent across editors.

Yingjie
Re: allow line break at operators
> :The trouble of dealing with long lines can be avoided by a smart
> :editor. It's called line wrap.
>
> Yeah, usually they just wrap it pretty arbitrarily,
> and you don't have any control, isn't it?

:umm... besides "notepad" pretty much any other serious "programmer editor"
:program try to do its best to deal with line wrap: the minimal I found is
:the wrapped line is "indented" at the same level of the flow, but I found
:editors where you can specify what to do (generally something like "indent
:the wrapped part 2 levels" or something like that)

Thanks for sharing that; I was not aware of it. BTW, do you think things like eclipse, emacs and vim also have that kind of functionality? Best of all, I would certainly like IDLE to have it, as I am teaching Python and would like to get my students to start with a simple environment.

Yingjie
Re: allow line break at operators
From: Chris Rebert
To: Yingjie Lan
Cc: "python-list@python.org"
Sent: Thursday, August 11, 2011 3:50 PM
Subject: Re: allow line break at operators

On Thu, Aug 11, 2011 at 12:24 AM, Yingjie Lan wrote:
> From: Steven D'Aprano
> On Thu, 11 Aug 2011 12:52 pm Yingjie Lan wrote:
>
>> :And if we require {} then truly free indentation should be OK too! But
>> :it wouldn't be Python any more.
>>
>> Of course, but not the case with ';'. Currently ';' is optional in Python,
>> But '{' is used for dicts. Clearly, ';' and '{' are different in
>> magnitude.
>>
>> So the decision is: shall we change ';' from optional to mandatory
>> to allow free line splitting?
>
> :Why on earth would you want to break backwards compatibility of millions of
> :Python scripts and programs, and require extra, unnecessary line-noise on
> :every single line of Python code, just so that you can occasionally avoid a
> :writing a pair of parentheses?
>
> I think allowing free line splitting (without parentheses -- that's
> artificial and
> requires the coder to serve the machine rather than the other way around)
> with proper indentation will produce truly ergonomic code layout (well,
> assuming you also like properly indented code).
>
> And this can be done almost hassle-free for the coder.
> The trouble of adding a ';' to most of the lines can also be
> avoided by a smart editor (see my other reply).

:The trouble of dealing with long lines can be avoided by a smart
:editor. It's called line wrap.

Yeah, usually they just wrap it pretty arbitrarily, and you don't have any control, isn't it?

Cheers,
Yingjie
Re: allow line break at operators
From: Steven D'Aprano
To: python-list@python.org
Sent: Thursday, August 11, 2011 12:18 PM
Subject: Re: allow line break at operators

On Thu, 11 Aug 2011 12:52 pm Yingjie Lan wrote:

> :And if we require {} then truly free indentation should be OK too! But
> :it wouldn't be Python any more.
>
> Of course, but not the case with ';'. Currently ';' is optional in Python,
> But '{' is used for dicts. Clearly, ';' and '{' are different in
> magnitude.
>
> So the decision is: shall we change ';' from optional to mandatory
> to allow free line splitting?

:Why on earth would you want to break backwards compatibility of millions of
:Python scripts and programs, and require extra, unnecessary line-noise on
:every single line of Python code, just so that you can occasionally avoid a
:writing a pair of parentheses?

I think allowing free line splitting (without parentheses -- that's artificial and requires the coder to serve the machine rather than the other way around) with proper indentation will produce truly ergonomic code layout (well, assuming you also like properly indented code).

And this can be done almost hassle-free for the coder. The compatibility problem can be solved by something like a preprocessor indicator to opt in to this new language feature, or, if we are determined to favor this new feature, all old code can be easily converted. The worst case is that the coder has to opt in. The trouble of adding a ';' to most of the lines can also be avoided by a smart editor (see my other reply).

Python serves the coder, not the other way around.

Cheers,
Yingjie
Re: allow line break at operators
From: Michael Trausch
To: Yingjie Lan
Cc: Chris Angelico ; "python-list@python.org"
Sent: Thursday, August 11, 2011 12:51 PM
Subject: Re: allow line break at operators

> Perhaps it could be made an optional thing to enable; for example, some
> languages by default do dynamic typing, but with an option contained as the
> first statement of the file can enforce static typing.

That is a brilliant idea! Python code can specify its encoding at the beginning; we might use another similar line to opt in to that kind of language feature.

Once in that ';'-required mode, the trouble of typing a ';' at the end of almost every line can be easily avoided by a smart editor:

1) When a single 'Return' key is hit, and the line does not end in ':' or ';' (white space and comments discarded), automatically append a ';' to the end of the line.
2) To continue the line onto the next line, hit "Shift+Enter"; then no ';' will be appended to the line.

Cheers,
Yingjie
Re: allow line break at operators
On Wed, Aug 10, 2011 at 1:58 PM, Yingjie Lan wrote:
> Is it possible for python to allow free splitting of single-line statements
> without the backslashes, if we impose that expressions can only be split
> when it is not yet a finished expression?

:The trouble is that in a lot of cases, the next statement after an
:unfinished expression could conceivably be a continuation of it. If
:this were permitted, it would have to also require that the
:continuation lines be indented, to avoid the problem described above.
:It'd still have the potential to mis-diagnose errors, though.

Indentation is a good idea to reduce the likelihood of such troubles, and it also produces code that is easier to read.

Yingjie
Re: allow line break at operators
:And if we require {} then truly free indentation should be OK too! But
:it wouldn't be Python any more.

Of course, but that is not the case with ';'. Currently ';' is optional in Python, but '{' is used for dicts. Clearly, ';' and '{' are different in magnitude.

So the decision is: shall we change ';' from optional to mandatory to allow free line splitting?

Yingjie
Re: allow line break at operators
>> In terms of easier to read, I find code easier to read when the
>> operators are at the beginnings of the lines (PEP 8 notwithstanding):
>>
>> x = (someobject.somemethod(object3, thing)
>>      + longfunctionname(object2)
>>      + otherfunction(value1, value2, value3))
>>
>
> Without the parentheses, this is legal but (probably) useless; it
> applies the unary + operator to the return value of those functions.

If ';' is employed (required), truly free line-splitting should be OK, and the operators may appear at the beginnings of the lines as you wish.

Yingjie
Re: allow line break at operators
> On Wed, Aug 10, 2011 at 10:56 AM, Dan Sommers wrote:
>> In terms of easier to read, I find code easier to read when the
>> operators are at the beginnings of the lines (PEP 8 notwithstanding):
>>
>> x = (someobject.somemethod(object3, thing)
>>      + longfunctionname(object2)
>>      + otherfunction(value1, value2, value3))
>>
>
> Without the parentheses, this is legal but (probably) useless; it
> applies the unary + operator to the return value of those functions.

:No, in some other languages it might be legal, but this is Python.
:without the parentheses it is a syntax error.

This discussion leads me to this question:

Is it possible for Python to allow free splitting of single-line statements without the backslashes, if we impose that an expression can only be split when it is not yet a finished expression? Note: splitting before a closing parenthesis, brace, or bracket can be viewed as a special case of this more general rule.

Yingjie
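Both replies quoted above can be right, depending on whether the continuation lines are indented. The sketch below checks three small source strings with `exec` and `compile` under current Python 3; the variable names are made up for the illustration.

```python
# 1) Operator line NOT indented: legal, but the second line is a separate
#    expression statement applying unary plus, so x stays 1.
flat_ns = {}
exec("x = 1\n+ 2\n", flat_ns)

# 2) Operator line indented, no parentheses: rejected at compile time.
try:
    compile("x = 1\n    + 2\n", "<demo>", "exec")
    indented_ok = True
    error_name = None
except SyntaxError as exc:      # IndentationError is a subclass of SyntaxError
    indented_ok = False
    error_name = type(exc).__name__

# 3) Same layout inside parentheses: one expression, x is 3.
wrapped_ns = {}
exec("x = (1\n    + 2)\n", wrapped_ns)
```

So the unindented form silently computes something other than the intended sum, while the indented form fails with an "unexpected indent" error.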
Re: allow line break at operators
On Tue, Aug 9, 2011 at 9:42 PM, Yingjie Lan wrote:
> Hi all,
>
> When writing a long expression, one usually would like to break it into
> multiple lines. Currently, you may use a '\' to do so, but it looks a little
> awkward (more like machine-oriented thing). Therefore I start wondering why
> not allow line breaking at an operator, which is the standard way of breaking
> a long expression in publication? Here is an example:
>
> #the old way
>
> x = 1+2+3+4+\
>     1+2+3+4
>
> #the new way
> x = 1+2+3+4+ #line continues as it is clearly unfinished
>
>     1+2+3+4

:# the currently allowed way
:x = (1+2+3+4+
:    1+2+3+4)
:# note the parentheses
:
:I think this is sufficient.

That works, but not in the most natural way -- the way people are accustomed to. Why require a pair of parentheses when we can do without them? Also, the new way does not affect the old ways of doing things at all; it is fully backward compatible. So this just offers a new choice.

> Of course, the dot operator is also included, which may facilitate method
> chaining:
>
> x = svg.append( 'circle' ).
>     r(2).cx(1).xy(1).
>     foreground('black').bkground('white')

:Also, I dislike this for the dot operator especially, as it can
:obscure whether a method call or a function call is taking place.

Again, this only offers a new choice, and does not force anybody to do it this way.

Cheers,
Yingjie
allow line break at operators
Hi all,

When writing a long expression, one usually would like to break it into multiple lines. Currently, you may use a '\' to do so, but it looks a little awkward (more like a machine-oriented thing). Therefore I started wondering: why not allow line breaking at an operator, which is the standard way of breaking a long expression in publications? Here is an example:

    #the old way
    x = 1+2+3+4+\
        1+2+3+4

    #the new way
    x = 1+2+3+4+ #line continues as it is clearly unfinished
        1+2+3+4

Of course, the dot operator is also included, which may facilitate method chaining:

    x = svg.append( 'circle' ).
        r(2).cx(1).xy(1).
        foreground('black').bkground('white')

Thoughts?

Yingjie
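For comparison, this is roughly how the same two layouts can be written in Python as it stands today, using a trailing backslash for the sum and parentheses for the method chain. The `Node` class and its `set` method are hypothetical stand-ins for the `svg` object in the message, invented only so the example runs.

```python
class Node:
    """Hypothetical chainable object; each setter returns self."""

    def __init__(self, tag):
        self.tag = tag
        self.attrs = {}

    def set(self, key, value):
        self.attrs[key] = value
        return self


# Explicit continuation with a trailing backslash.
total = 1 + 2 + 3 + 4 + \
        1 + 2 + 3 + 4

# Implicit continuation inside parentheses, one call per line.
circle = (Node('circle')
          .set('r', 2)
          .set('cx', 1)
          .set('foreground', 'black'))
```

Inside the parentheses the line breaks and indentation are free, which is the existing mechanism the proposal would extend to unparenthesized expressions.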
anonymous function with multiple statements
Hi all,

I wonder if Python provides a way to define anonymous functions containing multiple statements? With the lambda form, we can only define a function of a single expression. In JavaScript, it is possible to define full-fledged anonymous functions, which suggests it is useful to have them. In Python, it might look like this:

    #==
    def test( fn, K ):
        return sum(fn(i) for i in range(K))/K

    test( def (i): #indent from start of this line, not from 'def'
        from math import sin
        #the last statement ends with a ',' or ')'.
        return sin(i)+i*i;, 100) #';' to mark end of statement

    #another way:
    test( def (i): #indent from start of this line, not from 'def'
        from math import sin
        return sin(i)+i*i
    , 100) #freely place ',' anywhere
    #===

Any thoughts?

Yingjie
Re: having both dynamic and static variables
- Original Message
From: Steven D'Aprano
To: python-list@python.org
Sent: Thu, March 3, 2011 1:27:01 PM
Subject: Re: having both dynamic and static variables

On Wed, 02 Mar 2011 19:45:16 -0800, Yingjie Lan wrote:
> Hi everyone,
>
> Variables in Python are resolved dynamically at runtime, which comes at
> a performance cost. However, a lot of times we don't need that feature.
> Variables can be determined at compile time, which should boost up
> speed.

:This is a very promising approach taken by a number of projects.

Thanks, that's good to know -- so people are doing this already!

:Finally, Python 3 introduced type annotations, which are currently a
:feature looking for a reason.

I have a use for that feature -- I have a project that helps build the scaffolding for people to extend CPython. See http://expy.sf.net/
It is only good for Python 2 at the moment.

Yingjie
having both dynamic and static variables
Hi everyone,

Variables in Python are resolved dynamically at runtime, which comes at a performance cost. However, a lot of the time we don't need that feature. Variables can be determined at compile time, which should boost speed. Therefore, I wonder if it is a good idea to have static variables as well. At compile time, a variable is determined to be either static or dynamic (the reference of a static variable is determined at compile time -- the namespace implementation will consist of two parts, a tuple for static variables and a dict for dynamic ones). The resolution can be done in the second pass of compilation.

By default, variables are considered static. A variable is determined to be dynamic when:

1. it is declared dynamic;
2. it is not defined locally and the nearest namespace has it declared dynamic.

A static variable can't be deleted, so a deleted variable must be a dynamic one: we can either enforce that the variable must be explicitly declared dynamic, or allow a del statement to implicitly declare a dynamic variable.

Any thoughts?

Yingjie
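For context, CPython already makes a split of this kind inside functions: local names are fixed when the function is compiled, while global names are looked up in a dict at run time. The snippet below only inspects that existing behaviour through the code object; it is not an implementation of the proposal, and the names counter and bump are made up for the illustration.

```python
counter = 0  # module-level (global) name

def bump(step):
    # 'step' and 'total' are locals; 'counter' is found in the globals dict.
    total = counter + step
    return total

code = bump.__code__
local_names = code.co_varnames   # decided at compile time
global_names = code.co_names     # resolved by name at run time

print(local_names, global_names)
```

So the open question in the post is mostly about module-level and class-level names, which currently always go through a dict.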
the C header file when extending CPython
Hi, I am wondering when extending Python (CPython), what should be put into the C header file? Any guidelines? Thanks, Yingjie -- http://mail.python.org/mailman/listinfo/python-list
Re: group 0 in the re module
: Use \g<0>.

Thanks! Though I wish \1, \2, ... were forbidden as well. Such a mixture of notations looks like patchwork. No offense meant.

Yingjie
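A quick check of the \g<0> suggestion, using the pattern and input from the original question in this thread. The third variant, a function as the replacement, is an extra not mentioned in the thread.

```python
import re

pattern = r'(\d{3})(\d{3})'

# In a replacement string, '\0' is an octal escape (NUL), not group 0.
octal = re.sub(pattern, r'\0 to \1-\2', '757234')

# '\g<0>' refers to the entire match.
whole = re.sub(pattern, r'\g<0> to \1-\2', '757234')

# A function replacement receives the match object, so group(0) is available.
via_func = re.sub(
    pattern,
    lambda m: "%s to %s-%s" % (m.group(0), m.group(1), m.group(2)),
    '757234',
)

print(repr(octal), repr(whole), repr(via_func))
```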
group 0 in the re module
Hi,

According to the doc, group(0) is the entire match.

    >>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
    >>> m.group(0) # The entire match
    'Isaac Newton'

But if you do this:

    >>> import re
    >>> re.sub(r'(\d{3})(\d{3})', r'\0 to \1-\2', '757234')
    '\x00 to 757-234'

where I expected '757234 to 757-234'. Then I found that in Python's re module '\0' is treated as an octal escape. So, is there any way to refer to the entire match by an escaped notation?

Thanks,
Yingjie
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
--- On Tue, 11/30/10, Dan Stromberg wrote:
> In Python 3, I'm finding that I have encoding issues with characters
> with their high bit set. Things are fine with strictly ASCII
> filenames. With high-bit-set characters, even if I change stdin's
> encoding with:

Co-ask. I have also had problems with file names containing Chinese characters in Python 3. I unzipped the turtle demo files into the desktop folder (the word 'desktop' is in Chinese; it is a Windows XP system with Chinese localization), and all of a sudden some of the demos stopped working. But if I move them to a folder whose path contains only English characters, everything goes back to normal.

Another related issue on the same platform: if your source file is in the Chinese 'desktop' folder and running the code raises an exception, the path printed in the traceback shows the Chinese name for 'desktop' wrongly encoded.

Yingjie
Re: regular expression help
--- On Tue, 11/30/10, goldtech wrote:
> From: goldtech
> Subject: regular expression help
> To: python-list@python.org
> Date: Tuesday, November 30, 2010, 9:17 AM
>
> The regex is eating up too much. What I want is every non-overlapping
> occurrence I think.
>
> so rtt would be:
>
> '||flfllff||ooo'

Hi, I'll just let Python do most of the talking here.

    >>> import re
    >>> m="cccvlvlvlvnnnflfllffccclfnnnooo"
    >>> p=re.compile(r'ccc.*?nnn')
    >>> p.sub("||", m)
    '||flfllff||ooo'

Cheers,
Yingjie
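To spell out why the '?' matters, here is a small side-by-side sketch on the same string. The greedy pattern is a guess at what the original poster was using; the post only says the regex was "eating up too much".

```python
import re

m = "cccvlvlvlvnnnflfllffccclfnnnooo"

# Greedy: '.*' runs to the LAST 'nnn', swallowing everything in between.
greedy = re.sub(r'ccc.*nnn', "||", m)

# Non-greedy: '.*?' stops at the FIRST 'nnn' after each 'ccc'.
lazy = re.sub(r'ccc.*?nnn', "||", m)

print(greedy)
print(lazy)
```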
Re: what a cheap rule
--- On Fri, 11/26/10, Steven D'Aprano wrote: > From: Steven D'Aprano > Subject: Re: what a cheap rule > To: python-list@python.org > Date: Friday, November 26, 2010, 5:10 AM > On Thu, 25 Nov 2010 08:15:21 -0800, > Yingjie Lan wrote: > > You seem to have misunderstood both forms of the raise > statement. Should > we make exceptions illegal because you can't correctly > guess what they do? > > Though what you said about Python is right, I think somehow you missed my point a bit. Ideally, the language could be so 'natural' that it means what meets the eyes. Yingjie -- http://mail.python.org/mailman/listinfo/python-list
Re: tilted text in the turtle module
--- On Fri, 11/26/10, Steve Holden wrote:
> From: Steve Holden
> Subject: Re: tilted text in the turtle module
> To: python-list@python.org
> Date: Friday, November 26, 2010, 4:16 AM
>
> On 11/25/2010 5:58 PM, Yingjie Lan wrote:
> > --- On Thu, 11/25/10, Steve Holden wrote:
> >>> And even if I made a patch,
> >>> then how to publish it?
> >>>
> >> Once you have a patch, attach it to the issue as a file and
> >> try and get it reviewed by a developer for incorporation into
> >> a future release.
> >>
> >> Note that no Python 2.8 release is planned, so you would best
> >> concentrate your effort on the 3.x series.
> >>
> > I see. I suppose one could post a message somewhere
> > to get the attention?
> >
> > Thanks for the pointer.
> >
> One would normally make a post on the python-dev list to get the
> attention of developers. If you want to *guarantee* that your issue gets
> attention then you can review five existing issues and advance them a
> little to help the development team.

Sound advice. Many thanks!

Yingjie
Re: the buggy regex in Python
--- On Thu, 11/25/10, MRAB wrote:
> Look at the spans:
>
> >>> for m in re.finditer('((.d.)*)*', 'adb'):
>         print(m.span())
>
> (0, 3)
> (3, 3)
>
> There's a non-empty match followed by an empty match.

If you read my first post, it should be apparent that the empty string at the end of the string is used twice -- thus an overlap.

Yingjie
Re: tilted text in the turtle module
--- On Thu, 11/25/10, Steve Holden wrote: > > And even if I made a patch, > > then how to publish it? > > > Once you have a patch, attach it to the issue as a file and > try and get > it reviewed by a developer for incorporation into a future > release. > > Note that no Python 2.8 release is planned, so you would > best > concentrate your effort on the 3.x series. > I see. I suppose one could post a message somewhere to get the attention? Thanks for the pointer. Regards, Yingjie -- http://mail.python.org/mailman/listinfo/python-list
Re: tilted text in the turtle module
--- On Thu, 11/25/10, Steve Holden wrote: > From: Steve Holden > Subject: Re: tilted text in the turtle module > To: python-list@python.org > Date: Thursday, November 25, 2010, 7:00 PM > On 11/25/2010 5:06 AM, Yingjie Lan > wrote: > This sounds like a good idea. To request a feature you > should create an > account (if you do not already have one) on bugs.python.org > and create a > new issue (assuming a search reveals that there is not > already such an > issue). > Thanks I just did that. > You may find if you look at the module's code that you can > imagine how > to make the change. If not, the request will wait > until some maintainer > sees it and has time. > I don't know much about tkinter, not sure if I can contribute. And even if I made a patch, then how to publish it? Regards, Yingjie -- http://mail.python.org/mailman/listinfo/python-list
Re: the buggy regex in Python
--- On Thu, 11/25/10, MRAB wrote: > re.findall performs multiple searches, each starting where > the previous > one finished. The first match started at the start of the > string and > finished at its end. The second match started at that point > (the end of > the string) and found another match, ending at the end of > the string. > It tried to match a third time, but that failed because it > would have > matched an empty string again (it's not allowed to return 2 > contiguous > empty strings). > > > Isn't this a bug? > > > No, but it can be confusing at times! :-) > -- But the last empty string is matched twice -- so it is an overlapping. But findall is supposed not to return overlapping matches. So I think this does not live up to the documentation -- thus I still consider it a bug. Yingjie -- http://mail.python.org/mailman/listinfo/python-list
Re: what a cheap rule
--- On Thu, 11/25/10, Steve Holden wrote:
> > Sometimes the golden rule in Python of
> > "explicit is better than implicit" is
> > so cheap that it can be thrown away
> > for the trouble of typing an empty tuple.
>
> I'm not sure that there *are* any golden rules. The "Zen of Python" is
> intended to be guidelines, not rigid rules intended to constrain your
> behavior but advice to help you write better code.
>
> Surely an exaggeration. In fact current best practice (which you should
> inform yourself of as best you can to help you in your teaching work -
> so you are to be congratulated for bringing this question to the list)
> is to always use explicit calls, with arguments specifying a tailored
> message.
>
> regards
> Steve

A very cogent message -- the end echoes the start. :) I must say that I learned from you a new angle from which to think about this issue. On the other hand, I still feel that when allowing both ways collides with the simplicity and beauty of the language, we should consider making a decision. Surely it introduces quite a lot of complexity when the doc has to give a very long explanation of what is happening in order to justify it.

As I think about it, it seems two conflicting intuitions of code comprehension are at work here:

Intuition #1: you raise an exception type, and then match that type. Intuitively, no instances seem to be involved. Example:

    try:
        raise KeyError
    except KeyError:
        pass

Intuition #2: you raise an exception instance, and then match the instance by its type. Example:

    try:
        raise KeyError()
    except KeyError as ke:
        pass

Those two readings are not compatible, and thus the one that promotes correct understanding should be encouraged, while the other should be discouraged, and maybe even be made illegal.

Regards,
Yingjie
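A small check of what actually arrives in the except clause under both spellings may help here. It is only a sketch; the helper names caught, raise_class and raise_instance are made up for the illustration.

```python
def caught(raiser):
    """Call raiser() and return the exception object the except clause receives."""
    try:
        raiser()
    except KeyError as ke:
        return ke

def raise_class():
    raise KeyError      # the class; Python instantiates it with no arguments

def raise_instance():
    raise KeyError()    # an explicit instance

from_class = caught(raise_class)
from_instance = caught(raise_instance)

print(type(from_class), from_class.args)
print(type(from_instance), from_instance.args)
```

In both cases the handler sees a KeyError instance, so intuition #2 describes what happens even when the code is written in the style of intuition #1.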
the buggy regex in Python
I know many experts will say I don't have the understanding... but let me pay this up front as my tuition.

Here are some puzzling results I have got (I am using Python 3; I suppose Python 2 gives similar results).

When I do the following, I get an exception:

    >>> re.findall('(d*)*', 'adb')
    >>> re.findall('((d)*)*', 'adb')

When I do this, there is no exception but the result is wrong:

    >>> re.findall('((.d.)*)*', 'adb')
    [('', 'adb'), ('', '')]

Why is it wrong? The first match of groups, ('', 'adb'), indicates that the outer group ((.d.)*) captured the empty string while the inner group (.d.) captured 'adb', so the outer group must have captured the empty string at the end of the provided string 'adb'.

Once we have matched the final empty string '', there should be no more matches, but we got another match ('', '')!!!

So, findall matched the empty string at the end of the string twice!!!

Isn't this a bug?

Yingjie
tilted text in the turtle module
First of all, I'd like to express my deep gratitude to the author of this module; it is such a fun module to work with and to teach Python as a first programming language.

Secondly, I would like to request a feature if it is not too hard to achieve. Currently, you can only write text horizontally, no matter what the current orientation of the turtle pen is. I wonder if it is possible to write text in any direction when we control the heading of the turtle? For example, the following code would write vertically oriented text:

    setheading(90) #turtle facing up
    write("vertical text!")

Thanks a lot!

Yingjie
Re: a regexp riddle: re.search(r'
--- On Thu, 11/25/10, Phlip wrote:
> From: Phlip
> Subject: a regexp riddle: re.search(r'
> To: python-list@python.org
> Date: Thursday, November 25, 2010, 8:46 AM
>
> HypoNt:
>
> I need to turn a human-readable list into a list():
>
> print re.search(r'(?:(\w+), |and (\w+))+', 'whatever a, bbb, and c').groups()
>
> That currently returns ('c',). I'm trying to match "any word \w+
> followed by a comma, or a final word preceded by and."
>
> The match returns 'a, bbb, and c', but the groups return ('bbb', 'c').
> What do I type for .groups() to also get the 'a'?

First of all, the 'bbb' corresponds to the first capturing group and 'c' to the second. But 'a' is forgotten because it was the first match of the first group, and there is a second match, 'bbb'. Generally, a capturing group only remembers its last match.

It also seems that your re may match just 'and c', which does not seem to be your intention. So it may be more intuitively written as:

    r'(?:(\w+), )+and (\w+)'

I'm not sure how to get it done in one step, but it would be easy to first get the whole match, then process it with:

    re.findall(r'(\w+)(?:,|$)', the_whole_match)

cheers,
Yingjie
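Put together, the two-step approach suggested above looks roughly like this. It uses the sample sentence from the question; the variable names are only for illustration.

```python
import re

text = 'whatever a, bbb, and c'

# Step 1: locate the whole list. The repeated group keeps only its last capture.
whole = re.search(r'(?:(\w+), )+and (\w+)', text)
the_whole_match = whole.group(0)

# Step 2: pull out every word that is followed by a comma or ends the string.
items = re.findall(r'(\w+)(?:,|$)', the_whole_match)

print(whole.groups())
print(items)
```

The word 'and' is skipped in step 2 because it is followed by a space, not by a comma or the end of the string.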
what a cheap rule
Sometimes the golden rule in Python of "explicit is better than implicit" is so cheap that it can be thrown away for the trouble of typing an empty tuple.

Today I was explaining that in Python 3 there are two ways to raise exceptions:

    raise Exception
    raise Exception()

and that the first one is the same as the second one, as Python will add the missing pair of parentheses. I felt their pain as they gasped.

Before that, I had already explained to them this piece of code:

    try:
        raise SomeException()
    except SomeException:
        print('Got an exception here')

by saying that the except clause will match anything that belongs to the SomeException class.

Without knowing this secret piece of information (that a pair of parentheses is automatically provided), the following code would be hard to understand:

    try:
        raise SomeException
    except SomeException:
        print('Got an exception here')

because the class object SomeException is not an instance of itself, so a not-so-crooked coder will not expect a match here.

So, the "explicit is better than implicit" rule is thrown out of the window so cheaply that it is literally worth less than an empty tuple!

Regards,
Yingjie
Re: Must be a bug in the re module [was: Why this result with the re module]
--- On Wed, 11/3/10, MRAB wrote:
> [snip]
> The outer group is repeated, so it can match again, but the inner group
> can't match again because it captured all it could the previous time.
>
> Therefore the outer group matches and captures an empty string and the
> inner group remembers its last capture.

Thanks, I got it. Basically, '(.a.)*' matched an empty string in the last outer group match, but '(.a.)' did not.

Now what remains hard for me to figure out is the number of matches: why is it 6 with '((.a.)*)*' when matched against 'Mary has a lamb'? I think this is probably caused by a limitation of the match object: it does not say whether an empty string is appended to the matched pattern or not. Hence some of the empty strings are repeated/overlapped by re.findall().

Regards,
Yingjie
Re: Must be a bug in the re module [was: Why this result with the re module]
--- On Wed, 11/3/10, John Bond wrote:
> I just explained that (I think!)! The outer capturing group uses
> repetition, so it returns the last thing that was matched by the inner
> group, which was an empty string.

By your own account, the last match of the inner group is also empty!

Generally speaking, as a match for the outer group always contains some matches for the inner group, the last match for the inner group must be contained inside the last match of the outer group. So if the last match of the outer group is empty, then the last match for the inner group must also be empty.

Regards,
Yingjie

--- On Wed, 11/3/10, John Bond wrote:
> From: John Bond
> Subject: Re: Must be a bug in the re module [was: Why this result with the re module]
> To: python-list@python.org
> Date: Wednesday, November 3, 2010, 8:43 AM
>
> On 3/11/2010 4:23 AM, Yingjie Lan wrote:
> > --- On Wed, 11/3/10, MRAB wrote:
> >
> >> From: MRAB
> >> Subject: Re: Must be a bug in the re module [was: Why this result with the re module]
> >> To: python-list@python.org
> >> Date: Wednesday, November 3, 2010, 8:02 AM
> >> On 03/11/2010 03:42, Yingjie Lan wrote:
> >> Therefore the outer (first) group is always an empty string and the
> >> inner (second) group is the same as the previous example (the last
> >> capture or '' if no capture).
> >
> > Now I am confused also:
> >
> > If the outer group \1 is empty, how could the inner
> > group \2 actually have something?
> >
> > Yingjie
>
> I just explained that (I think!)! The outer capturing group uses
> repetition, so it returns the last thing that was matched by the inner
> group, which was an empty string.
>
> If you took away the outer group's repetition:
>
> re.findall('((.a.)*)', 'Mary has a lamb')
>
> then, for each of the six matches, it returns the full thing that was
> matched:
>
> [('Mar', 'Mar'), ('', ''), ('', ''), ('has a lam', 'lam'), ('', ''), ('', '')]
>
> Cheers, JB
Re: Must be a bug in the re module [was: Why this result with the re module]
--- On Wed, 11/3/10, John Bond wrote: >3) then said there must be >=0 occurrences of what's inside it, >which of course there is, so that has no effect. > >((.a.)*)* Hi, I think there should be a difference: unlike before, now what's inside the outer group can match an empty string. And so by reason of the greediness of the quantifier * of the outer group (that is, the last *), it should take up the empty string after each non-empty match. So, the first match in 'Mary has a lamb' must be: '' + 'Mar' + '' (the empty string before the 'y') (note the first '' is before the 'M') Then, after skipping the 'y' (remember, the empty string before 'y' is already taken), comes a second: '' (the one between 'y' and ' ') Then after skipping the space ' ', comes a third: 'has' + ' a ' + 'lam' + '' (the empty string before the 'b') And finally, it matches the empty string after 'b': '' So there should be total of four matches -- isn't it? Yingjie -- http://mail.python.org/mailman/listinfo/python-list
Re: Must be a bug in the re module [was: Why this result with the re module]
--- On Wed, 11/3/10, MRAB wrote: > From: MRAB > Subject: Re: Must be a bug in the re module [was: Why this result with the re > module] > To: python-list@python.org > Date: Wednesday, November 3, 2010, 8:02 AM > On 03/11/2010 03:42, Yingjie Lan > wrote: > Therefore the outer (first) group is always an empty string > and the > inner (second) group is the same as the previous example > (the last > capture or '' if no capture). Now I am confused also: If the outer group \1 is empty, how could the inner group \2 actually have something? Yingjie -- http://mail.python.org/mailman/listinfo/python-list
Re: Must be a bug in the re module [was: Why this result with the re module]
--- On Wed, 11/3/10, John Bond wrote: > Just to clarify - findall is returning: > > [ (only match in outer group, 1st match in inner group) > , (only match in outer group, 2nd match in inner group) > , (only match in outer group, 3rd match in inner group) > , (only match in outer group, 4th match in inner group) > , (only match in outer group, 5th match in inner group) > , (only match in outer group, 6th match in inner group) > ] > > Where "only match in outer group" = "6th match in inner > group" owing to the way that capturing groups with > repetition only return the last thing they matched. > ---On Wed, 11/3/10, MRAB wrote--- > Therefore the outer (first) group is always an empty string and the > inner (second) group is the same as the previous example (the last > capture or '' if no capture). OK, I've got that, and I have no problem with the capturing part. My real problem is with the number of total matches. I think it should be 4 matches in total but findall gives 6 matches, for the new regex '((.a.)*)*'. I'd love to know what you think about this. Many thanks! Yingjie -- http://mail.python.org/mailman/listinfo/python-list
Re: Must be a bug in the re module [was: Why this result with the re module]
> Matches an empty string, returns '' > > The result is therefore ['Mar', '', '', 'lam', '', ''] Thanks, now I see it through with clarity. Both you and JB are right about this case. However, what if the regex is ((.a.)*)* ? -- http://mail.python.org/mailman/listinfo/python-list
Re: Must be a bug in the re module [was: Why this result with the re module]
> Your regex says "Zero or more consecutive occurrences of > something, always returning the most possible". That's > what it does, at every position - only matching emptyness > where it couldn't match anything (findall then skips a > character to avoid overlapping/infinite empty > matches), and at all other times matching the most > possible (eg. "has a lam" not "has", " a ", "lam"). You are about to convince me now. You are correct for the regex '(.a.)*'. What I thought was for this regex: '((.a.)*)*', I confused myself when I added an enclosing (). Could you please reconsider how would you work with this new one and see if my steps are correct? If you agree with my 7-step execution for the new regex, then: We finally found a real bug for re.findall: >>> re.findall('((.a.)*)*', 'Mary has a lamb') [('', 'Mar'), ('', ''), ('', ''), ('', 'lam'), ('', ''), ('', '')] Cheers, Yingjie -- http://mail.python.org/mailman/listinfo/python-list
Must be a bug in the re module [was: Why this result with the re module]
> From: John Bond
> Subject: Re: Why this result with the re module
> To: "Yingjie Lan"
> Cc: python-list@python.org
> Date: Tuesday, November 2, 2010, 8:09 PM
>
> On 2/11/2010 12:19 PM, Yingjie Lan wrote:
> >> From: John Bond
> >> Subject: Re: Why this result with the re module
> > Firstly, thanks a lot for your patient explanation.
> > this time I have understood all your points perfectly.
> >
> > Secondly, I'd like to clarify some of my points, which
> > did not get through because of my poor presentation.
> >
> > I suggested findall return a tuple of re.MatchObject(s),
> > with each MatchObject instance representing a match.
> > This is consistent with the re.match() function anyway.
> > And it will eliminate the need of returning tuples,
> > and it is much more precise and information rich.
> >
> > If that's not possible, and a tuple must be returned,
> > then the whole match (not just subgroups) should
> > always be included as the first element in the tuple,
> > as that's group(0) or '\0'. Less surprise would arise.
> >
> > Finally, it seems to me the algo for findall is WRONG.
> >
> > To re.findall('(.a.)*', 'Mary has a lamb'),
> > by reason of greediness of '*', and the requirement
> > of non-overlapping, it should go like this
> > (suppose an '' is at the beginning and at the end,
> > and between two consecutive characters there is
> > one and only one empty string ''. To show the
> > match of empty strings clearly,
> > I am concatenating each repeated match below):
> >
> > Steps for re.findall('(.a.)*', 'Mary has a lamb'):
> >
> > step 1: Match '' + 'Mar' + '' (greedy!)
> > step 2: skip 'y'
> > step 3: Match ''
> > step 4: skip ' '
> > step 5: Match ''+'has'+' a '+'lam'+'' (greedy!)
> > step 6: skip 'b'
> > step 7: Match ''
> >
> > So there should be exactly 4 matches in total:
> >
> > 'Mar', '', 'has a lam', ''
> >
> > Also, the matches above shows
> > that if a repeated subgroup only captures
> > the last match, the subgroup (.a.)*
> > should always capture '' here (see steps
> > 1, 3, 5, 7) above.
> >
> > Yet the execution in Python results in 6 matches!
> > And, the capturing subgroup with repetition
> > sometimes got the wrong guy.
> >
> > So I believe the algorithm for findall must be WRONG.
> >
> > Regards,
> >
> > Yingjie
>
> At a guess, I'd say what is happening is something like this:
>
> Steps for re.findall('(.a.)*', 'Mary has a lamb'):
>
> step 1: Match 'Mar' at string index 0
> step 2: Match '' at string index 3 (before 'y')
> step 3: skip 'y'
> step 4: Match '' at string index 4 (before ' ')
> step 5: skip ' '
> step 6: Match 'has a lam' at string index 5
> step 7: Match '' at string index 14 (before 'b')
> step 8: skip 'b'
> step 9: Match '' at string index 15 (before EOS)
>
> matches: ('Mar', '', '', 'has a lam', '', '')
> returns: ['Mar', '', '', 'lam', '', ''] (*)
>
> (*) "has a " lost due to not being last repetition at that match point
>
> Which seems about right to me! Greediness has nothing to do with it,
> except that it causes 'has a lam' to be matched in one match, instead
> of as three separate matches (of 'has', ' a ' and 'lam').

I disagree in this case, where the whole regex can match an empty string. Greediness means matching as much as possible. So it will also take in the empty strings between consecutive characters as far as possible, once we have properly defined all the unique empty strings. Because of greediness, fewer matches should be found. In this case, it should find only 4 matches (shown in my steps) instead of 6 matches (shown in your steps).

Yingjie
Re: Why this result with the re module
> From: Vlastimil Brom > Subject: Re: Why this result with the re module > in that case you may use re.finditer(...) Thanks for pointing this out. Still I'd love to see re.findall never discards the whole match, even if a tuple is returned. Yingjie -- http://mail.python.org/mailman/listinfo/python-list
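A short sketch of the finditer suggestion: each item is a match object, so the whole match is available next to the captured group. The pattern and the sentence are the ones used throughout this thread.

```python
import re

text = 'Mary has a lamb'

# finditer yields match objects, so group(0) (the whole match) is never lost.
pairs = [(m.group(0), m.group(1), m.span())
         for m in re.finditer('(.a.)+', text)]

for whole, last_repeat, span in pairs:
    print(repr(whole), repr(last_repeat), span)
```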
Re: Why this result with the re module
> From: John Bond
> Subject: Re: Why this result with the re module

Firstly, thanks a lot for your patient explanation. This time I have understood all your points perfectly.

Secondly, I'd like to clarify some of my points, which did not get through because of my poor presentation.

I suggested that findall return a tuple of re.MatchObject(s), with each MatchObject instance representing a match. This is consistent with the re.match() function anyway. It would eliminate the need to return tuples, and it is much more precise and information rich.

If that's not possible, and a tuple must be returned, then the whole match (not just subgroups) should always be included as the first element in the tuple, as that's group(0) or '\0'. Less surprise would arise.

Finally, it seems to me the algo for findall is WRONG.

For re.findall('(.a.)*', 'Mary has a lamb'), by reason of the greediness of '*' and the requirement of non-overlapping, it should go like this (suppose an '' is at the beginning and at the end, and between two consecutive characters there is one and only one empty string ''. To show the match of empty strings clearly, I am concatenating each repeated match below):

Steps for re.findall('(.a.)*', 'Mary has a lamb'):

    step 1: Match '' + 'Mar' + '' (greedy!)
    step 2: skip 'y'
    step 3: Match ''
    step 4: skip ' '
    step 5: Match ''+'has'+' a '+'lam'+'' (greedy!)
    step 6: skip 'b'
    step 7: Match ''

So there should be exactly 4 matches in total:

    'Mar', '', 'has a lam', ''

Also, the matches above show that if a repeated subgroup only captures the last match, the subgroup (.a.)* should always capture '' here (see steps 1, 3, 5, 7 above).

Yet the execution in Python results in 6 matches! And the capturing subgroup with repetition sometimes got the wrong guy.

So I believe the algorithm for findall must be WRONG.

Regards,
Yingjie
Re: Why this result with the re module
> From: John Bond
> You might wonder why something that can match no input
> text, doesn't return an infinite number of those matches at
> every possible position, but they would be overlapping, and
> findall explicitly says matches have to be non-overlapping.

That scratched my itch, though the notion of overlapping empty strings is quite interesting in itself. Obviously we have to assume there is one and only one empty string between two consecutive characters.

Now I slightly modified my regex, and it suddenly looks self-explanatory:

    >>> re.findall('((.a.)+)', 'Mary has a lamb')
    [('Mar', 'Mar'), ('has a lam', 'lam')]
    >>> re.findall('((.a.)*)', 'Mary has a lamb')
    [('Mar', 'Mar'), ('', ''), ('', ''), ('has a lam', 'lam'), ('', ''), ('', '')]

BUT, but.

1. I expected findall to find matches of the whole regex '(.a.)+', not just the subgroup (.a.), from

    >>> re.findall('(.a.)+', 'Mary has a lamb')

Thus it is probably a misunderstanding/bug??

2. Here is a statement from the documentation on non-capturing groups (see http://docs.python.org/dev/howto/regex.html):

"Except for the fact that you can’t retrieve the contents of what the group matched, a non-capturing group behaves exactly the same as a capturing group; "

Thus, I'm again confused, despite your previous explanation. This might be a better explanation: when a subgroup is repeated, it only captures the last repetition.

3. It would be convenient to have '(*...)' for non-capturing groups -- but of course, that's only a remote suggestion.

4. By reason of the greediness of '*' and the concept of non-overlapping, it should go like this for re.findall('((.a.)*)', 'Mary has a lamb'):

    step 1: Match 'Mar' + '' (greedy!)
    step 2: skip 'y'
    step 3: Match ''
    step 4: skip ' '
    step 5: Match ''+'has'+' a '+'lam'+'' (greedy!)
    step 6: skip 'b'
    step 7: Match ''

So there should be 4 matches in total:

    'Mar', '', 'has a lam', ''

Also, if a repeated subgroup only captures the last repetition, the repeated subgroup (.a.)* should always be ''.

Yet the execution in Python results in 6 matches.

Here is the documentation of re.findall:

    findall(pattern, string, flags=0)
        Return a list of all non-overlapping matches in the string.
        If one or more groups are present in the pattern, return a
        list of groups; this will be a list of tuples if the pattern
        has more than one group. Empty matches are included in the result.

Thus from

    >>> re.findall('(.a.)*', 'Mary has a lamb')

I should get this result:

    [('',), ('',), ('',), ('',)]

Finally, the name findall implies all matches should be returned, whether there are subgroups in the pattern or not. It might be best to return all the match objects (like a re.match call) instead of the matched strings. Then there is no need to return tuples of subgroups. Even if tuples of subgroups were to be returned, group(0) must also be included in the returned tuple.

Regards,
Yingjie
Re: Why this result with the re module
> From: John Bond
> Subject: Re: Why this result with the re module
> re.findall('(.a.)*', 'Mary has a lamb')
>
> ['Mar', '', '', 'lam', '', '']
> So - see if you can explain the first "problematic" result
> now.

Thanks a lot for explaining to me the second "problematic" result! But the first one is even more puzzling, mainly because the pattern matches any empty string. Here are more examples:

>>> re.findall('(.a.)*','')
['']
>>> re.findall('(.a.)*',' ') #one space
['', '']
>>> re.findall('(.a.)*','  ') #two spaces
['', '', '']
>>> len(re.findall('(.a.)*',' '*4)) #four
5
>>> len(re.findall('(.a.)*',' '*8)) #eight
9

Do I need more details of the matching algorithm to explain this?

Regards,
Yingjie
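The counts above follow a simple pattern: a string of n spaces has n + 1 positions (before each character, plus the end), and '(.a.)*' matches the empty string at each of them. A small sketch that checks this, assuming Python 3; the name `counts` is only for illustration.

```python
import re

# Map each input length n to the number of (empty) matches found.
counts = {n: len(re.findall('(.a.)*', ' ' * n)) for n in (0, 1, 2, 4, 8)}

for n, found in counts.items():
    print(n, found)
```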
Re: Why this result with the re module
> From: John Bond
> re.findall('(.a.)+', 'Mary has a lamb')
>
> ['Mar', 'lam']
> It's because you're using capturing groups, and because of
> how they work - specifically they only return the LAST match
> if used with repetition (and multiple matches occur).

It seems capturing groups are assumed by default, but this is somehow against my intuition... Intuitively, the result should be what matches the whole regex '(.a.)+', shouldn't it?

Regards,
Yingjie
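If the whole match is what is wanted, one option is a non-capturing group, written (?:...). With no capturing group in the pattern, findall falls back to returning the full matches. A minimal sketch, assuming Python 3:

```python
import re

text = 'Mary has a lamb'

# Capturing group: findall returns the last repetition of the group.
capturing = re.findall('(.a.)+', text)

# Non-capturing group: findall returns the whole match.
non_capturing = re.findall('(?:.a.)+', text)

print(capturing)
print(non_capturing)
```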
Why this result with the re module
Hi,

I am rather confused by the results below. I am not a re expert at all. The module version of re is 2.2.1 with python 3.1.2.

>>> import re
>>> re.findall('.a.', 'Mary has a lamb') #OK
['Mar', 'has', ' a ', 'lam']
>>> re.findall('(.a.)*', 'Mary has a lamb') #??
['Mar', '', '', 'lam', '', '']
>>> re.findall('(.a.)+', 'Mary has a lamb') #??
['Mar', 'lam']

Thanks in advance for any comments.

Yingjie
Allowing comments after the line continuation backslash
Hi,

Sorry if I am baking too many ideas today. I am just having trouble with the backslashes.

I would like to have comments after the line continuation backslash.

>>> if a > 0 \ #comments for this condition
       and b > 0:
        #do something here

This is currently not OK, but this might be a good thing to have.

Yingjie
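For comparison, here is what already works today: inside parentheses a condition can span several lines with no backslash, and each line may carry its own comment. This is only a sketch; the values of `a` and `b` and the name `result` are made up.

```python
a, b = 1, 2

if (a > 0            # comment for the first condition
        and b > 0):  # comment for the second condition
    result = 'both positive'
else:
    result = 'not both positive'

print(result)
```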
Re: with block for multiple files
> Guido's time machine strikes again! It's already in Python 3; your
> example would be spelled:
>
> with open('scores.csv') as f, open('grades.csv', 'wt') as g:
>     g.write(f.read())

Indeed! Thanks, Chris and James.

Yingjie
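A self-contained version of the quoted spelling, for anyone who wants to try it. It is only a sketch: the temporary directory and the sample contents are assumptions added so that the snippet runs without pre-existing files.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'scores.csv')
dst = os.path.join(workdir, 'grades.csv')

with open(src, 'wt') as f:
    f.write('alice,90\nbob,85\n')

# One with statement, two context managers, each with its own "as" target.
with open(src) as f, open(dst, 'wt') as g:
    g.write(f.read())

with open(dst) as g:
    copied = g.read()

print(copied)
```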
Allow multiline conditions and the like
Hi,

This is a mini-proposal I piggybacked on the other topic:

Allow the conditions in the if-, elif-, while-, for-, and with-clauses to span multiple lines without using a backslash at the end of a line, just like when you specify literal lists, tuples, dicts, etc. across multiple lines (similar to comprehensions too).

My reasons:

Because they all must end with a required colon ':', nobody will mistake where the condition ends.

Also, if we don't allow it, people just have to put parentheses around the expressions to make that happen.

Just a half-baked idea, I appreciate all comments.

Yingjie
with block for multiple files
Hi,

Suppose I am working with two files simultaneously, it might make sense to do this:

with open('scores.csv'), open('grades.csv', 'wt') as f,g:
    g.write(f.read())

Sure, you can do this with nested with-blocks, but the one above does not seem too complicated; it is like having a multiple assignment...

Any thoughts?

Another mini-proposal:

Allow the conditions in the if-, elif-, while-, for-, and with-clauses to span multiple lines without using a backslash, just like when you specify literal lists, tuples, dicts, etc. across multiple lines (similar to comprehensions too).

My reason is this: because they all must end with a required colon ':', nobody will mistake where the condition ends.

Just some half-baked ideas, I would appreciate those who shed light on these issues.

Yingjie
Re: A bug for raw string literals in Py3k?
> According to msg56377, the behaviour is "optimal" for regular
> expressions. Well, I use regular expressions a lot, and I
> still think it's a nuisance!

Thanks for bringing that up.

Using an otherwise 'dead' backslash to escape quotes in raw strings seems like the black magic of necromancy to me. :)

To include quotes in a string, there are a couple of known choices: if you need single quotes in the string, start the literal with a double quote, and vice versa. In case you need both, you can use a long string:

>>> r'''a'b\c"'''

Note that when the last character is also a quote, we can use the other type of quote three times to delimit the long string. Of course, there are still some corner cases:

1. When we need three consecutive single quotes AND three consecutive double quotes in the string.

2. When the last character is a single quote, and we also need three consecutive double quotes in the string, or the other way around.

Then we can abandon the raw string literal, or use concatenation of string literals wisely to get it done.

But in total, I still would vote against the necromancy.

Yingjie
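The two techniques mentioned above (a triple-quoted raw string, and concatenation of adjacent literals) can be sketched as follows. The sample strings are invented for illustration only.

```python
# A raw long string holding both kinds of quote plus a backslash.
# It ends in a double quote, so it is delimited by triple single quotes.
both = r'''a'b\c"'''

# Adjacent string literals are concatenated by the compiler, so each
# piece can use whichever delimiter is convenient for its content.
joined = r"it's " r'"quoted"'

print(both)
print(joined)
```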
Re: A bug for raw string literals in Py3k?
> > > All backslashes in raw string literals are interpreted literally.
> > > (see http://docs.python.org/release/3.0.1/whatsnew/3.0.html):
> >
> > All backslashes in syntactically-correct raw string literals are
> > interpreted literally.
>
> That's a good way of putting it.

Syntactical correctness obviously depends on the syntax specification. To cancel the special meaning of ALL backslashes in a raw string literal makes a lot of sense to me.

Currently, the behavior of backslashes in a raw string literal is rather complicated, I think. In fact, a backslash can still escape a quote in a raw string, and on the other hand, it also remains in the string. I'm wondering what kind of use case there is to justify such behavior?

Surely, my experience is way too limited to make a solid judgement. I hope others will shed light on this issue.

Yingjie
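To make the behavior described above concrete: in a raw string, a backslash before a quote keeps the quote from ending the literal, yet the backslash itself stays in the value. A minimal sketch, assuming Python 3:

```python
# The backslash stops the second quote from closing the literal,
# but it is not removed: the value is a backslash followed by a quote.
s = r'\''

print(len(s))
print(s)
```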
Re: A bug for raw string literals in Py3k?
> > So I suppose this is a bug?
>
> It's not, see
>
> http://docs.python.org/py3k/reference/lexical_analysis.html#literals
>
> # Specifically, a raw string cannot end in a single backslash

Thanks! That looks weird to me ... doesn't this contradict:

All backslashes in raw string literals are interpreted literally.
(see http://docs.python.org/release/3.0.1/whatsnew/3.0.html)

Best,
Yingjie
A bug for raw string literals in Py3k?
Hi,

I tried this in the IDLE (version 3.1.2) shell:

>>> r'\'
SyntaxError: EOL while scanning string literal

But according to the py3k docs (http://docs.python.org/release/3.0.1/whatsnew/3.0.html):

All backslashes in raw string literals are interpreted literally.

So I suppose this is a bug?

Yingjie
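A small sketch that reproduces the error without breaking the file it lives in, plus one common workaround. The path 'C:\temp' is just an invented example; compile() is used so that the SyntaxError can be caught at run time.

```python
# The source text r'\' is passed to compile() as data, so the error
# surfaces as a catchable exception instead of stopping this script.
try:
    compile("r'\\'", '<example>', 'eval')
    raised = False
except SyntaxError:
    raised = True

# Workaround: keep the raw literal for the body and append the
# trailing backslash from an ordinary (non-raw) literal.
path = r'C:\temp' + '\\'

print(raised)
print(path)
```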
Re: please help explain this result
--- On Sun, 10/17/10, Steven D'Aprano wrote:

> (1) If you assign to a variable *anywhere* in the function, it is a local
> *everywhere* in the function.
>
> There is no way to have a variable refer to a local in some places of a
> function and a global in other places of the same function. This is by
> design.

Super crystal clear. Thanks a lot!

Yingjie
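The rule quoted above can be seen in a few lines. This is only a sketch; the function names `broken` and `reads_only` are made up for illustration.

```python
a = 1

def broken():
    # The assignment makes 'a' local for the whole function body,
    # so the read on the right-hand side finds no value yet.
    a = a + 1
    return a

def reads_only():
    # No assignment to 'a' here, so the name refers to the outer 'a'.
    b = a + 1
    return b

try:
    broken()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)
print(reads_only())
```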
Re: please help explain this result
> From: Nobody
> The determination of local or global is made when the "def" statement is
> executed, not when the function is called.

Thanks a lot for your reply, which is of great help!

So, I assume that when the 'def' is executed, any name that occurs in the function will be categorized as either local or global (maybe nonlocal?).

Yingjie
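One way to see that the classification is already fixed before the function is ever called is to look at the code object. This is a sketch that relies on CPython's `__code__` attributes; neither function is called here.

```python
def f():
    a = a + 1      # assignment: 'a' is classified as a local of f
    return a

def g():
    return a + 1   # no assignment: 'a' is looked up outside g

# co_varnames lists the local variable names of the function;
# co_names lists the names looked up globally (or as attributes).
print(f.__code__.co_varnames)
print(g.__code__.co_varnames, g.__code__.co_names)
```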
please help explain this result
Hi,

I played with an example related to namespaces/scoping. The result is a little confusing:

>>> a=1
>>> def f():
        a = a + 1
        return a

>>> f()

I supposed I would get 2 ('a' is redefined as a local variable, whose value is obtained from the value of the global variable 'a' plus 1). But this is what I got:

>>> a=1
>>> def f():
        a = a + 1
        return a

>>> f()
Traceback (most recent call last):
  File "", line 1, in <module>
    f()
  File "", line 2, in f
    a = a + 1
UnboundLocalError: local variable 'a' referenced before assignment

I'm not sure how to explain this. Thanks!

Yingjie
Re: solve alphametic puzzles in just 9 lines of code
Sorry, didn't document my code well enough. Here is the code with an example.

Yingjie

#Code begins###

from itertools import permutations

def solve(puzzle):
    """solve alphametic puzzles in just 9 lines of code.
    Make sure each operator is separated from the words by
    white-spaces, e.g.:

    >>> solve('send + more == money')
    """
    words = [w for w in puzzle.split() if w.isalpha()]
    nonzeros = {w[0] for w in words}
    others = {a for a in ''.join(words) if a not in nonzeros}
    chars = [ord(c) for c in nonzeros]+[ord(c) for c in others]
    assert len(chars) <= 10, 'Too many letters'
    for guess in permutations('0123456789', len(chars)):
        if '0' not in guess[:len(nonzeros)]:
            equation = puzzle.translate(dict(zip(chars, guess)))
            if eval(equation): return puzzle, equation

if __name__ == '__main__':
    print ('\n'.join(solve("send + more == money")))
Re: sequence multiplied by -1
Hi all,

Thanks for considering this proposal seriously; all your discussions shed light on the pros and cons (well, more cons than pros, to be honest).

It occurs to me that this proposal is not a sound one, for the reasons already well documented in this thread, which I need not repeat.

Thanks all for participating!

Yingjie
solve alphametic puzzles in just 9 lines of code
Hi,

I am teaching Python this semester, and as I am trying to explain the code by Raymond Hettinger, I need to make it simpler (this is an introductory course). And it ends up being just 9 lines of code.

Just for fun. See also: http://diveintopython3.org/advanced-iterators.html

Regards,
Yingjie

Code starts here###

import itertools

def solve(puzzle):
    "solve alphametic puzzles in just 9 lines of code."
    words = [w for w in puzzle.split() if w.isalpha()]
    nonzeros = {w[0] for w in words}
    others = {a for a in ''.join(words) if a not in nonzeros}
    chars = [ord(c) for c in nonzeros]+[ord(c) for c in others]
    assert len(chars) <= 10, 'Too many letters'
    for guess in itertools.permutations('0123456789', len(chars)):
        if '0' not in guess[:len(nonzeros)]:
            equation = puzzle.translate(dict(zip(chars, guess)))
            if eval(equation): return equation
Re: sequence multiplied by -1
Hi,

> In my opinion this _isn't_ a situation where it's good. :)
>
>     L[::-1]
>
> is only marginally longer than
>
>     -1 * L
>
> I think this small gain doesn't justify "violating" this
> "Python Zen" rule (from `import this`):
>
>     There should be one-- and preferably only one --obvious way to do it.

Thanks for the insightful remarks. For the rule above, how about the case of reversing and multiplying:

>>> L*-3 #L reversed and repeated three times

vs.

>>> L[::-1]*3 #L reversed and repeated three times

The first one is simpler (4 chars vs. 9 chars). I thought it was also intuitive, because if you multiply a vector by -1, you should get a vector in the reversed direction. But intuitiveness depends on who you are, what you do, etc.

Regards,
Yingjie
Re: sequence multiplied by -1
--- On Sat, 9/25/10, Thomas Jollans wrote:

> for every list l and integer n >= 0:
>     len(l*n) == len(l)*n

Well, this invariant is indeed broken for negative n under my proposal. But for negative n it is *already broken* in current python3k. However, the following invariant is maintained under my proposal:

len(l*n) == len(l) * abs(n),

which is also broken under current python3k. If you think of len(..) as a mathematical norm, the above invariant makes perfect sense:

|| a * b || == ||a|| * |b|, b is real

> > Simply put, a sequence multiplied by -1 can give a reversed sequence.
>
> For that, we have slicing. A negative step value produces a reverse slice of
> the list. You can't argue that this makes sense, can you
>
> >>> [1,2,3,4][::-1]
> [4, 3, 2, 1]
> >>> [1,2,3,4][::-2]
> [4, 2]
> >>>

Having more than one way of doing things sometimes is good. Slicing is a little more complex. What if you want this:

>>> ([1,2,3,4]*2)[::-1]
[4, 3, 2, 1, 4, 3, 2, 1]

Under my new proposal, you simply do this:

>>> [1,2,3,4]*-2
[4, 3, 2, 1, 4, 3, 2, 1]

Cheers,
Yingjie
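For reference, this is how the length behaves in current Python 3, where a negative multiplier is treated like 0. A minimal sketch; the names `seq`, `lengths` and `holds` are for illustration only.

```python
seq = [1, 2, 3, 4]

# Length of seq * n for a range of multipliers, negative ones included.
lengths = {n: len(seq * n) for n in range(-3, 4)}

# What actually holds today: the multiplier is clamped at zero.
holds = all(len(seq * n) == len(seq) * max(n, 0) for n in range(-3, 4))

print(lengths)
print(holds)
```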
sequence multiplied by -1
Hi,

I noticed that in python3k, multiplying a sequence by a negative integer is the same as multiplying it by 0, and the result is an empty sequence. It seems to me that there is a more meaningful semantics.

Simply put, a sequence multiplied by -1 can give a reversed sequence.

Then for any sequence "seq", and integer n>0, we can have "seq * -n" producing "(seq * -1) * n".

Any thoughts?

Yingjie
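The proposed semantics can be tried out today with a small helper. This is purely a sketch: `rev_repeat` is a hypothetical function written for this illustration, not an existing Python feature.

```python
def rev_repeat(seq, n):
    """Repeat seq n times; for negative n, reverse it first (hypothetical)."""
    if n >= 0:
        return seq * n
    return seq[::-1] * -n

current = [1, 2, 3, 4] * -1               # today's behavior: empty list
proposed = rev_repeat([1, 2, 3, 4], -2)   # the behavior suggested above

print(current)
print(proposed)
```

Because it relies only on slicing and repetition, the helper should behave the same way for tuples and strings.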
Re: 3>0 is True
> From: Jon Siddle
> Subject: Re: 3>0 is True
> To: python-list@python.org
> Date: Wednesday, September 15, 2010, 5:04 PM
>
> As others have said, it's not a matter of precedence. Using the
> compiler module you can see how python actually parses this:
>
> 3 > (0 is True)
> Compare(Const(3), [('>', Compare(Const(0), [('is', Name('True'))]))])
>
> No great surprise there.
>
> 3 > 0 is True
> Compare(Const(3), [('>', Const(0)), ('is', Name('True'))])
>
> As you can see, it's not the same. Two comparisons are being done
> "at once", not one comparison on the result of another.
>
> Hope this helps

Thank you all for nailing down this itching issue for me! All I can say is: Wow!

You all have a terribly nice day!

Yingjie
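The same check can be done in Python 3 with the ast module instead of the old compiler module quoted above (that substitution is mine, not from the thread). In this sketch, variables stand in for the literals 3, 0 and True so that the comparison is written between names.

```python
import ast

# Parse a chained comparison and list its operators: a single Compare
# node carries both of them, rather than one comparison nested in another.
tree = ast.parse('a > b is c', mode='eval')
compare = tree.body
op_names = [type(op).__name__ for op in compare.ops]

a, b, c = 3, 0, True
chained = a > b is c                 # evaluated as a chain
expanded = (a > b) and (b is c)      # what the chain means

print(op_names)
print(chained, expanded)
```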
3>0 is True
Hi,

I am not sure how to interpret this, in the interactive mode:

>>> 3>0 is True
False
>>> (3>0) is True
True
>>> 3> (0 is True)
True

Why did I get the first 'False'? I'm a little confused.

Thanks in advance to anybody who sheds some light on this.

YL
empty set and empty dict for Python 3
Hi there, Maybe somebody already suggested this: How about "{:}" for the empty dict, so that "{}" can denote the empty set? Yingjie -- http://mail.python.org/mailman/listinfo/python-list
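For context, this is the current situation the suggestion wants to change: "{}" is the empty dict, and the empty set has no literal of its own. A minimal sketch, assuming Python 3:

```python
empty_dict = {}        # braces with nothing inside make a dict
empty_set = set()      # the empty set has to be spelled with set()
one_item_set = {1}     # non-empty set literals do use braces

print(type(empty_dict).__name__)
print(type(empty_set).__name__)
print(type(one_item_set).__name__)
```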
ANN: expy 0.6.7 released!
EXPY is an express way to extend Python!

EXPY provides a way to extend python in an elegant way. For more information and a tutorial, see: http://expy.sourceforge.net/

I'm glad to announce a new release again today. ^_^

What's new:

Version 0.6.7
1. Now functions can have 'value on failure' via rawtype.
2. Now property getters/setters can have @throws
3. Bug fix: if __init__ wrapper fails, it must return -1.

Cheers,
Yingjie
Re: ANN: expy 0.6.6 released!
> Subject: ANN: expy 0.6.6 released! > To: "python list" > Cc: "CAPI Python" > Date: Monday, May 3, 2010, 3:24 AM > EXPY is an express way to extend Python! > > EXPY provides a way to extend python in an elegant way. For > more information and a tutorial, see: http://expy.sourceforge.net/ > I'm using expy in a serious project to wrap an old project written in C and deliver it up via www with django. That is why expy is getting improved quickly these days. So far, both the project and expy are making good progress hand in hand. Cheers, Yingjie -- http://mail.python.org/mailman/listinfo/python-list