Re: Python Go
Terry Reedy wrote:
> It seems to me that generators are already 'channels' that connect the calling code to the __next__ method, a semi-coroutine based on the body of the generator function. At present, the next method waits until an object is requested. Then it goes into action, yields an object, and rests again. For parallel operations, we need eager, anticipatory evaluation that produces things that *will* be needed rather than lazy evaluation of things that *are* needed and whose absence is holding up everything else.

Yes, generators look very much like channels. The obvious thing, from where I'm sitting, is to have a function called channel that takes an iterator, runs it in a different thread/process/goroutine, and returns an iterator that reads from the channel. A single-threaded version would look very much like iter, so let's use iter to get a working example:

    #!/usr/bin/python2 -u
    channel = iter # placeholder for missing feature

    def generate():
        i = 2
        while True:
            yield i
            i += 1

    def filter(input, prime):
        for i in input:
            if i%prime != 0:
                yield i

    ch = channel(generate())
    try:
        while True:
            prime = ch.next()
            print prime
            ch = channel(filter(ch, prime))
    except IOError:
        pass

That works fine in a single thread. It's close to the original Go example, hence the evil shadowing of a builtin. I don't think the channel function would present any problems given an appropriate library to wrap. I got something like this working with Jython and the E language but, as I recall, had an accident and lost the code. If somebody wants to implement it using multiprocessing, go to it!

Graham
--
http://mail.python.org/mailman/listinfo/python-list
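For what it's worth, here is a minimal sketch of what a threaded channel might look like in Python 3. It is not the lost Jython/E code and not a multiprocessing version; it assumes a plain queue.Queue fed by a daemon thread, and the names _DONE, sieve_filter and primes are made up for the illustration. An exception inside the wrapped iterator would leave the reader blocked, so treat it as a toy.

```python
import itertools
import queue
import threading

_DONE = object()  # hypothetical sentinel marking the end of the wrapped iterator


def channel(iterable, maxsize=1):
    """Run `iterable` in a background thread and return an iterator over its items."""
    q = queue.Queue(maxsize)

    def produce():
        # Eagerly pull items from the source; put() blocks while the queue is full.
        for item in iterable:
            q.put(item)
        q.put(_DONE)

    threading.Thread(target=produce, daemon=True).start()

    def consume():
        while True:
            item = q.get()
            if item is _DONE:
                return
            yield item

    return consume()


def generate():
    i = 2
    while True:
        yield i
        i += 1


def sieve_filter(source, prime):
    # Renamed from filter() to avoid shadowing the builtin.
    for i in source:
        if i % prime != 0:
            yield i


def primes():
    ch = channel(generate())
    while True:
        prime = next(ch)
        yield prime
        # Each new filter stage reads the previous channel from its own thread.
        ch = channel(sieve_filter(ch, prime))


first_primes = list(itertools.islice(primes(), 10))
print(first_primes)
```

The bounded queue means each stage works at most one item ahead of its reader, which is roughly what an unbuffered Go channel gives you; raise maxsize for more eager evaluation.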
Re: The future of Python immutability
John Nagle wrote:
> In the beginning, strings, tuples, and numbers were immutable, and everything else was mutable. That was simple enough. But over time, Python has acquired more immutable types - immutable sets and immutable byte arrays. Each of these is a special case.

<snip>

> Immutability is interesting for threaded programs, because immutable objects can be shared without risk. Consider a programming model where objects shared between threads must be either immutable or synchronized in the sense that Java uses the term. Such programs are free of most race conditions, without much programmer effort to make them so.

Of course, tuples would still be a special case because they may contain mutable objects. You need to check they're immutable all the way down.

Nothing to do with threading, but it's also the cause of this weirdness:

http://bytes.com/topic/python/answers/752154-list-tuple

    a = ([1], 2)
    a[0] += [3]

succeeds, but raises an error.

Graham
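The "succeeds, but raises an error" behaviour is easy to check. A small sketch: the list inside the tuple is extended in place first, and only then does the assignment back into the tuple fail.

```python
a = ([1], 2)

try:
    a[0] += [3]          # list.__iadd__ runs first, then the tuple store fails
except TypeError as exc:
    error = exc

print(a)                 # the inner list has been modified anyway
print(type(error).__name__)
```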
Re: Regular Expression Help
Jean-Claude Neveu wrote:
> Hello,
>
> I was wondering if someone could tell me where I'm going wrong with my regular expression. I'm trying to write a regexp that identifies whether a string contains a correctly-formatted currency amount. I want to support dollars, UK pounds and Euros, but the example below deliberately omits Euros in case the Euro symbol gets mangled anywhere in email or listserver processing. I also want people to be able to omit the currency symbol if they wish.

If Euro symbols can get mangled, so can Pound signs. They're both outside ASCII.

> My regexp that I'm matching against is: "^\$\£?\d{0,10}(\.\d{2})?$"
>
> Here's how I think it should work (but clearly I'm wrong, because it does not actually work):
>
> ^\$\£? Require zero or one instance of $ or £ at the start of the string.

^[$£]? is correct. And, as you're using re.match, the ^ is superfluous. (A previous message suggested ^[\$£]? which will also work. You generally need to escape a Dollar sign but not here.)

You should also think about the encoding. In my terminal, "£" is identical to '\xc2\xa3'. That is, two bytes for a UTF-8 code point. If you assume this encoding, it's best to make it explicit. And if you don't assume a specific encoding it's best to convert to unicode to do the comparisons, so for 2.x (or portability) your string should start u"

> d{0,10} Next, require between zero and ten alpha characters.

There's a backslash missing, but not from your original expression. Digits are not alpha characters.

> (\.\d{2})? Optionally, two characters can follow. They must be preceded by a decimal point.

That works. Of course, \d{2} is longer than the simpler \d\d

Note that you can comment the original expression like this:

    rex = u"""(?x)
        ^[$£]?      # Zero or one instance of $ or £
                    # at the start of the string.
        \d{0,10}    # Between zero and ten digits
        (\.\d{2})?  # Optionally, two digits.
                    # They must be preceded by a decimal point.
        $           # End of line
    """

Then anybody (including you) who comes to read this in the future will have some idea what you were trying to do.

> Examples of acceptable input should be:
>
> $12.42
> $12
> £12.42
> $12,482.96 (now I think about it, I have not catered for this in my regexp)

Yes, you need to think about that.

Graham
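As a rough check of the commented expression, here is a sketch for Python 3, where str is already unicode so the u prefix and the encoding worry go away. The sample strings come from the question, plus two extra ones (a bare amount and a doubled symbol) added as assumptions to show the edge cases.

```python
import re

CURRENCY = re.compile(r"""
    ^[$£]?      # zero or one instance of $ or £ at the start of the string
    \d{0,10}    # between zero and ten digits
    (\.\d{2})?  # optionally, a decimal point followed by two digits
    $           # end of line
""", re.VERBOSE)

samples = ["$12.42", "$12", "£12.42", "12.42", "$12,482.96", "$£12"]
results = {s: bool(CURRENCY.match(s)) for s in samples}

for sample, ok in results.items():
    print(sample, ok)
```

The thousands separator is still rejected, as the original poster suspected, and because every part of the pattern is optional the empty string would match as well.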
Re: Correct URL encoding
mattia wrote:
> I'm using urlopen in order to download some web pages. I've always to replace some characters that are in the url, so I've come up with:
>
> url.replace("|", "%7C").replace("/", "%2F").replace(" ", "+").replace(":", "%3A")
>
> There isn't a better way of doing this?

Yeah, shame there's no function -- called "urlencode" say -- that does it all for you.

Graham
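For anyone reading this later: in Python 3 the functions in question live in urllib.parse. A short sketch, with a made-up sample string and query parameters:

```python
from urllib.parse import quote_plus, urlencode

raw = "a|b/c d:e"                    # hypothetical value containing the awkward characters

encoded = quote_plus(raw)            # escapes one value, spaces become "+"
query = urlencode({"q": raw, "page": 2})   # builds a whole query string

print(encoded)
print(query)
```

quote_plus is the one to use for a single value; urlencode takes a mapping (or a sequence of pairs) and joins the escaped pairs with "&".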
Re: Get bound method by name
Johannes Bauer wrote:
> Hello group,
>
> I'm looking for a Python function but have forgotten its name. Essentially what I want is:
>
> class Foo():
>     def bar(self):
>         pass
>
> x = Foo()
> y = x.MAGIC("bar")
> print(y)
> <bound method Foo.bar of <__main__.Foo instance at 0xb7e11fcc>>
>
> So the question is: How is the magic function called which returns me the bound method of a class instance by its name? I know there was a way but just can't remember...

y = getattr(x, "bar")
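A slightly fuller sketch of the same thing, with a return value added to bar so there is something to look at. The third argument to getattr is a default returned when the attribute doesn't exist; the attribute name no_such_method is made up for the example.

```python
class Foo:
    def bar(self):
        return "bar called"


x = Foo()

y = getattr(x, "bar")                         # bound method, looked up by name
result = y()

missing = getattr(x, "no_such_method", None)  # default instead of AttributeError

print(y)
print(result, missing)
```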
Re: Is using range() in for loops really Pythonic?
George Sakkis wrote:
> If you push this logic too far, you should del every name immediately after the last statement it is used in the scope. I would generally find less readable some code spread with del every few lines, micro-managing the effective scope of each name. YMMV.

Yes, but ... how about

    for i in range(10):
        del i
        do stuff

? It makes it clear you aren't using the index and ensures you get a run-time error if you clobbered an existing variable.

Graham
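A sketch of the run-time error in question, using a made-up function in which the loop index clobbers a name that is still needed afterwards:

```python
def count_items(items):
    i = "important value"       # an existing name the loop is about to clobber
    total = 0
    for i in range(len(items)):
        del i                   # the index is unused, so drop it straight away
        total += 1
    return total, i             # fails here: i no longer exists


try:
    count_items(["a", "b", "c"])
except NameError as exc:        # UnboundLocalError is a subclass of NameError
    caught = type(exc).__name__

print(caught)
```

Without the del, the function would quietly return the last loop index in place of the original value.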
Re: Python 2.5 adoption
On Apr 19, 3:16 am, Joseph Turian wrote:
> Basically, we're planning on releasing it as open-source, and don't want to alienate a large percentage of potential users.

How about Java users? Jython was recently at 2.2 (still is for all I know). I'm pleased they've got that far because I like to know that my code can run under Java and I like generators.

My web host uses 1.5.2. That is painful.

If you're assuming your potential users already have 2.4 then the chances are they'll have upgraded to 2.5 by the time you've finished anyway.

Graham
Re: Rounding a number to nearest even
On Apr 11, 6:14 pm, bdsatish wrote:
> The built-in function round( ) will always round up, that is 1.5 is rounded to 2.0 and 2.5 is rounded to 3.0.
>
> If I want to round to the nearest even, that is
>
> my_round(1.5) = 2 # As expected
> my_round(2.5) = 2 # Not 3, which is an odd num

If you care about such details, you may be better off using decimals instead of floats.

> I'm interested in rounding numbers of the form x.5 depending upon whether x is odd or even. Any idea about how to implement it ?

    import decimal
    decimal.Decimal("1.5").to_integral(rounding=decimal.ROUND_HALF_EVEN)
    decimal.Decimal("2.5").to_integral(rounding=decimal.ROUND_HALF_EVEN)

ROUND_HALF_EVEN is the default, but maybe that can be changed, so explicit is safest.

If you really insist,

    import decimal
    def my_round(f):
        d = decimal.Decimal(str(f))
        rounded = d.to_integral(rounding=decimal.ROUND_HALF_EVEN)
        return int(rounded)

Graham
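The same function as a runnable sketch for Python 3, with a few sample inputs chosen for the illustration. Note that in Python 3 the built-in round() already rounds halves to the nearest even integer, so the complaint in the question applies to Python 2 only.

```python
import decimal


def my_round(f):
    # Going through str() avoids picking up the float's binary representation error.
    d = decimal.Decimal(str(f))
    return int(d.to_integral_value(rounding=decimal.ROUND_HALF_EVEN))


rounded = [my_round(x) for x in (0.5, 1.5, 2.5, 3.5, -2.5)]
builtin = round(2.5)   # Python 3 behaviour: also rounds half to even

print(rounded, builtin)
```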
Re: Python's only one way to do it philosophy isn't good?
Dennis Lee Bieber wrote:
> But if these macros are supposed to allow one to sort of extend Python syntax, are you really going to code things like
>
> macrolib1.keyword
>
> everywhere?

I don't see why that *shouldn't* work. Or "from macrolib1 import keyword as foo". And to be truly Pythonic the keywords would have to be scoped like normal Python variables. One problem is that such a system wouldn't be able to redefine existing keywords.

Let's wait for a concrete proposal before delving into this rats' cauldron any further.

Graham
Re: Python's only one way to do it philosophy isn't good?
Douglas Alan wrote:
> Graham Breed writes:
>> Another way is to decorate functions with their local variables:
>>
>> >>> from strict import my
>> >>> @my("item")
>> ... def f(x=1, y=2.5, z=[1,2,4]):
>> ...     x = float(x)
>> ...     w = float(y)
>> ...     return [item+x-y for item in z]
>
> Well, I suppose that's a bit better than the previous suggestion, but (1) it breaks the style rule of not declaring variables until you need them, and (2) it doesn't catch double initialization.

(1) is a style rule that many style guides explicitly violate. What is (2) and why would it be a problem?

> A better way that I think is fine syntactically would be
>
>     from strict import norebind, set
>     @norebind
>     def f(x=1, y=2.5, z=[1,2,4]):
>         set(x=float(x))
>         set(w=float(y))
>         return [item+x-y for item in z]

It won't work because the Python semantics don't allow a function to alter a nested namespace. Or for a decorator to get at the locals of the function it's decorating. It's an example of Python restricting flexibility, certainly.

>> The best way to catch false rebindings is to stick a comment with the word "rebound" after every statement where you think you're rebinding a variable.
>
> No, the best way to catch false rebindings is to have the computers catch such errors for you. That's what you pay them for.

How does the computer know which rebindings are false unless you tell it?

>> Then you can search your code for cases where there's a "rebound" comment but no rebinding.
>
> And how do I easily do that? And how do I know if I even need to in the face of sometimes subtle bugs?

In UNIX, you do it by putting this line in a batch file:

    egrep -H 'rebound' $* | egrep -v '^[^:]+:[[:space:]]*([.[:alnum:]]+) [[:space:]]*=(|.*[^.])\\1\'

You don't know you need to do it, of course. Like you wouldn't know you needed to use the let and set macros if that were possible. Automated checks are only useful for problems you know you might have.

>> Assuming you're the kind of person who knows that false rebindings can lead to perplexing bugs, but doesn't check apparent rebindings in a paranoid way every time a perplexing bug comes up, anyway. (They aren't that common in modern python code, after all.)
>
> They're not that uncommon, either.

The 300-odd line file I happened to have open had no examples of the form x = f(x). There was one rebinding of an argument, such as:

    if something is None:
        something = default_value

but that's not the case you were worried about. If you've decided it does worry you after all there may be a decorator/function pattern that can check that no new variables have been declared up to a certain point.

I also checked a 400-odd line file which has one rebinding that the search caught. And also this line:

    m, n = n, m%n

which isn't of the form I was searching for. Neither would the set() solution above be valid, or the substitution below. I'm sure it can be done with regular expressions, but they'd get complicated. The best way would be to use a parser, but unfortunately I don't understand the current Python grammar for assignments. I'd certainly be interested to see how your proposed macros would handle this kind of thing.

This is important because the Python syntax is complicated enough that you have to be careful playing around with it. Getting macros to work the way you want with results acceptable to the general community looks like a huge viper pit to me. That may be why you're being so vague about the implementation, and why no macro advocates have managed to get a PEP together. A preprocessor that can read in modified Python syntax and output some form of real Python might do what you want. It's something you could work on as a third-party extension and it should be able to do anything macros can.

That aside, the short code sample I give below does have a rebinding of exactly the form you were worried about. It's still idiomatic for text substitutions and so code with a lot of text substitutions will likely have a lot of rebindings. You could give each substituted text a different name. I think that makes some sense because if you're changing the text you should give it a name to reflect the changes. But it's still error prone: you might use the wrong (valid) name subsequently. Better is to check for unused variables.

> I've certainly had it happen to me on several occasions, and sometimes they've been hard to find as I might not even see the mispeling even if I read the code 20 times.

With vim, all you have to do is go to the relevant line and type ^* to check that the two names are really the same. I see you use Emacs but I'm sure that has an equivalent.

> (Like the time I spent all day trying to figure out why my assembly code wasn't working when I was a student and finally I decided to ask the TA for help, and while talking him through my code so that he could tell me what I was doing wrong, I finally noticed the rO where there was supposed
Re: Python's only one way to do it philosophy isn't good?
Steven D'Aprano wrote:
> But if you really want declarations, you can have them.
>
> >>> import variables
> >>> variables.declare(x=1, y=2.5, z=[1, 2, 4])
> >>> variables.x = None
> >>> variables.w = 0
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "variables.py", line 15, in __setattr__
>     raise self.DeclarationError("Variable '%s' not declared" % name)
> variables.DeclarationError: Variable 'w' not declared

Another way is to decorate functions with their local variables:

    >>> from strict import my
    >>> @my("item")
    ... def f(x=1, y=2.5, z=[1,2,4]):
    ...     x = float(x)
    ...     w = float(y)
    ...     return [item+x-y for item in z]
    ...
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
      File "strict.py", line 11, in dec
        raise DeclarationError("No slot for %s"%varname)
    strict.DeclarationError: No slot for w

and the implementation

    import re

    class DeclarationError(TypeError):
        pass

    def my(slots=""):
        tokens = slots.split()
        def dec(func):
            code = func.func_code
            for varname in code.co_varnames[code.co_argcount:]:
                if re.match('\w+$', varname) and varname not in tokens:
                    raise DeclarationError("No slot for %s"%varname)
            return func
        return dec

The best way to catch false rebindings is to stick a comment with the word "rebound" after every statement where you think you're rebinding a variable. Then you can search your code for cases where there's a "rebound" comment but no rebinding. Assuming you're the kind of person who knows that false rebindings can lead to perplexing bugs, but doesn't check apparent rebindings in a paranoid way every time a perplexing bug comes up, anyway. (They aren't that common in modern python code, after all.) And that you remembered to add the comments (like you would have remembered the let and set). And you're also the kind of person who's troubled by perplexing bugs but doesn't run a fully fledged lint. Maybe that's the kind of person who wouldn't put up with anything short of a macro as in the original proposal. All I know is that it's the kind of person I don't want to second guess.

Graham
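The decorator above is Python 2 code (func_code). A sketch of how it might be ported to Python 3 follows, assuming nothing beyond the switch to func.__code__ and a raw string for the pattern. The test functions ok and bad are made up, and avoid list comprehensions because how comprehension variables show up in co_varnames differs between Python versions. Keyword-only arguments are not handled.

```python
import re


class DeclarationError(TypeError):
    pass


def my(slots=""):
    """Reject functions that use local variables not listed in `slots`."""
    tokens = slots.split()

    def dec(func):
        code = func.__code__          # func.func_code in Python 2
        # co_varnames lists the positional arguments first, then the other locals.
        for varname in code.co_varnames[code.co_argcount:]:
            if re.match(r'\w+$', varname) and varname not in tokens:
                raise DeclarationError("No slot for %s" % varname)
        return func

    return dec


@my("total")
def ok(x=1, y=2.5):
    total = x + y
    return total


try:
    @my("total")
    def bad(x=1, y=2.5):
        w = float(y)                  # w was never declared
        return w
except DeclarationError as exc:
    message = str(exc)

print(ok(), message)
```

The check runs once, at decoration time, so there is no cost per call.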
Re: Trivial string substitution/parser
Duncan Booth wrote:
> If you must insist on using backslash escapes (which introduces the question of how you get backslashes into the output: do they have to be escaped as well?) then use string.Template with a custom pattern.

If anybody wants this, I worked out the following regular expression which seems to work:

    (?P<escaped>\\)\$ |                   # backslash escape pattern
    \$(?:
      (?P<named>[_a-z][_a-z0-9]*)|        # delimiter and Python identifier
      {(?P<braced>[_a-z][_a-z0-9]*)} |    # delimiter and braced identifier
      (?P<invalid>)                       # Other ill-formed delimiter exprs
    )

The clue is string.Template.pattern.pattern

So you compile that with verbose and case-insensitive flags and set it to "pattern" in a string.Template subclass. (In fact you don't have to compile it, but that behaviour's undocumented.) Something like

    >>> import re
    >>> from string import Template
    >>> regexp = r"""
    ... (?P<escaped>\\)\$ |                 # backslash escape pattern
    ... \$(?:
    ...   (?P<named>[_a-z][_a-z0-9]*)|      # delimiter and identifier
    ...   {(?P<braced>[_a-z][_a-z0-9]*)} |  # ... and braced identifier
    ...   (?P<invalid>)                     # Other ill-formed delimiter exprs
    ... )
    ... """
    >>> class BackslashEscape(Template):
    ...     pattern = re.compile(regexp, re.I | re.X)
    ...

Graham
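A sketch of the same subclass for Python 3. One assumption to flag: here the pattern is assigned as a plain string rather than a compiled object, because Python 3's string.Template compiles the class attribute itself (adding re.VERBOSE and, by default, re.IGNORECASE) and re.compile refuses flags on an already compiled pattern. The sample template text is made up.

```python
from string import Template


class BackslashEscape(Template):
    # Left uncompiled on purpose: Template compiles it with its own flags.
    pattern = r"""
        (?P<escaped>\\)\$ |                  # backslash-escaped delimiter
        \$(?:
          (?P<named>[_a-z][_a-z0-9]*) |      # delimiter and identifier
          {(?P<braced>[_a-z][_a-z0-9]*)} |   # delimiter and braced identifier
          (?P<invalid>)                      # other ill-formed delimiter exprs
        )
    """


template = BackslashEscape(r"$variable1 is substituted, \$variable2 is not.")
result = template.substitute(variable1="variable 1")

print(result)
```

Template replaces anything matched by the "escaped" group with a bare delimiter, so the backslash disappears from the output and a literal $ is left behind.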
Re: Trivial string substitution/parser
Samuel wrote:
> Thanks, however, turns out my specification of the problem was incomplete: In addition, the variable names are not known at compilation time. I just did it that way, this looks fairly easy already:
>
> ---
> import re
>
> def variable_sub_cb(match):
>     prepend = match.group(1)
>     varname = match.group(2)
>     value = get_variable(varname)
>     return prepend + value
>
> string_re = re.compile(r'(^|[^\\])\$([a-z][\w_]+\b)', re.I)
>
> input = r'In this string $variable1 is substituted,'
> input += 'while \$variable2 is not.'
>
> print string_re.sub(variable_sub_cb, input)
> ---

It gets easier:

    import re

    def variable_sub_cb(match):
        return get_variable(match.group(1))

    string_re = re.compile(r'(?<!\\)\$([A-Za-z]\w+)')

    def get_variable(varname):
        return globals()[varname]

    variable1 = 'variable 1'

    input = r'In this string $variable1 is substituted,'
    input += 'while \$variable2 is not.'

    print string_re.sub(variable_sub_cb, input)

or even

    import re

    def variable_sub_cb(match):
        return globals()[match.group(1)]

    variable1 = 'variable 1'
    input = (r'In this string $variable1 is substituted,'
             'while \$variable2 is not.')

    print re.sub(r'(?<!\\)\$([A-Za-z]\w+)', variable_sub_cb, input)

Graham
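A Python 3 sketch of the lookbehind version, assuming the variables come from an ordinary dict rather than globals(). The substitute helper and the sample mapping are made up for the illustration.

```python
import re

# A "$" not preceded by a backslash, followed by a name of two or more characters.
VARIABLE_RE = re.compile(r'(?<!\\)\$([A-Za-z]\w+)')


def substitute(text, variables):
    """Replace each unescaped $name in `text` with variables[name]."""
    return VARIABLE_RE.sub(lambda match: variables[match.group(1)], text)


text = r'In this string $variable1 is substituted, while \$variable2 is not.'
result = substitute(text, {'variable1': 'variable 1'})

print(result)
```

The lookbehind consumes nothing, so there is no prefix to glue back on as in the two-group version. The backslash itself stays in the output.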
Re: Trivial string substitution/parser
Duncan Booth wrote:
> Also, of course, vars just needs to be something which quacks like a dict: it can do whatever it needs to do such as looking up a database or querying a server to generate the value only when it needs it, or even evaluating the name as an expression; in the OP's case it could call get_variable.

And in case that sounds difficult, the code is

    class VariableGetter:
        def __getitem__(self, key):
            return get_variable(key)

> Anyway, the question seems to be moot since the OP's definition of 'elegant and lazy' includes regular expressions and reinvented wheels.

Your suggestion of subclassing string.Template will also require a regular expression -- and a fairly hairy one as far as I can work out from the documentation. There isn't an example and I don't think it's the easiest way of solving this problem. But if Samuel really wants backslash escaping it'd be easier to do a replace('$$','') and replace('\\$', '$$') (or replace('\\$','\\$$') if he really wants the backslash to persist) before using the template.

Then, if he really does want to reject single letter variable names, or names beginning with a backslash, he'll still need to subclass Template and supply a regular expression, but a simpler one.

> ... and in another message Graham Breed wrote:
>> def get_variable(varname):
>>     return globals()[varname]
>
> Doesn't the mere thought of creating global variables with unknown names make you shudder?

Not at all. It works, it's what the shell does, and it's easy to test interactively. Obviously the application code wouldn't look like that.

Graham
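To show VariableGetter doing its job with a stock string.Template, here is a small sketch. The get_variable below is a stand-in backed by a dict, since the real one was never posted.

```python
from string import Template

_VALUES = {"variable1": "variable 1"}   # hypothetical backing store


def get_variable(name):
    # Stand-in for the OP's function; could equally query a database or a server.
    return _VALUES[name]


class VariableGetter:
    def __getitem__(self, key):
        return get_variable(key)


template = Template("In this string $variable1 is substituted.")
result = template.substitute(VariableGetter())

print(result)
```

Each name is looked up only when the template actually asks for it.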
Re: Pyrex speed
Jim Lewis wrote:
> I'm not planning to write C functions. My understanding is that by using cdefs in the python code one can gain substantial speed. I'm trying to find a description of how to modify python code in more detail so it runs fast under pyrex.

I've used pyrex to speed up my code. It worked. While it isn't intended as a tutorial on pyrex you can have a look at it here:

http://www.microtonal.co.uk/temper.html

The trick is to write C functions using pyrex. That's not much easier than writing C functions in C. But I still found it convenient enough to be worth doing that way. Some tips:

- declare functions with cdef
- declare the type of every variable you use
- don't use Python builtins, or other libraries

The point of these rules is that generated C code using Python variables will still be slow. You want Pyrex to write C code using C variables only. To check this is happening you can look at the automatically generated source code to make sure there are no reference counting functions where there shouldn't be.

The usual rule for C optimization applies -- rewrite the code that you're spending most time in. But if that innermost function's being called from a loop it can be worth changing the loop as well so that you pass in and out C variables.

HTH,
Graham