Re: When are immutable tuples *essential*? Why can't you just use lists *everywhere* instead?
On Apr 20, 4:37 pm, John Machin [EMAIL PROTECTED] wrote:
> One inessential but very useful thing about tuples when you have a lot of them is that they are allocated the minimum possible amount of memory. OTOH lists are created with some slack so that appending etc can avoid taking quadratic time.

Speaking of inessential but very useful things, I'm also a big fan of the tuple swap...

    a = 2
    b = 3
    (a, b) = (b, a)
    print a  # 3
    print b  # 2

As well as the simple return of multiple values from a single function:

    from popen2 import popen2  # Python 2 only
    c_stdout, c_stdin = popen2('ls')

IMO, the biggest thing going for tuples is the syntactic sugar they bring to Python. Doing either of these using lists or other data constructs would not be nearly as clean as they are with tuples.

-- http://mail.python.org/mailman/listinfo/python-list
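For anyone who wants to see the memory point from the quoted message in numbers, here is a minimal sketch. It assumes CPython (the sizes are an implementation detail, not a language guarantee) and is written for Python 3, which postdates this thread. The names `items` and `frozen` are just for illustration.

```python
import sys

# Build a list by repeated append, the case where CPython over-allocates
# so that later appends stay cheap.
items = []
for i in range(10):
    items.append(i)

# A tuple with the same elements is allocated at exactly the size it needs.
frozen = tuple(items)

list_size = sys.getsizeof(items)
tuple_size = sys.getsizeof(frozen)
print("list: ", list_size, "bytes")
print("tuple:", tuple_size, "bytes")
```

The exact byte counts vary by interpreter version and platform, so treat them as indicative only.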
Re: catching exceptions from an except: block
On Mar 7, 2:48 pm, Arnaud Delobelle [EMAIL PROTECTED] wrote:
> I'm not really thinking about this situation so let me clarify. Here is a simple concrete example, taking the following for the functions a,b,c I mention in my original post.
>
> - a=int
> - b=float
> - c=complex
> - x is a string
>
> This means I want to convert x to an int if possible, otherwise a float, otherwise a complex, otherwise raise CantDoIt.
>
> I can do:
>
>     for f in int, float, complex:
>         try:
>             return f(x)
>         except ValueError:
>             continue
>     raise CantDoIt
>
> But if the three things I want to do are not callable objects but chunks of code this method is awkward because you have to create functions simply in order to be able to loop over them (this is what I was talking about 'abusing loop constructs'). Besides I am not happy with the other two idioms I can think of.
>
> --
> Arnaud

Wouldn't it be easier to do:

    if isinstance(x, int):
        pass  # do something
    elif isinstance(x, float):
        pass  # do something
    elif isinstance(x, complex):
        pass  # do something
    else:
        raise CantDoIt

or,

    i = [int, float, complex]
    for f in i:
        if isinstance(x, f):
            return x
    else:
        raise CantDoIt
Re: catching exceptions from an except: block
On Mar 7, 3:04 pm, [EMAIL PROTECTED] wrote:
> On Mar 7, 2:48 pm, Arnaud Delobelle [EMAIL PROTECTED] wrote:
> > I'm not really thinking about this situation so let me clarify. Here is a simple concrete example, taking the following for the functions a,b,c I mention in my original post.
> >
> > - a=int
> > - b=float
> > - c=complex
> > - x is a string
> >
> > This means I want to convert x to an int if possible, otherwise a float, otherwise a complex, otherwise raise CantDoIt.
> >
> > I can do:
> >
> >     for f in int, float, complex:
> >         try:
> >             return f(x)
> >         except ValueError:
> >             continue
> >     raise CantDoIt
> >
> > But if the three things I want to do are not callable objects but chunks of code this method is awkward because you have to create functions simply in order to be able to loop over them (this is what I was talking about 'abusing loop constructs'). Besides I am not happy with the other two idioms I can think of.
> >
> > --
> > Arnaud
>
> Wouldn't it be easier to do:
>
>     if isinstance(x, int):
>         pass  # do something
>     elif isinstance(x, float):
>         pass  # do something
>     elif isinstance(x, complex):
>         pass  # do something
>     else:
>         raise CantDoIt
>
> or,
>
>     i = [int, float, complex]
>     for f in i:
>         if isinstance(x, f):
>             return x
>     else:
>         raise CantDoIt

I so missed the point of this. Not my day. Please ignore my post.
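For reference, Arnaud's loop needs to live inside a function before the `return` is legal. A minimal sketch follows, written for Python 3; the function name `convert`, the `converters` parameter, and the body of `CantDoIt` are assumptions made for illustration, since the thread only names the exception. It also shows why the isinstance idea misses the point: x is a string, so it is never an instance of int, float, or complex.

```python
class CantDoIt(Exception):
    """Raised when none of the converters accepts the input."""


def convert(x, converters=(int, float, complex)):
    # Try each converter in order and return the first result that works.
    for f in converters:
        try:
            return f(x)
        except ValueError:
            continue
    # Every converter rejected the string.
    raise CantDoIt(x)


print(convert("3"), convert("3.5"), convert("1+2j"))
print(isinstance("3", (int, float, complex)))  # the string itself is none of these
```

This still has the limitation Arnaud describes: each alternative has to be a callable, so arbitrary chunks of code would need to be wrapped in functions first.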
Re: Regex Speed
On Feb 21, 10:34 am, [EMAIL PROTECTED] wrote:
On Feb 20, 6:14 pm, Pop User [EMAIL PROTECTED] wrote:
http://swtch.com/~rsc/regexp/regexp1.html

Going back a bit on a tangent, the author of this citation states that any regex can be expressed as a DFA machine. However, while investigating this more I appear to have found one example of a regex which breaks this assumption.

    ab+c|abd

Am I correct? Can you think of a deterministic method of computing this expression? It would be easier with a NFA machine, but given that the Python method of computing RE's involves pre-compiling a re object, optimizing the matching engine would make the most sense to me.

Here's what I have so far:

    class State(object):
        def __init__(self):
            self.nextState = {}
            self.nextStateKeys = []
            self.prevState = None
            self.isMatchState = True

        def setNextState(self, chars, iNextState):
            self.nextState[chars] = iNextState
            self.nextStateKeys = self.nextState.keys()
            self.isMatchState = False

        def setPrevState(self, iPrevState):
            self.prevState = iPrevState

        def moveToNextState(self, testChar):
            if testChar in self.nextStateKeys:
                return self.nextState[testChar]
            else:
                return None

    class CompiledRegex(object):
        def __init__(self, startState):
            self.startState = startState

        def match(self, matchStr):
            match_set = []
            currentStates = [self.startState]
            nextStates = [self.startState]
            for character in matchStr:
                for state in currentStates:
                    nextState = state.moveToNextState(character)
                    if nextState is not None:
                        nextStates.append(nextState)
                        if nextState.isMatchState:
                            print "Match!"
                            return
                currentStates = nextStates
                nextStates = [self.startState]
            print "No Match!"

    def compile(regexStr):
        startState = State()
        currentState = startState
        backRefState = None
        lastChar = ""
        for character in regexStr:
            if character == "+":
                currentState.setNextState(lastChar, currentState)
            elif character == "|":
                currentState = startState
            elif character == "?":
                backRefState = currentState.prevState
            elif character == "(":
                # Implement (
                pass
            elif character == ")":
                # Implement )
                pass
            elif character == "*":
                currentState = currentState.prevState
                currentState.setNextState(lastChar, currentState)
            else:
                testRepeatState = currentState.moveToNextState(character)
                if testRepeatState is None:
                    newState = State()
                    newState.setPrevState(currentState)
                    currentState.setNextState(character, newState)
                    if backRefState is not None:
                        backRefState.setNextState(character, newState)
                        backRefState = None
                    currentState = newState
                else:
                    currentState = testRepeatState
            lastChar = character
        return CompiledRegex(startState)

    >>> a = compile("ab+c")
    >>> a.match("abc")
    Match!
    >>> a.match("abbc")
    Match!
    >>> a.match("ac")
    No Match!
    >>> a = compile("ab+c|abd")
    >>> a.match("abc")
    Match!
    >>> a.match("abbc")
    Match!
    >>> a.match("ac")
    No Match!
    >>> a.match("abd")
    Match!
    >>> a.match("abbd")
    Match!
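On the "Am I correct?" question: ab+c|abd does have a deterministic machine. After "ab" the machine sits in a state that accepts c or d; once a second b has been seen, it moves to a different state where only b or c is allowed. Below is a hand-derived sketch in Python 3. The table is my own working, not the output of the compile() function in the post, and it matches the whole string rather than a prefix. The names `TRANSITIONS`, `ACCEPTING`, and `dfa_match` are made up for this illustration.

```python
# Hand-built DFA for a full-string match of ab+c|abd.
# State 0: start, 1: seen "a", 2: seen "ab", 3: seen "abb...", 4: accept.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
    (2, "b"): 3,   # a second b rules out the "abd" branch
    (2, "c"): 4,
    (2, "d"): 4,   # "abd" is only reachable with exactly one b
    (3, "b"): 3,
    (3, "c"): 4,
}
ACCEPTING = {4}


def dfa_match(text):
    """Return True if text, taken as a whole, matches ab+c|abd."""
    state = 0
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition: dead state
            return False
    return state in ACCEPTING


for sample in ("abc", "abbc", "ac", "abd", "abbd"):
    print(sample, dfa_match(sample))
```

The difference from the posted session is the last case: "abbd" is rejected here, because the state reached after the second b has no transition on d.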
Re: Regex Speed
On Feb 20, 6:14 pm, Pop User [EMAIL PROTECTED] wrote:
> It's very hard to beat grep depending on the nature of the regex you are searching using. The regex engines in python/perl/php/ruby have traded the speed of grep/awk for the ability to do more complex searches.
>
> http://swtch.com/~rsc/regexp/regexp1.html

Some darned good reading. And it explains what happened fairly well. Thanks!

> > And python 2.5.2.
>
> 2.5.2? Who needs crystal balls when you've got a time machine? Or did you mean 2.5? Or 1.5.2 -- say it ain't so, Joe!

2.5. I'm not entirely sure where I got that extra 2. I blame Monday.

> In short... avoid using re as a sledgehammer against every problem.

I had a feeling that would be the case.
Re: Creating a daemon process in Python
On Feb 21, 9:33 am, Eirikur Hallgrimsson [EMAIL PROTECTED] wrote:
> Sakagami Hiroki wrote:
> > What is the easiest way to create a daemon process in Python?

I've found it even easier to use the built-in threading module:

    import time
    t1 = time.time()
    print "t_poc.py called at", t1

    import threading

    def im_a_thread():
        time.sleep(10)
        print "This is your thread speaking at", time.time()

    thread = threading.Thread(target=im_a_thread)
    thread.setDaemon(True)
    thread.start()

    t2 = time.time()
    print "Time elapsed in main thread:", t2 - t1

Of course, your mileage may vary.
Re: Creating a daemon process in Python
On Feb 21, 3:34 pm, Benjamin Niemann [EMAIL PROTECTED] wrote:
> That's not a daemon process (which are used to execute 'background services' in UNIX environments).

I had not tested this by running the script directly, and in writing a response, I found out that the entire interpreter closed when the main thread exited (killing the daemonic thread in the process). This is different behavior from running the script interactively, and thus my confusion.

Thanks!

~Garrick
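The behavior described above (the interpreter exits with the main thread and takes the daemonic thread with it) can be reproduced with a small sketch. It is written for Python 3.7 or later, so it uses the `daemon` attribute and `subprocess.run` rather than the Python 2 spellings in the thread. The child script held in `CHILD` is a made-up stand-in for t_poc.py, not the original file.

```python
import subprocess
import sys

# Child script: starts a daemonic thread, then lets the main thread finish.
CHILD = """
import threading, time

def worker():
    time.sleep(5)
    print("thread finished")

t = threading.Thread(target=worker)
t.daemon = True   # daemonic threads do not keep the interpreter alive
t.start()
print("main thread done")
"""

# Run the child as a separate, non-interactive interpreter and capture
# everything it prints.
result = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True,
    text=True,
    timeout=60,
)
output = result.stdout
print(output)
```

Only the main thread's line should show up, because the interpreter exits before the worker's sleep ends. None of this makes the script a UNIX daemon process in the sense Benjamin means; it only illustrates what a daemonic thread does.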
Regex Speed
While creating a log parser for fairly large logs, we have run into an issue where the processing time was unacceptable (upwards of 5 minutes for 1-2 million lines of logs). In contrast, the Linux tool grep completes the same search in a matter of seconds.

The search we used was a regex of 6 elements ORed together, with an exclusionary set of ~3 elements. Due to the size of the files, we decided to run these line by line, and because we need regular expressions, we could not use more traditional string find methods.

We did pre-compile the regular expressions, and attempted tricks such as map to remove as much overhead as possible.

Given the known limitations of not being able to slurp the entire log file into memory, and the need to use regular expressions, do you have any ideas on how we might speed this up without resorting to system calls (our current solution)?
Re: Regex Speed
On Feb 20, 4:15 pm, John Machin [EMAIL PROTECTED] wrote:
> What is an exclusionary set? It would help enormously if you were to tell us what the regex actually is. Feel free to obfuscate any proprietary constant strings, of course.

My apologies. I don't have specifics right now, but it's something along the line of this:

    import re

    error_list = re.compile(r"error|miss|issing|inval|nvalid|math")
    exclusion_list = re.compile(r"No Errors Found|Premature EOF, stopping translate")
    for test_text in test_file:
        if error_list.match(test_text) and not exclusion_list.match(test_text):
            pass  # Process test_text

Yes, I know, these are not re expressions, but the requirements for the script specified that the error list be capable of accepting regular expressions, since these lists are configurable.

> I presume you mean you didn't read the whole file into memory; correct? 2 million lines doesn't sound like much to me; what is the average line length and what is the spec for the machine you are running it on?

You are correct. The individual files can be anywhere from a few bytes to 2gig. The average is around one gig, and there are a number of files to be iterated over (an average of 4). I do not know the machine specs, though I can safely say it is a single core machine, sub 2.5ghz, with 2gigs of RAM running linux.

> map is a built-in function, not a trick. What tricks?

I'm using the term tricks where I may be obfuscating the code in an effort to make it run faster. In the case of map, getting rid of the interpreted for loop overhead in favor of the implied c loop offered by map.

> What system calls? Do you mean running grep as a subprocess?

Yes. While this may not seem evil in and of itself, we are trying to get our company to adopt Python into more widespread use. I'm guessing the limiting factor isn't python, but us python newbies missing an obvious way to speed up the process.

> To help you, we need either (a) basic information or (b) crystal balls. Is it possible for you to copy paste your code into a web browser or e-mail/news client? Telling us which version of Python you are running might be a good idea too.

Can't copy and paste code (corp policy and all that), no crystal balls for sale, though I hope the above information helps. Also, running a trace on the program indicated that python was spending a lot of time looping around lines, checking for each element of the expression in sequence.

And python 2.5.2.

Thanks!
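One detail worth checking in the snippet above: `match` only succeeds at the start of the string, whereas grep looks anywhere in the line, so `search` is probably what a grep replacement wants. A minimal Python 3 sketch of the line-by-line filter follows, under that assumption. The patterns are copied from the post; the names `ERROR_RE`, `EXCLUDE_RE`, and `interesting_lines`, and the sample lines, are invented for illustration. I make no claim about how much faster this runs on the real logs.

```python
import re

ERROR_RE = re.compile(r"error|miss|issing|inval|nvalid|math")
EXCLUDE_RE = re.compile(r"No Errors Found|Premature EOF, stopping translate")


def interesting_lines(lines):
    """Yield lines that contain an error pattern and no exclusion pattern."""
    # Bind the bound methods once so the loop avoids repeated attribute lookups.
    error_search = ERROR_RE.search
    exclude_search = EXCLUDE_RE.search
    for line in lines:
        if error_search(line) and not exclude_search(line):
            yield line


sample = [
    "disk error on sda\n",
    "No Errors Found\n",
    "all fine\n",
    "invalid token\n",
    "error: Premature EOF, stopping translate\n",
]
hits = list(interesting_lines(sample))
print(hits)
```

Because it is a generator, it can be fed an open file object directly and never holds more than one line in memory, which fits the no-slurping constraint.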
Re: output to console and to multiple files
On Feb 16, 3:28 pm, Gabriel Genellina [EMAIL PROTECTED] wrote:
> That's ok inside the same process, but the OP needs to use it from a subprocess or spawn. You have to use something like tee, working with real file handles.

I'm not particularly familiar with this, but it seems to me that if you're trying to catch stdout/stderr from a program you can call with (say) popen2, you could just read from the returned stdout/stderr pipe, and then write to a series of file handles (including sys.stdout).

Or am I missing something? =)

~G
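A rough sketch of the read-the-pipe-and-fan-out idea, for what it is worth. It uses the subprocess module in Python 3.7 or later instead of popen2 (which no longer exists in Python 3), and it merges stderr into stdout to keep the loop simple. The function name `tee_command` and the in-memory `log_a` / `log_b` sinks are hypothetical; real code would pass open file objects.

```python
import io
import subprocess
import sys


def tee_command(cmd, sinks):
    """Run cmd and copy each line of its output to every sink."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the same pipe
        text=True,
    )
    for line in proc.stdout:
        for sink in sinks:
            sink.write(line)
    proc.stdout.close()
    return proc.wait()


# Stand-ins for log files; anything with a write() method will do.
log_a, log_b = io.StringIO(), io.StringIO()
status = tee_command(
    [sys.executable, "-c", "print('hello')"],
    [sys.stdout, log_a, log_b],
)
```

Whether this covers the OP's case depends on whether the child needs a real file handle, which is Gabriel's point; this only handles output that arrives over the pipe.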
Re: threading and multicores, pros and cons
On Feb 13, 9:07 pm, Maric Michaud [EMAIL PROTECTED] wrote:
> I've heard of a bunch of arguments to defend python's choice of GIL, but I'm not quite sure of their technical background, nor what is really important and what is not. These discussions often end in a prudent "python has made a choice among others"... which is not really convincing.

Well, INAG (I'm not a Guru), but we recently had training from a Guru. When we brought up this question, his response was fairly simple. Paraphrased for inaccuracy:

Some time back, a group did remove the GIL from the python core, and implemented locks on the core code to make it threadsafe. Well, the problem was that while it worked, the necessary locks made single-threaded code take significantly longer to execute.

He then proceeded to show us how to achieve the same effect (multithreading python for use on multi-core computers) using popen2 and stdio pipes.

FWIW,
~G
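I don't have the Guru's code, so the following is only a guess at the general shape of the pipes approach: each worker is a separate interpreter with its own GIL, fed over stdin and read over stdout. It uses subprocess in Python 3.7 or later in place of popen2. The `WORKER` one-liner and the `sum_squares_parallel` name are invented for this sketch.

```python
import subprocess
import sys

# Worker: read whitespace-separated integers from stdin, print the sum of squares.
WORKER = "import sys; print(sum(int(tok) ** 2 for tok in sys.stdin.read().split()))"


def sum_squares_parallel(chunks):
    """Hand each chunk to its own interpreter and add up the partial sums."""
    procs = []
    for chunk in chunks:
        p = subprocess.Popen(
            [sys.executable, "-c", WORKER],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        p.stdin.write(" ".join(str(n) for n in chunk))
        p.stdin.close()  # EOF tells the worker to start computing
        procs.append(p)

    total = 0
    for p in procs:  # all workers are already running by now
        total += int(p.stdout.read())
        p.stdout.close()
        p.wait()
    return total


print(sum_squares_parallel([range(0, 5), range(5, 10)]))
```

All the workers are started before any result is read, so they can run on separate cores. The cost is that everything crossing the pipe has to be serialized to text and parsed again.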
Re: multi processes
On Feb 14, 7:53 am, amadain [EMAIL PROTECTED] wrote:
> Hi
> Here's a poser. I want to start a program 4 times at exactly the same time (emulating 4 separate users starting up the same program). I am using pexpect to run the program from 4 separate locations across the network. How do I start the programs running at exactly the same time? I want to time how long it takes each program to complete and to show if any of the program initiations failed. I also want to check for race conditions. The program that I am running is immaterial for this question - it could be mysql running queries on the same database for example. Using threading, you call start() to start each thread but if I call start on each instance in turn I am not starting simultaneously.
> A

Standard answers about starting anything at *exactly* the same time aside, I would expect that the easiest answer would be to have a fifth controlling program in communication with all four, which can then send a start message over sockets to each of the agents at the same time.

There are several programs out there which can already do this. One example, Grinder, is designed for this very use (creating concurrent users for a test). It's free, uses Jython as its scripting language, and even is capable of keeping track of your times for you. IMO, it's worth checking out.

http://grinder.sourceforge.net
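To show the "everyone waits for one start signal" idea on a single machine, here is a sketch that uses threading.Barrier (Python 3.2 or later) as the start signal. It is a stand-in only: it does not implement the socket-based controller described above, and threads in one process are not the same as four hosts across a network. The names `run_together`, `agent`, and `flaky` are hypothetical.

```python
import threading
import time


def run_together(task, count=4):
    """Release `count` agents at once; record start time, duration, success."""
    barrier = threading.Barrier(count)
    results = [None] * count

    def agent(index):
        barrier.wait()  # block until every agent has reached this line
        started = time.monotonic()
        try:
            task(index)
            ok = True
        except Exception:
            ok = False  # note the failure instead of losing it
        results[index] = (started, time.monotonic() - started, ok)

    threads = [threading.Thread(target=agent, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def flaky(index):
    # Made-up workload: agent 2 fails, the others do a little "work".
    if index == 2:
        raise RuntimeError("simulated startup failure")
    time.sleep(0.01)


for started, elapsed, ok in run_together(flaky):
    print("ok" if ok else "FAILED", round(elapsed, 4))
```

The start times will still differ by scheduling jitter, which is the "standard answer" caveat: close to simultaneous is achievable, exactly simultaneous is not.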
Re: division by 7 efficiently ???
On Feb 1, 8:25 pm, Krypto [EMAIL PROTECTED] wrote:
> The correct answer as told to me by a person is
>
>     (N>>3) + ((N-7*(N>>3))>>3)
>
> The above term always gives division by 7

Does anybody else notice that this breaks the spirit of the problem (regardless of its accuracy)? 'N-7' uses the subtraction operator, and is thus an invalid solution for the original question.

Build a recursive function, which uses two arbitrary numbers, say 1 and 100. Check each, times 7, and make sure that your target number, N, is between them. Increase or decrease your arbitrary numbers as appropriate. Now pick a random number between those two numbers, and check it. Figure out which two the answer is between, and then check a random number in that subset. Continue this, and you will drill down to the correct answer, by using only *, +, >, and <.

I'll bet money that since this was a programming interview, that it wasn't a check of your knowledge of obscure formulas, but rather a check of your lateral thinking and knowledge of programming.

~G
Re: division by 7 efficiently ???
On Feb 6, 4:54 pm, John Machin [EMAIL PROTECTED] wrote:
> Recursive? Bzzzt!

I would be happy to hear your alternative, which doesn't depend on language specific tricks. Thus far, all you have suggested is using an alternative form of the division function, which I would consider to be outside the spirit of the question (though I have been wrong many times before).

> Might it not be better to halve the interval at each iteration instead of calling a random number function? mid = (lo + hi) >> 1 looks permitted and cheap to me. Also you don't run the risk of it taking a very high number of iterations to get a result.

I had considered this, but to halve, you need to divide by 2. Using random, while potentially increasing the number of iterations, removes the dependency of language tricks and division.

> Did you notice the important word *efficiently* in line 1 of the spec? Even after ripping out recursion and random numbers, your proposed solution is still way off the pace.

Again, I look forward to reading your solution.

Respectfully,
G.
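For the record, here is a sketch of the interval-halving version John describes, in Python 3. It assumes N is a non-negative integer and that a right shift is allowed, which is exactly the point the two of us disagree on. The function name `div7` is made up. It uses only *, +, >> and comparisons, with no subtraction and no division operator.

```python
def div7(n):
    """Return n // 7 for n >= 0 using only *, +, >> and comparisons."""
    if n < 0:
        raise ValueError("n must be non-negative")

    # Find an upper bound by doubling: afterwards 7*lo <= n < 7*hi.
    lo, hi = 0, 1
    while 7 * hi <= n:
        hi = hi + hi

    # Halve the interval until lo and hi are adjacent.
    while lo + 1 < hi:
        mid = (lo + hi) >> 1
        if 7 * mid <= n:
            lo = mid
        else:
            hi = mid
    return lo


print([div7(n) for n in (0, 6, 7, 48, 49, 1000)])
```

Both loops run in a number of steps proportional to the bit length of the answer, so the iteration count stays small and predictable, unlike the random-probe version.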