[issue22900] decimal.Context Emin, Emax limits restrict functionality without adding benefits
New submission from Jure Erznožnik:

At some point since Python 2.7, the Emin and Emax members acquired more restrictive bounds: Emin cannot go above 0 and Emax cannot go below 0. I would argue against this logic: .prec specifies total precision, while .Emin and .Emax effectively limit the possible locations of the decimal point within that precision. Since they don't specify / enforce an EXACT position of the decimal point, what's the point of limiting them? Without the restriction, setting Emin = Emax = some positive number effectively restricts the number of decimal places to exactly that number, without the need for separate (and expensive) .quantize() calls. Removing this restriction would provide an option to use decimal as true fixed-point arithmetic.

components: Extension Modules
messages: 231374
nosy: Jure.Erznožnik
priority: normal
severity: normal
status: open
title: decimal.Context Emin, Emax limits restrict functionality without adding benefits
type: behavior
versions: Python 3.4

Python tracker <rep...@bugs.python.org>
http://bugs.python.org/issue22900
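A quick sketch of the behaviour being complained about, and the quantize() workaround the report wants to avoid (values here are just illustrative):

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN

# Current CPython rejects a positive Emin, enforcing Emin <= 0 <= Emax:
try:
    Context(Emin=2, Emax=2)
except ValueError as exc:
    print("rejected:", exc)

# Fixed-point behaviour therefore needs an explicit quantize() per result:
ctx = Context(prec=28, rounding=ROUND_HALF_EVEN)
cents = Decimal("0.01")
total = ctx.quantize(Decimal("3.14159") + Decimal("2.71828"), cents)
print(total)  # 5.86
```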
Fool Python class with imaginary members (serious guru stuff inside)
I'm trying to create a class that would lie to the user about the nature of a member: in some cases it is a simple value, and in other cases a class. Which one it is would depend on the access syntax, like so:

1. x = obj.member  # x becomes the simple value contained in member
2. x = obj.member.another_member  # x becomes the simple value contained in the first member's another_member

So the first access detects that we only need a simple value and returns that; the second sees that we need member as an object and returns that. Note that the simple type could be anything, from an int to a bitmap image.

I have determined that this is possible if I sacrifice the final member reference to a __call__ override using function-call syntax: x = obj.member(). The call syntax returns the simple value and the attribute syntax returns the object. It is also possible if I override the __xxxitem__ methods to simulate a dictionary. However, I would like to use true member-access syntax if possible. So, is it possible?
--
http://mail.python.org/mailman/listinfo/python-list
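One common workaround (a sketch only; all the class and attribute names below are invented) is to make the returned object *be* the simple value by subclassing the value's type, so plain access already behaves like the value while attribute access still works:

```python
class NodeInt(int):
    """An int that also exposes child attributes."""
    def __new__(cls, value, **children):
        self = super().__new__(cls, value)
        self._children = children
        return self
    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        try:
            return self._children[name]
        except KeyError:
            raise AttributeError(name)

class Obj:
    def __init__(self):
        self.member = NodeInt(5, another_member=7)

obj = Obj()
print(obj.member + 1)              # 6  - behaves like the plain value
print(obj.member.another_member)   # 7  - and still has members
```

This only works per concrete type (you would need one wrapper per "simple type"), which is why the __call__ trick is often the pragmatic answer.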
Re: Dictionary self lookup
Norberto,

While certainly useful, this kind of functionality contradicts the way today's string libraries work. What you are proposing isn't dict self-referencing, but rather strings referencing other external data (in this case, other strings from the same dict). When you write code like

    config = {'home': '/home/test'}
    config['user1'] = config['home'] + '/user1'

config['user1'] isn't stored in memory as config['home'] + '/user1', but as the concatenated string '/home/test/user1', composed of both those strings. The reference to the original composing strings is lost the moment the expression is evaluated for insertion into the dict. There's no compiler / interpreter that would do this any other way; at least none that I know of.

So the best suggestion would be to simply write an object that parses strings before returning them. In the string itself, you can have special blocks that tell your parser they are references to other objects. You can take good old DOS syntax for that: %variable%, or something more elaborate if % is used in your strings too much. Anyway, your code would then look like (one possible way):

    config = {'home': '/home/test'}
    config['user1'] = '%config[home]%' + '/user1'

or

    config = {'home': '/home/test', 'user1': '%config[home]%/user1'}

The parser would then just match %(something)% and replace it with the actual value found in the referenced variable. eval() can help you there. Maybe there's already something in Python's libraries that matches your need. But you'd better not expect this to be included in the language syntax; it's a pretty special case.

Jure
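A minimal version of such a parser, using a simplified %key% form of the syntax described above and no eval(), could look like this (function and key names are invented for the sketch):

```python
import re

def resolve(cfg, key, _depth=0):
    """Expand %other_key% references found in cfg[key], recursively."""
    if _depth > 10:
        raise ValueError("too deep; reference cycle?")
    return re.sub(r"%([^%]+)%",
                  lambda m: resolve(cfg, m.group(1), _depth + 1),
                  cfg[key])

config = {"home": "/home/test", "user1": "%home%/user1"}
print(resolve(config, "user1"))  # /home/test/user1
```

The depth counter is a cheap guard against two keys referencing each other.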
Re: Status of Python threading support (GIL removal)?
Look, guys, here's the thing: in the company I work at, we decided to rewrite our MRP system in Python. I was one of the main proponents, since Python is nicely cross platform and allows for quite rapid application development; the language and its built-in functions are simply great. The opposition was quite strong, especially since the owner cheered for the alternative: .NET.

So, recently I started writing a part of this new system in Python - a report generator, to be exact. Let's not go into existing offerings; they are insufficient for our needs. First I wrote a few tests: I wanted to know how the reporting engine would behave if I did this or that. One of the first tests was, naturally, threading. The reporting engine itself will have separate, semi-independent parts that can be threaded well, so I wanted to test that. The rest you know, if you read the two threads I started on this group. Now, the core of the new application is designed so that it can be clustered, so it's no problem if we just start multiple instances on one server, say one for each available core.

The other day, a coworker of mine said something like: "What?!? You've been using Python for two days and you already say it's got a major fault?" I kind of agreed with him, especially since this particular coworker has programmed strictly in Python for the last 6 months (and I haven't, due to other current affairs); there was no way my puny testing could reveal such a major drawback. As it turns out, I was right: I have programmed enough threading to have tried enough variations, all of which reveal the GIL - which I later confirmed through searching the web.

My purpose in developing the reporting engine in Python was twofold: learn Python as I go, and create a native solution which will work out-of-the-box for all systems we decide to support. Making the thing open source while I'm at it was a side-bonus.
However: since the testing revealed this, shall we say, problem, I am tempted to just use plain old C++ again. Furthermore, I was also not quite content with the speed of arithmetic processing in the Python engine. I created some simple aggregating objects that only performed two additions per pass; calling them 200K times took 4 seconds. This is another reason why I'm beginning to think C++ might be a better alternative. I must admit, had the GIL issue not popped up, I'd just have taken the threading benefits and forgotten about it. But with both things together, I'm thinking I need to rethink my strategy.

I may at some point decide that learning cross-platform C++ programming is worth a shot and just write a Python plugin for the code I produce. The final effect will be pretty much the same, only faster. Perhaps I will even manage to get close to Crystal Reports speed, though I highly doubt it. But in the end, my Python skill will suffer, and I still have an entire application (production support) to develop in it.

Thanks for all the information, and please don't flame each other. I already get the picture that the GIL is a hot subject.
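A micro-benchmark in the spirit of the one described above (the aggregator class here is invented; the original dbfpy-based code isn't reproduced):

```python
import timeit

# A toy aggregator doing two additions per pass, similar in spirit to the
# summary objects mentioned above (class and method names are made up):
setup = """
class SumCount:
    def __init__(self):
        self.total = 0.0
        self.count = 0
    def add(self, value):
        self.total += value      # addition no. 1
        self.count += 1          # addition no. 2

agg = SumCount()
"""

elapsed = timeit.timeit("agg.add(1.5)", setup=setup, number=200000)
print("200K calls took %.3fs" % elapsed)
```

Attribute lookups and method-call overhead dominate here, which is why such loops run far slower than the equivalent compiled code.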
Re: Newbie queue question
On Jun 21, 9:43 am, Чеширский Кот <p.ela...@gmail.com> wrote:
> 1. say me dbf files count?
> 2. why dbf ?

It was just a test; DBF was simply the most compatible format I could get between Python and the business application I work with, without using SQL servers and such. Otherwise it's of no consequence: the final application will have a separate input engine that will support multiple databases as input.

Jure
Re: Status of Python threading support (GIL removal)?
On Jun 21, 9:32 am, OdarR <olivier.da...@gmail.com> wrote:
> Do you think multiprocessing can help you seriously? Can you benefit
> from multiple CPUs? Did you try to enhance your code with numpy?
>
> Olivier (installed a backported multiprocessing on his 2.5.1 Python, but
> needs installation of Xcode first)

Multithreading / multiprocessing can help with my problem. As you know, database reading is typically I/O bound, so it helps to put it in a separate thread. I might not even notice the GIL if I used SQL access in the first place; as it is, dbfpy is pretty CPU intensive, since it's a pure-Python DBF implementation.

To continue: the second major stage (summary calculations) is completely CPU bound. Using numpy might or might not help with it. Those are simple calculations, mostly additions. I try not to put the entire database into arrays, to save memory, so I mostly just add to counters where I can. Some functions simply require arrays, but they are rarer, so I guess I'm safe there. You wouldn't believe how complex some reports can be. Threading + memory saving is a must, and even so I'll probably have to implement some sort of serialization later on, so that the stuff can run on more memory-constrained devices.

The third major stage, the rendering engine, is again mostly CPU bound, but it is I/O bound as well when outputting the result. All three major parts are more or less independent of each other and can run simultaneously, just with a bit of a delay: I can perform calculations while waiting for the next record, and I can start rendering immediately after I have all the data for the first group available. I may use multiprocessing, but I believe it introduces more communication overhead than threads, so I'm reluctant to go there. Threads were perfect; other stuff wasn't. To make things worse, no particular extension / fork / branch helps me here.
So if I wanted to do the stuff purely in Python, I'd have to move to Jython or IronPython and hope CPython eventually improves in this area. I do actually need CPython, since the other two aren't supported on all the platforms my company intends to support. The main issue I currently have with the GIL is that execution time is worse when I use threading. Had it been the same, I wouldn't worry too much about it; waiting for a permanent solution would be much easier then...
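A minimal multiprocessing pipeline along the lines of the stages described above might look like this (a sketch only; the stage functions, record layout, and dummy values are all invented):

```python
import multiprocessing as mp

def reader(out_q):
    # Stage 1 (I/O-bound): pretend to read records from a database
    for i in range(5):
        out_q.put({"id": i, "value": i * 1.5})
    out_q.put(None)  # sentinel: no more records

def summarize(in_q, result_q):
    # Stage 2 (CPU-bound): a simple summary calculation
    total = 0.0
    while True:
        rec = in_q.get()
        if rec is None:
            break
        total += rec["value"]
    result_q.put(total)

if __name__ == "__main__":
    records, results = mp.Queue(), mp.Queue()
    stages = [mp.Process(target=reader, args=(records,)),
              mp.Process(target=summarize, args=(records, results))]
    for p in stages:
        p.start()
    print(results.get())  # 15.0 for the five dummy records
    for p in stages:
        p.join()
```

The communication overhead mentioned above is real (every record is pickled through the queue), which is one reason batching records per queue operation usually pays off.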
Re: Status of Python threading support (GIL removal)?
Add: Carl, Olivier and co. - you guys know exactly what I wanted. Others: going back to C++ isn't what I had in mind when I started initial testing for my project.
Re: Newbie queue question
I've done some further testing on the subject. I added some calculations to the main loop to see what effect they would have on speed; of course, I also added the same calculations to the single-threaded functions. They were simple summary functions, like average, sum, etc. Almost no interaction with the buffers was added, just retrieval of a single field's value.

Single-threaded, the calculations added another 4.3 seconds to the processing time (~18%). Multi-threaded, they added 1.8 seconds, and CPU usage remained below 100% of one core at all times (which made me check the process affinity). I know the main thread uses way less CPU than the DBF-reading thread (4 secs vs 22 secs), so I figured adding these calculations should have had only a minimal impact on threaded execution time. Instead, the execution time increases!!!

I'm beginning to think that Python's memory management / function calls introduce quite a significant overhead for threading. I think I'll just write this program in one of the compilers today to verify just how stupid I've become.
Re: Newbie queue question
Digging further, I found this:
http://www.oreillynet.com/onlamp/blog/2005/10/does_python_have_a_concurrency.html

Looking up on that info, I found this:
http://docs.python.org/c-api/init.html#thread-state-and-the-global-interpreter-lock

If this is correct, no amount of threading will ever help in Python, since only one core / CPU can *by design* ever be utilized - except by code that accesses *no* Python functions / memory at all. This does seem a bit harsh, though. I'm now writing a simple test program to verify it: multiple data-independent threads, just so I can see whether more than one core can be utilized at all. :(
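That test program can be as small as this (a sketch; exact timings will vary by machine):

```python
import threading
import time

def spin(n, out, i):
    # Pure-Python CPU work; under the GIL only one such loop executes at a time
    total = 0
    for k in range(n):
        total += k
    out[i] = total

N = 2000000
out = [0, 0]
start = time.perf_counter()
threads = [threading.Thread(target=spin, args=(N, out, i)) for i in (0, 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("two CPU-bound threads: %.2fs" % (time.perf_counter() - start))
# On CPython, overall CPU usage stays near a single core while this runs,
# and the elapsed time is about the same as running the loops back to back.
```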
Status of Python threading support (GIL removal)?
See here for the introduction:
http://groups.google.si/group/comp.lang.python/browse_thread/thread/370f8a1747f0fb91

Digging through my problem, I discovered that Python isn't exactly thread safe and that, to solve the issue, there's the Global Interpreter Lock (GIL) in place. Effectively, this causes the interpreter to utilize one core when threading is not used and about 0.95 of a core when threading is utilized. Is there any work in progress on the core Python modules that will permanently resolve this issue? Is there any other way to work around it, aside from forking new processes or using something else?
Re: Status of Python threading support (GIL removal)?
Thanks, guys, for all the replies. They made for some very interesting reading / watching.

It seems to me that Unladen Swallow might in time produce code which will lessen this problem a bit. Their roadmap suggests at least modifying the GIL principles, if not fully removing it. On top of that, they seem to have a pretty aggressive schedule, with good results expected by Q3 this year. I'm hoping their patches will be accepted into the CPython codebase in a timely manner. I definitely liked the speed improvements they showed for the Q1 modifications, though those improvements don't help my case yet...

The presentation from Mr. Beazley was hilarious :D I find it curious to learn that just a simple replacement of events with actual mutexes already lessens the problem a lot. This should already be implemented in the CPython codebase, IMHO.

As for multiprocessing alternatives, I'll have to look into them. I haven't yet done multiprocessing code and don't really know what will happen when I try. I believe threads would be much more appropriate for my project, but it's definitely worth a shot. Since my project is supposed to be cross platform, I'm not really looking forward to learning cross-platform C++; all my C++ experience is DOS + Windows derivatives till now :(
Re: Status of Python threading support (GIL removal)?
Sorry, just a few more thoughts:

Does anybody know why the GIL can't be made more granular? I mean, use different locks for different parts of the code? That way there would be far less blocking, and the plugin interface could remain the same (the interpreter would know which lock it used for the plugin, so the actual functions for releasing / reacquiring the lock could stay the same). On second thought, forget this: it's probably exactly the cause of the free-threading mod's reduced performance. Fine-graining the locks increases the lock count, and their implementation is rather slow per se.

It's strange that *nix variants don't have InterlockedExchange, probably because they aren't x86-specific; I find it strange that other architectures wouldn't have such instructions, though. Also, an OS should be able to provide such a function even if the underlying architecture doesn't have it. After all, a kernel knows what it's currently running, and kernels are typically not preempted themselves.

A side question: why does Python so like to use events instead of true synchronization objects? Almost every library I looked at did that. IMHO that's quite irrational: using objects intended for something else, while there are plenty of true options supported in every OS out there.

Still, the free-threading mod could work just fine if there were just one more global variable added: the current Python thread count. A simple check for a value greater than 1 would trigger the synchronization code, while having just one thread would introduce no locking at all. Still, I didn't like the performance figures of the mod (0.6 execution speed, pretty bad core / processor scaling).

I don't know why it's so hard to do simple locking just for writes to globals. I used to do that massively and it always worked with almost no penalty at all. It's true that those were all Windows programs, using critical sections.
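For the record, locking just the writes to a shared global (the critical-section style mentioned above) is straightforward at the Python level; the names below are invented for the sketch:

```python
import threading

counter = 0
counter_lock = threading.Lock()  # plays the role of a Win32 critical section

def work(n):
    global counter
    for _ in range(n):
        with counter_lock:       # lock only the write to the shared global
            counter += 1

threads = [threading.Thread(target=work, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000, with or without the GIL's help
```

Doing this automatically for *every* global write is what makes fine-grained interpreter locking expensive, which is the trade-off discussed above.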
Re: Status of Python threading support (GIL removal)?
On Jun 19, 11:45 pm, OdarR <olivier.da...@gmail.com> wrote:
> On June 19, 21:05, Christian Heimes <li...@cheimes.de> wrote:
>> I've seen a single Python process using the full capacity of up to 8
>> CPUs. The application is making heavy use of lxml for large XSL
>> transformations, a database adapter and my own image processing
>> library based upon FreeImage.
>
> interesting...
>
>> Of course both lxml and my library are written with the GIL in mind.
>> They release the GIL around every call to C libraries that don't
>> touch Python objects. PIL releases the lock around ops as well
>> (although it took me a while to figure it out because PIL uses its
>> own API instead of the standard macros). reportlab has some optional
>> C libraries that increase the speed, too. Are you using them?
>
> I don't. Or maybe I did, but I have no clue what to test. Do you have a
> real example, some code snippet that can prove/show activity on
> multiple cores? I accept your explanation, but I also like
> experimenting :)
>
>> By the way threads are evil
>> (http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf) and
>> not *the* answer to concurrency.
>
> I don't see threads as evil from my little experience on the subject,
> but we need them. I'm reading what's happening in the Java world too;
> it can be interesting.
>
> Olivier

Olivier,

What Christian is saying is that you can write a C/C++ Python plugin, release the GIL inside it, and then process stuff in threads inside the plugin. All this is possible as long as the programmer doesn't touch any Python objects, and it's fairly easy to write such a plugin; any counting example will do just fine. The problem with this solution is that you have to write the code in C, which rather defeats the purpose of using an interpreter in the first place... Of course, no pure Python code will currently utilize multiple cores (because of the GIL). I do agree, though, that threading is important. Regardless of any studies showing that threads suck, they are here and they offer relatively simple concurrency.
IMHO threads should never have been crippled like this. Even though the GIL prevents access violations, it's not the right approach. It simply kills all threading benefits except in the situation where you work with multiple I/O-blocking threads; that's just about the only situation where the problem is not apparent. We're way past single-processor, single-core computers now. An important product like Python should support these architectures properly, even if only 1% of the applications written in it use threading. But as Guido himself said: I should not complain but instead try to contribute to a solution. That's the hard part, especially since there's lots of code that actually needs the locking.
Re: Status of Python threading support (GIL removal)?
On Jun 19, 11:59 pm, Jesse Noller <jnol...@gmail.com> wrote:
> On Fri, Jun 19, 2009 at 12:50 PM, OdarR <olivier.da...@gmail.com> wrote:
>> On June 19, 16:16, Martin von Loewis <martin.vonloe...@hpi.uni-> wrote:
>>> If you know that your (C) code is thread safe on its own, you can
>>> release the GIL around long-running algorithms, thus using as many
>>> CPUs as you have available, in a single process.
>>
>> What do you mean? CPython can't benefit from multi-core without
>> multiple processes.
>>
>> Olivier
>
> Sorry, you're incorrect. I/O bound threads do in fact take advantage of
> multiple cores.

Incorrect. They take advantage of OS threading support, where another thread can run while one is blocked on I/O. That is not equal to running on multiple cores (though they actually do land on multiple cores; the cores are just not well utilized, with total usage adding up to about 100% of one core). Because of the way the GIL is implemented, you will in all cases get better performance running on a single core.
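The distinction both sides are circling can be seen directly: blocking calls release the GIL, so I/O-bound threads overlap in time even though they never run Python bytecode in parallel. A sketch, with sleep standing in for real I/O:

```python
import threading
import time

def fetch(i, results):
    time.sleep(0.2)   # stands in for a blocking I/O call; releases the GIL
    results[i] = i * 10

results = {}
threads = [threading.Thread(target=fetch, args=(i, results)) for i in range(4)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
print("%.2fs" % elapsed)  # roughly 0.2s, not 0.8s: the four waits overlap
```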
Re: Status of Python threading support (GIL removal)?
On Jun 20, 1:36 am, a...@pythoncraft.com (Aahz) wrote:
> You should put up or shut up -- I've certainly seen multi-core speedup
> with threaded software, so show us your benchmarks!

Sorry, no intent to offend anyone here; flame wars are not my thing. I have shown my benchmarks: see the first post and click on the link. That's the reason I started this discussion. All I'm saying is that you can get a threading benefit, but only if the threading in question is implemented in a C plugin. I have yet to see pure Python code that takes advantage of multiple cores; from what I read about the GIL, this is simply impossible by design. But I'm not disputing the fact that CPython as a whole can take advantage of multiple cores. There certainly are built-in objects that work as they should.
Re: Newbie queue question
Thanks for the suggestions. I've been looking at the source code of the threading support objects, and I saw that non-blocking requests in queues use events, while blocking requests just use InterlockedExchange. So plain old put/get is much faster, and I've managed to confirm this today with further testing. Sorry about the semicolon; I just can't seem to shake it with my Pascal / C++ background :)

Currently, I've managed to get the code to this stage:

    class mt(threading.Thread):
        q = Queue.Queue()
        def run(self):
            dbf1 = Dbf('D:\\python\\testdbf\\promet.dbf', readOnly=1)
            for i1 in xrange(len(dbf1)):
                self.q.put(dbf1[i1])
            dbf1.close()
            del dbf1
            self.q.put(None)

    t = mt()
    t.start()
    time.sleep(22)
    rec = 1
    while rec is not None:
        rec = t.q.get()
    del t

Note the time.sleep(22). It takes about 22 seconds to read the DBF with the 200K records (71MB); it's entirely in cache, yes. So, if I put this sleep in there, the whole procedure finishes in 22 seconds with 100% CPU (core) usage - almost as fast as the single-threaded procedure, with very little overhead. When I remove the sleep, the procedure finishes in 30 seconds with ~80% CPU (core) usage. So the threading overhead only happens when I actually cause thread interaction.

This never happened to me before. Usually (C, Pascal) there was some threading overhead, but I could always measure it in tenths of a percent. In this case it's 50%, and I'm pretty sure InterlockedExchange is the fastest thing there can be.

My example currently really is a dummy one. It doesn't do much; only the reading thread is implemented, but that will change with time. Reading the data source is one task; I will proceed with calculations and with a rendering engine, both of which will be pretty CPU intensive as well. I'd like to at least make the reading part behave as I want before I proceed. It's clear to me I don't understand Python's threading concepts yet. I'd still appreciate further advice on what to do to make this sample work with less overhead.
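One standard way to attack that per-record overhead (a heavier-weight cousin of the 10-records-at-a-time attempt) is to batch the queue traffic itself, so put/get run once per batch instead of once per record. Everything below is an invented stand-in for the DBF reader:

```python
import itertools
import queue
import threading

SENTINEL = None
BATCH = 500  # records per queue operation (a tuning knob)

def reader(records, q):
    # Producer: move records in batches to cut per-item queue overhead
    it = iter(records)
    while True:
        batch = list(itertools.islice(it, BATCH))
        if not batch:
            break
        q.put(batch)
    q.put(SENTINEL)

q = queue.Queue(maxsize=8)
t = threading.Thread(target=reader, args=(range(200000), q))
t.start()

seen = 0
while True:
    batch = q.get()
    if batch is SENTINEL:
        break
    seen += len(batch)   # a real consumer would process each record here
t.join()
print(seen)  # 200000
```

With 500-record batches, the 200K records cost 400 queue round-trips instead of 200K, which is usually enough to make the synchronization cost disappear into the noise.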
Newbie queue question
Hi,

I'm pretty new to Python (2.6) and I've run into a problem I just can't seem to solve. I'm using dbfpy to access DBF tables as part of a little test project. I've programmed two separate functions: one that reads the DBF in the main thread, and another that reads it asynchronously in a separate thread. Here's the code:

    def demo_01():
        '''DBF read speed only'''
        dbf1 = Dbf('D:\\python\\testdbf\\promet.dbf', readOnly=1)
        for i1 in xrange(len(dbf1)):
            rec = dbf1[i1]
        dbf1.close()

    def demo_03():
        '''DBF read speed into a FIFO queue'''
        class mt(threading.Thread):
            q = Queue.Queue(64)
            def run(self):
                dbf1 = Dbf('D:\\python\\testdbf\\promet.dbf', readOnly=1)
                for i1 in xrange(len(dbf1)):
                    self.q.put(dbf1[i1])
                dbf1.close()
                del dbf1
                self.q.join()
        t = mt()
        t.start()
        while t.isAlive():
            try:
                rec = t.q.get(False, 0.2)
                t.q.task_done();
            except:
                pass
        del t

However, I'm having serious issues with the second method. It seems that as soon as I start accessing the queue from both threads, the reading speed effectively halves. I have tried the following:

1. using deque instead of Queue (same speed)
2. reading 10 records at a time and inserting them in a separate loop (hoped the congestion would help)
3. increasing the queue size to infinite and waiting 10 seconds in the main thread before starting to read - this one yielded full reading speed, but the waiting took away all the threading benefits

I'm sure I'm doing something very wrong here; I just can't figure out what. Can anyone help me with this?

Thanks,
Jure