[issue22900] decimal.Context Emin, Emax limits restrict functionality without adding benefits

2014-11-19 Thread Jure Erznožnik

New submission from Jure Erznožnik:

At some point since Python 2.7, the Emin and Emax members got more restrictive 
bounds: Emin cannot go above 0 and Emax cannot go below 0.

I would argue against this logic:
.prec specifies the total precision.
.Emin and .Emax effectively limit the possible locations of the decimal point 
within the given precision. Since they don't specify / enforce the EXACT 
position of the decimal point, what's the point of limiting them?

Without the restrictions, setting Emin = Emax = some positive number would 
effectively restrict the number of decimal places to exactly that number, 
without the need for separate (and expensive) .quantize() calls.

Removing this restriction provides an option to use decimal as true fixed-point 
arithmetic.
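
To illustrate, a minimal sketch against the current (3.4) decimal module; the 
numbers are only examples:

from decimal import Decimal, Context

# Current behaviour: Emin may not go above 0, Emax may not go below 0.
try:
    Context(prec=28, Emin=2, Emax=2)
except ValueError as exc:
    print(exc)   # rejected, although nothing seems to require rejecting it

# So pinning results to a fixed number of decimal places needs an
# explicit (and comparatively expensive) quantize() on every result:
ctx = Context(prec=28)
total = (Decimal("1.005") * 3).quantize(Decimal("0.01"))
print(total)     # 3.02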

--
components: Extension Modules
messages: 231374
nosy: Jure.Erznožnik
priority: normal
severity: normal
status: open
title: decimal.Context Emin, Emax limits restrict functionality without adding benefits
type: behavior
versions: Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22900
___



Fool Python class with imaginary members (serious guru stuff inside)

2012-09-20 Thread Jure Erznožnik
I'm trying to create a class that lies to the user about whether a member is 
a simple variable or a class. The nature of the member would depend on the 
access syntax, like so:
1. x = obj.member  # x becomes the simple value contained in member
2. x = obj.member.another_member  # x becomes the simple value contained in 
the first member's another_member

So the first form detects that we only need a simple value and returns 
that. The second form sees that we need member as a class and returns 
that. Note that the simple value could be anything, from an int to a bitmap image.

I have determined that this is possible if I sacrifice the final member 
reference to a __call__ override, using function-call syntax: 
x = obj.member(). The call syntax returns the simple value, while plain 
attribute access returns the class. It is also possible if I override the 
__xxxitem__ methods to simulate a dictionary.
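
Roughly what I mean for the __call__ variant, as a quick sketch (all names 
made up):

class Node(object):
    """A member that acts as a class on attribute access
    and yields its simple value when called."""

    def __init__(self, value, **children):
        self._value = value
        self._children = children

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self._children[name]
        except KeyError:
            raise AttributeError(name)

    def __call__(self):
        return self._value

obj = Node(None, member=Node('simple value',
                             another_member=Node('nested value')))

x = obj.member()                 # 'simple value'
y = obj.member.another_member()  # 'nested value'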

However, I would like to use the true member access syntax if possible.

So, is it possible?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Dictionary self lookup

2009-06-27 Thread Jure Erznožnik
Norberto,

While certainly useful, this kind of functionality contradicts the way
today's string libraries work.
What you are proposing isn't dict self-referencing, but rather strings
referencing other external data (in this case, other strings from the
same dict).

When you write code like
config = {'home': '/home/test'}
config['user1'] = config['home'] + '/user1'

config['user1'] isn't stored in memory as config['home'] + '/user1',
but as a concatenated string ('/home/test/user1'), composed of both
those strings. The reference to the original composing strings is lost
the moment the expression itself is evaluated to be inserted into the
dict.
There's no compiler / interpreter that would do this any other way. At
least not that I know of.

So the best suggestion would be to simply make an object that parses
strings before returning them. In the string itself, you can have
special blocks that tell your parser they are references to other
objects. You can take good old DOS syntax for that: %variable%, or
something more elaborate if % is used in your strings too much.

Anyway, your code would then look like (one possible way):
config = {'home': '/home/test'}
config['user1'] = '%config[home]%' + '/user1'

or

config = {'home': '/home/test', 'user1': '%config[home]%/user1'}

The parser would then just match %(something)% and replace it with the
actual value found in the referenced variable. eval() can help you there.
Maybe there's already something in Python's libraries that matches
your need.
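
A sketch of one possible shape, using the %key% convention from above (the 
class name is made up; this is not a library recipe):

import re

class RefDict(dict):
    """A dict that resolves %key% references when values are read."""
    _ref = re.compile(r'%([^%]+)%')

    def __getitem__(self, key):
        value = dict.__getitem__(self, key)
        if isinstance(value, str):
            # Recursive: referenced values may contain references too.
            value = self._ref.sub(lambda m: self[m.group(1)], value)
        return value

config = RefDict()
config['home'] = '/home/test'
config['user1'] = '%home%/user1'
print(config['user1'])   # /home/test/user1

The standard library's string.Template does something similar with $-style
placeholders, if that's close enough.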

But you sure better not expect this to be included in language syntax.
It's a pretty special case.

Jure
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Status of Python threading support (GIL removal)?

2009-06-21 Thread Jure Erznožnik
Look, guys, here's the thing:
At the company I work for, we decided to rewrite our MRP system in
Python. I was one of the main proponents, since it's nicely cross-
platform and allows for quite rapid application development. The
language and its built-in functions are simply great. The opposition
was quite strong, especially since the owner was cheering for .NET.

So, recently I started writing a part of this new system in Python. A
report generator to be exact. Let's not go into existing offerings,
they are insufficient for our needs.

First I started on a few tests. I wanted to know how the reporting
engine will behave if I do this or that. One of the first tests was,
naturally, threading. The reporting engine itself will have separate,
semi-independent parts that can be threaded well, so I wanted to test
that.

The rest you know if you read the two threads I started on this group.

Now, the core of the new application is designed so that it can be
clustered so it's no problem if we just start multiple instances on
one server, say one for each available core.

The other day, a coworker of mine said something like: "What?!? You've
been using Python for two days already and you already say it's got
a major fault?"
I kinda agreed with him, especially since this particular coworker
has programmed strictly in Python for the last 6 months (and I haven't,
due to other current affairs). There was no way my puny testing could
reveal such a major drawback. As it turns out, I was right. I have
programmed enough threading to have tried enough variations, all of
which reveal the GIL, which I later confirmed by searching the web.

My purpose with developing the reporting engine in Python was twofold:
learn Python as I go and create a native solution which will work out-
of-the-box for all systems we decide to support. Making the thing open
source while I'm at it was a side-bonus.

However:
Since the testing revealed this, shall we say, problem, I am tempted
to just use plain old C++ again. Furthermore, I was also not quite
content with the speed of arithmetic processing in the Python engine.
I created some simple aggregating objects that only performed two
additions per pass. Calling them 200K times took 4 seconds. This is
another reason why I'm beginning to think C++ might be a better
alternative. I must admit, had the GIL issue not popped up, I'd just
take the threading benefits and forget about it.

But both things together, I'm thinking I need to rethink my strategy
again.
I may at some point decide that learning cross-platform C++ programming
is worth a shot and just write a Python plugin for the code I write. The
final effect will be pretty much the same, only faster. Perhaps I will
even manage to get close to Crystal Reports' speed, though I highly
doubt that. But in the end, my Python skills will suffer. I still have
an entire application (production support) to develop in it.

Thanks for all the information and please don't flame each other.
I already get the picture that GIL is a hot subject.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Newbie queue question

2009-06-21 Thread Jure Erznožnik
On Jun 21, 9:43 am, Чеширский Кот p.ela...@gmail.com wrote:
 1. say me dbf files count?
 2. why dbf ?

It was just a test. It was the most compatible format I could get
between Python and the business application I work with without using
SQL servers and such.
Otherwise it's of no consequence. The final application will have a
separate input engine that will support multiple databases as input.

Jure
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Status of Python threading support (GIL removal)?

2009-06-21 Thread Jure Erznožnik
On Jun 21, 9:32 am, OdarR olivier.da...@gmail.com wrote:

 Do you think multiprocessing can help you seriously ?
 Can you benefit from multiple cpu ?

 did you try to enhance your code with numpy ?

 Olivier
 (installed a backported multiprocessing on his 2.5.1 Python, but need
 installation of Xcode first)

Multithreading / multiprocessing can help me with my problem. As you
know, database reading is typically I/O bound so it helps to put it in
a separate thread. I might not even notice the GIL if I used SQL
access in the first place. As it is, DBFPY is pretty CPU intensive
since it's a pure Python DBF implementation.
To continue: the second major stage (summary calculations) is
completely CPU bound. Using numpy might or might not help with it.
Those are simple calculations, mostly additions. I try not to put the
entire database into arrays, to save memory, so I mostly just add to
counters where I can. Some functions simply require arrays, but those
are rarer, so I guess I'm safe with that. You wouldn't believe how
complex some reports can be. Threading + memory saving is a must, and
even so, I'll probably have to implement some sort of serialization
later on, so that the stuff can run on more memory-constrained
devices.
The third major stage, rendering engine, is again mostly CPU bound,
but at the same time it's I/O bound as well when outputting the
result.

All three major parts are more or less independent from each other and
can run simultaneously, just with a bit of a delay. I can perform
calculations while waiting for the next record and I can also start
rendering immediately after I have all the data for the first group
available.

I may use multiprocessing, but I believe it introduces more
communication overhead than threads and am so reluctant to go there.
Threads were perfect, other stuff wasn't. To make things worse, no
particular extension / fork / branch helps me here. So if I wanted to
just do the stuff in Python, I'd have to move to Jython or IronPython
and hope cPython eventually improves in this area. I do actually need
cPython, since the other two aren't supported on all platforms my
company intends to support.
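
If I do end up trying multiprocessing, the shape would be something like
this (a minimal sketch; the stage bodies are dummies standing in for the
DBF reader and the calculations):

import multiprocessing as mp

def read_stage(out_q):
    # Dummy stand-in for the DBF reader (I/O bound in the real thing).
    for rec in xrange(10):
        out_q.put(rec)
    out_q.put(None)                       # sentinel: end of input

def calc_stage(in_q, out_q):
    # Dummy stand-in for the summary calculations (CPU bound).
    for rec in iter(in_q.get, None):
        out_q.put(rec * 2)
    out_q.put(None)

if __name__ == '__main__':
    raw, results = mp.Queue(), mp.Queue()
    stages = [mp.Process(target=read_stage, args=(raw,)),
              mp.Process(target=calc_stage, args=(raw, results))]
    for p in stages:
        p.start()
    for item in iter(results.get, None):  # the render stage would go here
        print(item)
    for p in stages:
        p.join()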

The main issue I currently have with the GIL is that execution time is
worse when I use threading. Had it been the same, I wouldn't worry too
much about it. Waiting for a permanent solution would be much easier
then...
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Status of Python threading support (GIL removal)?

2009-06-20 Thread Jure Erznožnik
Add:
Carl, Olivier & co. - You guys know exactly what I wanted.
Others: Going back to C++ isn't what I had in mind when I started
initial testing for my project.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Newbie queue question

2009-06-19 Thread Jure Erznožnik
I've done some further testing on the subject:

I also added some calculations in the main loop to see what effect
they would have on speed. Of course, I also added the same
calculations to the single threaded functions.
They were simple summary functions, like average, sum, etc. Almost no
interaction with the buffers was added, just retrieval of a single
field's value.

Single-threaded, the calculations added another 4.3 seconds to the
processing time (~18%).
Multithreaded, they added 1.8 seconds. CPU usage remained below 100%
of one core at all times, which made me check the process affinity.

I know the main thread uses way less CPU than DBF reading thread (4
secs vs 22 secs).
So I think adding these calculations should have but a minimal impact
on threaded execution time.

Instead, the execution time increases!!!
I'm beginning to think that Python's memory management / functions
introduce quite a significant overhead for threading.

I think I'll just write this program in one of the compiled languages
today to verify just how stupid I've become.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Newbie queue question

2009-06-19 Thread Jure Erznožnik
Digging further, I found this:
http://www.oreillynet.com/onlamp/blog/2005/10/does_python_have_a_concurrency.html

Looking up on this info, I found this:
http://docs.python.org/c-api/init.html#thread-state-and-the-global-interpreter-lock

If this is correct, no amount of threading will ever help in Python,
since only one core / CPU can *by design* ever be utilized - except
for code that accesses *no* functions / memory at all.

This does seem to be a bit harsh though.
I'm now writing a simple test program to verify this: multiple data-
independent threads, just so I can see whether more than one core can
be utilized at all.
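
Something along these lines, with a dummy countdown as the workload:

import threading
import time

def spin(n):
    # Pure-Python work; the GIL should serialize these loops.
    while n > 0:
        n -= 1

N = 5000000

t0 = time.time()
spin(N)
spin(N)
print('sequential: %.2fs' % (time.time() - t0))

t0 = time.time()
threads = [threading.Thread(target=spin, args=(N,)) for _ in xrange(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# If the GIL docs are right, this won't beat 'sequential'
# no matter how many cores the machine has.
print('threaded:   %.2fs' % (time.time() - t0))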

:(
-- 
http://mail.python.org/mailman/listinfo/python-list


Status of Python threading support (GIL removal)?

2009-06-19 Thread Jure Erznožnik
See here for introduction:
http://groups.google.si/group/comp.lang.python/browse_thread/thread/370f8a1747f0fb91

Digging through my problem, I discovered Python isn't exactly thread
safe and to solve the issue, there's this Global Interpreter Lock
(GIL) in place.
Effectively, this causes the interpreter to utilize one core when
threading is not used and .95 of a core when threading is utilized.

Is there any work in progress on the core Python modules that will
permanently resolve this issue?
Is there any other way to work around the issue, aside from forking new
processes or using something else?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Status of Python threading support (GIL removal)?

2009-06-19 Thread Jure Erznožnik
Thanks, guys, for all the replies.
They made for some very interesting reading / watching.

It seems to me the Unladen Swallow project might in time produce code
that lessens this problem a bit. Their roadmap suggests at least
modifying the GIL's principles, if not fully removing it. On top of this,
they seem to have a pretty aggressive schedule, with good results
expected by Q3 this year. I'm hoping their patches will be
accepted into the cPython codebase in a timely manner. I definitely liked
the speed improvements they showed for the Q1 modifications, though those
improvements don't help my case yet...

The presentation from Mr. Beazley was hilarious :D
I find it curious to learn that just a simple replacement of events with
actual mutexes already lessens the problem a lot. This should already
be implemented in the cPython codebase, IMHO.

As for multiprocessing alternatives, I'll have to look into them. I
haven't done multiprocessing code yet and don't really know what will
happen when I try. I believe that threads would be much more
appropriate for my project, but it's definitely worth a shot. Since my
project is supposed to be cross-platform, I'm not really looking
forward to learning cross-platform programming in C++. All my C++
experience is DOS + Windows derivatives till now :(
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Status of Python threading support (GIL removal)?

2009-06-19 Thread Jure Erznožnik
Sorry, just a few more thoughts:

Does anybody know why the GIL can't be made more granular? I mean, use
different locks for different parts of the code?
This way there would be far less blocking, and the plugin interface
could remain the same (the interpreter would know which lock it used
for the plugin, so the actual function for releasing / reacquiring the
lock could remain the same).
On second thought, forget this. This is probably exactly the cause of
the free-threading patch's reduced performance: fine-graining the locks
increases the lock count, and their implementation is rather slow per se.
Strange that *nix variants don't have InterlockedExchange, probably because
they aren't x86-specific. I find it strange that other architectures
wouldn't have such instructions, though... Also, an OS should still be
able to support such a function even if the underlying architecture
doesn't have it. After all, a kernel knows what it's currently running,
and kernels are typically not preempted themselves.

Also, a side question: why does Python like to use events instead
of true synchronization objects so much? Almost every library I looked at
did that. IMHO that's quite irrational: using objects that are
intended for something else while there are plenty of true locking
primitives supported by every OS out there.

Still, the free-threading mod could work just fine if there were
just one more global variable added: the current Python thread count. A
simple check for a value greater than 1 would trigger the
synchronization code, while having just one thread would introduce no
locking at all. Still, I didn't like the performance figures of the
mod (0.6x execution speed, pretty bad core / processor scaling).

I don't know why it's so hard to do simple locking just for writes to
globals. I used to do it massively and it always worked with almost no
penalty at all. It's true that those were all Windows programs, using
critical sections.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Status of Python threading support (GIL removal)?

2009-06-19 Thread Jure Erznožnik
On Jun 19, 11:45 pm, OdarR olivier.da...@gmail.com wrote:
 On 19 juin, 21:05, Christian Heimes li...@cheimes.de wrote:

  I've seen a single Python process using the full capacity of up to 8
  CPUs. The application is making heavy use of lxml for large XSL
  transformations, a database adapter and my own image processing library
  based upon FreeImage.

 interesting...

  Of course both lxml and my library are written with the GIL in mind.
  They release the GIL around every call to C libraries that don't touch
  Python objects. PIL releases the lock around ops as well (although it
  took me a while to figure it out because PIL uses its own API instead of
  the standard macros). reportlab has some optional C libraries that
  increase the speed, too. Are you using them?

 I don't. Or maybe I did, but I have no clue what to test.
 Do you have a real example, some code snippet to can prove/show
 activity on multiple core ?
 I accept your explanation, but I also like experiencing :)

  By the way threads are evil
  (http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf) and
  not *the* answer to concurrency.

 I don't see threads as evil from my little experience on the subject,
 but we need them.
 I'm reading what's happening in the java world too, it can be
 interesting.

 Olivier

Olivier,
What Christian is saying is that you can write a C/C++ Python plugin,
release the GIL inside it, and then process stuff in threads inside the
plugin.
All this is possible as long as the programmer doesn't use any Python
objects, and it's fairly easy to write such a plugin. Any counting example
will do just fine.

The problem with this solution is that you have to write the code in C,
which quite defeats the purpose of using an interpreter in the first
place...
Of course, no pure Python code will currently utilize multiple cores
(because of the GIL).

I do agree, though, that threading is important. Regardless of any
studies showing that threads suck, they are here and they offer
relatively simple concurrency. IMHO they should never have been
crippled like this. Even though the GIL prevents access violations, it's
not the right approach. It simply kills all threading benefits, except in
the situation where you work with multiple I/O-blocking threads.
That's just about the only situation where the problem is not
apparent.
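
A minimal illustration of that exception, with sleep() standing in for a
blocking call that releases the GIL:

import threading
import time

def fake_io():
    # time.sleep() releases the GIL, like a blocking read would.
    time.sleep(1.0)

t0 = time.time()
threads = [threading.Thread(target=fake_io) for _ in xrange(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Roughly 1 second, not 4: blocked threads aren't holding the GIL.
print('elapsed: %.2fs' % (time.time() - t0))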

We're way past single-processor, single-core computers now. An
important product like Python should support these architectures
properly, even if only 1% of applications written in it use threading.

But as Guido himself said: I should not complain but instead try to
contribute to a solution. That's the hard part, especially since there's
lots of code that actually needs the locking.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Status of Python threading support (GIL removal)?

2009-06-19 Thread Jure Erznožnik
On Jun 19, 11:59 pm, Jesse Noller jnol...@gmail.com wrote:
 On Fri, Jun 19, 2009 at 12:50 PM, OdarRolivier.da...@gmail.com wrote:
  On 19 juin, 16:16, Martin von Loewis martin.vonloe...@hpi.uni-:
  If you know that your (C) code is thread safe on its own, you can
  release the GIL around long-running algorithms, thus using as many
  CPUs as you have available, in a single process.

  what do you mean ?

  Cpython can't benefit from multi-core without multiple processes.

  Olivier

 Sorry, you're incorrect. I/O Bound threads do in fact, take advantage
 of multiple cores.

Incorrect. They take advantage of OS threading support, where another
thread can run while one is blocked for I/O.
That is not equal to running on multiple cores (though it actually
does do that, it's just that the cores are not well utilized - sum(x) <
100% of one core).
You will get better performance running on a single core, in all cases,
because of the way the GIL is implemented.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Status of Python threading support (GIL removal)?

2009-06-19 Thread Jure Erznožnik
On Jun 20, 1:36 am, a...@pythoncraft.com (Aahz) wrote:

 You should put up or shut up -- I've certainly seen multi-core speedup
 with threaded software, so show us your benchmarks!
 --

Sorry, no intent to offend anyone here. Flame wars are not my thing.

I have shown my benchmarks; see the first post and click on the link.
That's the reason I started this discussion.

All I'm saying is that you can get a threading benefit, but only if the
threading in question is implemented in a C plugin.
I have yet to see pure Python code that takes advantage of
multiple cores. From what I read about the GIL, this is simply impossible
by design.

But I'm not disputing the fact that cPython as a whole can take
advantage of multiple cores. There certainly are built-in objects that
work as they should.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Newbie queue question

2009-06-18 Thread Jure Erznožnik
Thanks for the suggestions.
I've been looking at the source code of threading support objects and
I saw that non-blocking requests in queues use events, while blocking
requests just use InterlockedExchange.
So plain old put/get is much faster and I've managed to confirm this
today with further testing.
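
Roughly the kind of test that shows it (timings will vary by machine):

import Queue
import threading
import time

def producer(q, n):
    for i in xrange(n):
        q.put(i)
    q.put(None)                       # sentinel

def consume_blocking(q):
    for _ in iter(q.get, None):       # plain blocking get()
        pass

def consume_polling(q):
    while True:
        try:
            item = q.get(False)       # non-blocking get + busy polling
        except Queue.Empty:
            continue
        if item is None:
            break

for consume in (consume_blocking, consume_polling):
    q = Queue.Queue()
    t = threading.Thread(target=producer, args=(q, 200000))
    t0 = time.time()
    t.start()
    consume(q)
    t.join()
    print('%s: %.2fs' % (consume.__name__, time.time() - t0))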

Sorry about the semicolon, just can't seem to shake it with my Pascal
& C++ background :)

Currently, I've managed to get the code to this stage:

import threading
import time
import Queue
from dbfpy.dbf import Dbf   # dbfpy's DBF reader

class mt(threading.Thread):

    q = Queue.Queue()

    def run(self):
        dbf1 = Dbf('D:\\python\\testdbf\\promet.dbf', readOnly=1)
        for i1 in xrange(len(dbf1)):
            self.q.put(dbf1[i1])
        dbf1.close()
        del dbf1
        self.q.put(None)      # sentinel marks the end of the data

t = mt()
t.start()
time.sleep(22)
rec = 1
while rec is not None:        # drain the queue until the sentinel
    rec = t.q.get()

del t

Note the time.sleep(22). It takes about 22 seconds to read the DBF
with the 200K records (71MB). It's entirely in cache, yes.

So, if I put this sleep in there, the whole procedure finishes in 22
seconds with 100% CPU (core) usage - almost as fast as the single-
threaded procedure. There is very little overhead.
When I remove the sleep, the procedure finishes in 30 seconds with
~80% CPU (core) usage.
So the threading overhead only happens when I actually cause thread
interaction.

This never happened to me before. Usually (C, Pascal) there was some
threading overhead, but I could always measure it in tenths of a
percent. In this case it's 50%, and I'm pretty sure InterlockedExchange
is the fastest thing there can be.

My example currently really is a dummy one. It doesn't do much, only
the reading thread is implemented, but that will change with time.
Reading the data source is one task, I will proceed with calculations
and with a rendering engine, both of which will be pretty CPU
intensive as well.

I'd like to at least make the reading part behave like I want it to
before I proceed. It's clear to me I don't understand Python's
threading concepts yet.

I'd still appreciate further advice on what to do to make this sample
work with less overhead.
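
For instance, would batching records per put() cut the interaction count
enough? A sketch of the reader above with an arbitrary batch size:

class mt(threading.Thread):

    q = Queue.Queue()
    BATCH = 100

    def run(self):
        dbf1 = Dbf('D:\\python\\testdbf\\promet.dbf', readOnly=1)
        batch = []
        for i1 in xrange(len(dbf1)):
            batch.append(dbf1[i1])
            if len(batch) == self.BATCH:
                self.q.put(batch)     # one queue interaction per 100 records
                batch = []
        dbf1.close()
        batch.append(None)            # sentinel rides along in the last batch
        self.q.put(batch)

The consumer would then pull a whole batch per get() and loop over it
locally.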
-- 
http://mail.python.org/mailman/listinfo/python-list


Newbie queue question

2009-06-17 Thread Jure Erznožnik
Hi,
I'm pretty new to Python (2.6) and I've run into a problem I just
can't seem to solve.
I'm using dbfpy to access DBF tables as part of a little test project.
I've programmed two separate functions: one that reads the DBF in the
main thread, and another that reads the DBF asynchronously in a separate
thread.
Here's the code:

import threading
import Queue
from dbfpy.dbf import Dbf   # dbfpy's DBF reader

def demo_01():
    '''DBF read speed only'''

    dbf1 = Dbf('D:\\python\\testdbf\\promet.dbf', readOnly=1)
    for i1 in xrange(len(dbf1)):
        rec = dbf1[i1]
    dbf1.close()

def demo_03():
    '''DBF read speed into a FIFO queue'''

    class mt(threading.Thread):

        q = Queue.Queue(64)

        def run(self):
            dbf1 = Dbf('D:\\python\\testdbf\\promet.dbf', readOnly=1)
            for i1 in xrange(len(dbf1)):
                self.q.put(dbf1[i1])
            dbf1.close()
            del dbf1
            self.q.join()

    t = mt()
    t.start()
    while t.isAlive():
        try:
            rec = t.q.get(False, 0.2)
            t.q.task_done();
        except Queue.Empty:   # was a bare except; Empty is what get() raises
            pass

    del t


However, I'm having serious issues with the second method. It seems
that as soon as I start accessing the queue from both threads, the
reading speed effectively halves.
I have tried the following:
1. using deque instead of Queue (same speed)
2. reading 10 records at a time and inserting them in a separate loop
(hoped the congestion would help)
3. increasing the queue size to infinity and waiting 10 seconds in the
main thread before starting to read - this yielded full reading speed,
but the waiting took away all the threading benefits

I'm sure I'm doing something very wrong here, I just can't figure out
what.

Can anyone help me with this?

Thanks,
Jure
-- 
http://mail.python.org/mailman/listinfo/python-list