Re: how to create a soap enveloppe with python suds for consume a bssv web service

2015-01-07 Thread dieter
brice DORA  writes:

> Hi all, I am working on a Python app and I have to call a web service 
> deployed on JDE (BSSV). I use the suds lib for it, which seems pretty 
> friendly. The problem is that the JDE web service, which uses BSSV technology, 
> necessarily requires sending a SOAP envelope. So far I have passed my test 
> parameters as required to my suds client. My question is how to work around 
> this problem: is it possible to create a SOAP envelope with suds? Thank 
> you in advance

"suds" should generate the required SOAP envelope itself.

"suds" can be set up to log the precise messages sent and received
(consult the "suds" documentation about logging and the Python
documentation about its "logging" module). With these messages
(and a precise description of the encountered problem), you can
contact someone responsible for the web service to resolve your problems.
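For example, a minimal setup along those lines (the logger names below are the customary suds module loggers; adjust to taste):

```python
import logging

# Send log output to stderr; DEBUG on the suds loggers makes the
# library dump the outgoing and incoming SOAP messages verbatim.
logging.basicConfig(level=logging.INFO)
logging.getLogger('suds.client').setLevel(logging.DEBUG)
logging.getLogger('suds.transport').setLevel(logging.DEBUG)
```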

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Announce: PyPrimes 0.2.1a

2015-01-07 Thread Christian Gollwitzer

Hi Steve,

Am 08.01.15 um 05:35 schrieb Steven D'Aprano:

At long last, I am pleased to announce the latest version of PyPrimes, a
pure Python package for working with prime numbers.


Nice.


PyPrimes is compatible with Python 2 and 3, and includes multiple
algorithms for generating and testing prime numbers, including the
Sieve of Eratosthenes, the Croft spiral, and the Miller-Rabin and Fermat
probabilistic tests.

https://pypi.python.org/pypi/pyprimes/


I don't want to badmouth your effort, but it seems to me that this is 
still a collection of rather simple algorithms. What about the AKS test, 
which is deterministic and runs in polynomial time for all inputs, or 
elliptic curve factorization, or a quadratic sieve?


I'm sure that other people with better knowledge of number theory could 
propose some more generally useful algorithms.


Christian

--
https://mail.python.org/mailman/listinfo/python-list


Re: PyGILState API and Py_Main

2015-01-07 Thread dieter
Adrien Bruneton  writes:

> I am having a hard time understanding what is the proper use of
> PyGILState_Ensure/Release.
> My understanding is that one should always be matched with the other,
> and that this high level API auto-magically deals with the ThreadState
> creation.
>
> However the following piece of code (executed with a simple "print
> 'hello world' " script as argv) triggers the message:
>
> Fatal Python error: auto-releasing thread-state, but no
> thread-state for this thread

Each function in the Python C API has (with high probability)
a notion of whether it is called from Python (GIL acquired) or from pure
C code (GIL not acquired). When you call such a function yourself, you
must match this expectation. "Py_Main" likely expects to be called
without the GIL held. Thus, likely, you should not call
"PyGILState_Ensure" before "Py_Main".

> Minimal code:
>
> void initPython(int initsigs)
> {
>   if (Py_IsInitialized() == 0)
> {
>   Py_InitializeEx(initsigs);
>   // Put default SIGINT handler back after
> Py_Initialize/Py_InitializeEx.
>   signal(SIGINT, SIG_DFL);
> }
>
>   int threadInit = PyEval_ThreadsInitialized();
>   PyEval_InitThreads(); // safe to call this multiple time
>
>   if(!threadInit)
> PyEval_SaveThread(); // release GIL
> }
>
> int main(int argc, char ** argv)
> {
>   initPython(1);
>   PyGILState_STATE _gstate_avoid_clash = PyGILState_Ensure();
>   int ret = Py_Main(argc, argv);
>   PyGILState_Release(_gstate_avoid_clash);  // this one triggers the
> Fatal error
>   Py_Finalize();
>   return ret;
> }
>
>
> Removing the last PyGILState_Release works, but I have a bad feeling
> about it :-)
> Any help would be welcome! Thanks in advance.

Likely, "Py_Main" has already released the GIL and cleaned up
the thread state.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Announce: PyPrimes 0.2.1a

2015-01-07 Thread Ben Finney
Steven D'Aprano  writes:

> (Note: pip may have problems downloading the right version if you
> don't specify a version number.)
>
> Or you can access the latest development version:
>
> hg clone https://code.google.com/p/pyprimes/

The source has a ‘CHANGES.txt’ file which has no entry later than
version 0.2a. Why was the later version made, and when will the change
log be updated for that?

-- 
 \  “Ignorance more frequently begets confidence than does |
  `\   knowledge.” —Charles Darwin, _The Descent of Man_, 1871 |
_o__)  |
Ben Finney

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Comparisons and sorting of a numeric class....

2015-01-07 Thread Marko Rauhamaa
Steven D'Aprano :

> Marko Rauhamaa wrote:
>> I prefer the Scheme way:
>>#f is a falsey object
>>everything else is a truthy object
>
> The Scheme way has no underlying model of what truthiness represents, just
> an arbitrary choice to make a single value have one truthiness, and
> everything else the other. It's just as meaningless and just as arbitrary
> as the opposite would be:
>
> #t is True
> everything else is falsey
> [...]
> I'd rather the Pascal way:
>
> #t is True
> #f is False
> everything else is an error

An advantage of the Scheme way is the chaining of "and" and "or". For
example, this breaks in Python:

   def dir_contents(path):
   if os.path.isdir(path):
   return os.listdir(path)
   return None

   def get_choices():
   return dir_contents(PRIMARY) or \
   dir_contents(SECONDARY) or \
   [ BUILTIN_PATH ]
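A concrete demonstration of the break: an existing but empty directory yields [], which is falsey, so the "or" chain skips it:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as primary:
    # "primary" exists, but os.listdir returns [] -- a falsey value --
    # so the "or" chain falls through to the fallback anyway.
    choices = os.listdir(primary) or ["/builtin/path"]
    assert choices == ["/builtin/path"]
```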


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Announce: PyPrimes 0.2.1a

2015-01-07 Thread Steven D'Aprano
At long last, I am pleased to announce the latest version of PyPrimes, a 
pure Python package for working with prime numbers.

PyPrimes is compatible with Python 2 and 3, and includes multiple 
algorithms for generating and testing prime numbers, including the 
Sieve of Eratosthenes, the Croft spiral, and the Miller-Rabin and Fermat 
probabilistic tests.
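
As a flavour of what the probabilistic tests involve, here is a rough, self-contained Miller-Rabin sketch (an illustration only, not the package's actual implementation):

```python
import random

def probably_prime(n, rounds=20):
    """Miller-Rabin sketch: a pass means "almost certainly prime"."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # "a" witnesses that n is composite
    return True
```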

https://pypi.python.org/pypi/pyprimes/


Examples



Generate prime numbers on demand:

py> it = pyprimes.primes()
py> next(it)
2
py> next(it)
3


Test whether numbers are prime:

py> pyprimes.is_prime(23)
True


Generate primes between an upper and lower limit:

py> import pyprimes
py> list(pyprimes.primes(1100, 11000500))
[1127, 1153, 1157, 1181, 1183, 1189, 11000111,
 11000113, 11000149, 11000159, 11000179, 11000189, 11000273, 11000281,
 11000287, 11000291, 11000293, 11000299, 11000347, 11000351, 11000369,
 11000383, 11000387, 11000393, 11000399, 11000401, 11000419, 11000441,
 11000461, 11000467]


Find the previous and next primes from a given value:

py> pyprimes.prev_prime(10**9)
999999937
py> pyprimes.next_prime(999999937)
1000000007


Find the prime factorisation of small numbers:

py> import pyprimes.factors
py> pyprimes.factors.factorise(999999997)
[71, 2251, 6257]


Pyprimes also includes examples of naive and poor-quality, but popular, 
algorithms which should be avoided:

py> from pyprimes.awful import turner
py> it = turner()
py> [next(it) for i in range(20)]
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
61, 67, 71]

Study the source code of the "awful" module to see what not to do!


Performance
===

As PyPrimes is written entirely in Python, it is not suitable for 
demanding high-performance applications, but performance can be quite 
reasonable for less-demanding ones. On a laptop with an AMD Turion(tm) 
II P520 dual-core processor, it takes 0.005 seconds to generate the 
first prime larger than 2**63, and about 10 seconds to generate the 
first one million primes. Testing whether 2**63+7 is prime takes under 
0.1 millisecond.


Installing PyPrimes
===

You can download PyPrimes from here:

https://pypi.python.org/pypi/pyprimes/

For Windows users, run the installer. For Linux, Mac and Unix users, 
unpack the source tar ball and follow the instructions.

You can also install using pip:

pip install pyprimes==0.2.1a


(Note: pip may have problems downloading the right version if you don't 
specify a version number.)

Or you can access the latest development version:

hg clone https://code.google.com/p/pyprimes/




-- 
Steve
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Playing with threads

2015-01-07 Thread Terry Reedy

On 1/7/2015 9:00 PM, Ganesh Pal wrote:

Hi friends,

I'm trying to use threads to achieve the workflow below:

1. Start a process; while it's running, grep its output for a string (string 1)
2. Feed string 1 to the command from step 1, then exit step 2
3. Monitor the stdout of step 1 and print success if the pattern is found

Questions:

1. Can the above be achieved without threads? I prefer keeping the code
simple; threads can become confusing when this workflow grows larger


I do not understand the work flow either.  What I do know is that Idle 
Edit -> Find in Files is implemented in idlelib/grep.py and that the 
non-gui code could be copied and adapted to other purposes.


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: application console with window

2015-01-07 Thread Emil Oppeln-Bronikowski
On Wed, Jan 07, 2015 at 10:17:09PM +0100, adam wrote:
> Is there maybe someone here who speaks Polish?

I do; or rather, I write it.

> I'm looking for some libs, tutorials, or other information.

That is a terminal application using (n)curses or a similar library that helps 
you draw and interact with windows, forms and other widgets.

You can use the curses library directly, but it's a little hairy. There are a 
few extra libraries that make writing "full-screen" terminal applications a 
breeze.

The most popular (…and complete?) one is urwid, and there are pages of 
tutorials once you google for it. 

PS. if you need some help feel free to e-mail me off-list.

-- 
vag·a·bond adjective \ˈva-gə-ˌbänd\
 a :  of, relating to, or characteristic of a wanderer 
 b :  leading an unsettled, irresponsible, or disreputable life

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Playing with threads

2015-01-07 Thread Chris Angelico
On Thu, Jan 8, 2015 at 1:00 PM, Ganesh Pal  wrote:
> I'm trying to use threads to achieve the workflow below:
>
> 1. Start a process; while it's running, grep its output for a string (string 1)
> 2. Feed string 1 to the command from step 1, then exit step 2
> 3. Monitor the stdout of step 1 and print success if the pattern is found

If by "grep" you mean "search", then this would be best done by
spawning a subprocess and monitoring its output.

https://docs.python.org/3/library/subprocess.html

Set stdout=PIPE, then read from the stdout member, and you'll be
reading the program's output. Something like this:

with subprocess.Popen(["find", "/", "-name", "*.py"],
                      stdout=subprocess.PIPE) as findpy:
    for line in findpy.stdout:
        ...  # do something with line

You'll get lines as they're produced, and can use standard string
manipulation on them. I don't know what your step 2 is, but you can
set stdin to be a pipe as well, and then you just write to your end
and the process will receive that on its standard input.

No threads needed; just your Python process and the subprocess.
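
A compact sketch of both directions at once; the Python interpreter itself stands in for the OP's command so the example is self-contained:

```python
import subprocess
import sys

# The child reads stdin, transforms it, and writes stdout; the parent
# feeds it input and reads the result -- no threads involved.
child = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stdout.write(sys.stdin.read().upper())"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)
out, _ = child.communicate(b"pattern found\n")
assert out.strip() == b"PATTERN FOUND"
```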

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Playing with threads

2015-01-07 Thread Dave Angel

On 01/07/2015 09:00 PM, Ganesh Pal wrote:

Hi friends,

I'm trying to use threads  to achieve the below work flow

1. Start a process , while its running grep for a string 1
2. Apply the string 1 to the command in step 1 and exit step 2
3. Monitor the stdout of step1 and print success if the is pattern  found



None of those three "statements" make sense to me.  Could you translate, 
or elaborate?  And fix typos?


What OS is this running on?

Is the process a program called grep.exe?  Or a second Python program? 
Who's doing the grep, that separate program or your original one?


Define Apply.  What's it mean to apply a string to a command?

What's an "is pattern" ?  And who's looking for it?

What's the real goal?  Or is this just a paraphrasing of an arbitrary 
school assignment (which don't always make sense).



Questions:

1. Can the above be achieved without threads ? I prefer keep ing code
simple  .threads can become confusion when this workflow grows larger


Sure, nothing in that description implies threads, just a separate 
process.  No idea why that's simpler, though.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Playing with threads

2015-01-07 Thread Devin Jeanpierre
I can't tell what you mean, but you can start a process via
subprocess.Popen, do some work, and wait for it to finish.
https://docs.python.org/2/library/subprocess.html

Note that you don't need the stdout (or likely the stdin) of the
process; you just need the return code -- whether or not grep
succeeded -- so you can redirect stdout and stderr to os.devnull and
avoid using .communicate(). (Needing to read stdout when you can't use
.communicate() is the most common reason to need threads with
subprocess.)
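
A sketch of that return-code-only pattern; a trivially succeeding child process stands in for grep:

```python
import os
import subprocess
import sys

# Discard the child's output and branch purely on its exit status
# (grep convention: exit 0 means the pattern was found).
with open(os.devnull, "wb") as devnull:
    rc = subprocess.call(
        [sys.executable, "-c", "import sys; sys.exit(0)"],
        stdout=devnull, stderr=devnull)
result = "success" if rc == 0 else "no match"
```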

-- Devin

On Wed, Jan 7, 2015 at 8:00 PM, Ganesh Pal  wrote:
> Hi friends,
>
> I'm trying to use threads to achieve the workflow below:
>
> 1. Start a process; while it's running, grep its output for a string (string 1)
> 2. Feed string 1 to the command from step 1, then exit step 2
> 3. Monitor the stdout of step 1 and print success if the pattern is found
>
> Questions:
>
> 1. Can the above be achieved without threads? I prefer keeping the code simple;
> threads can become confusing when this workflow grows larger
>
> Regards,
> Ganesh
>
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Playing with threads

2015-01-07 Thread Ganesh Pal
Hi friends,

I'm trying to use threads to achieve the workflow below:

1. Start a process; while it's running, grep its output for a string (string 1)
2. Feed string 1 to the command from step 1, then exit step 2
3. Monitor the stdout of step 1 and print success if the pattern is found

Questions:

1. Can the above be achieved without threads? I prefer keeping the code
simple; threads can become confusing when this workflow grows larger

Regards,
Ganesh
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: pickle, modules, and ImportErrors

2015-01-07 Thread Chris Angelico
On Thu, Jan 8, 2015 at 11:23 AM, John Ladasky
 wrote:
>> P.S. don't use pickle, it is a security vulnerability equivalent in
>> severity to using exec in your code, and an unversioned opaque
>> schemaless blob that is very difficult to work with when circumstances
>> change.
>
> For all of its shortcomings, I can't live without pickle.  In this case, I am 
> doing data mining.  My TrainingSession class commandeers seven CPU cores via 
> Multiprocessing.Pool.  Still, even my "toy" TrainingSessions take several 
> minutes to run.  I can't afford to re-run TrainingSession every time I need 
> my models.  I need a persistent object.
>
> Besides, the opportunity for mischief is low.  My code is for my own personal 
> use.  And I trust the third-party libraries that I am using.  My SVRModel 
> object wraps the NuSVR object from scikit-learn, which in turn wraps the 
> libsvm binary.

There are several issues, not all of which are easily dodged. Devin cited two:

* Security: it's fundamentally equivalent to using 'exec'
* Unversioned: it's hard to make updates to your code and then load old data

"For your own personal use" dodges the first one, but makes the second
one even more of a concern. You can get much better persistence using
a textual format like JSON, and adding in a simple 'version' member
can make it even easier. Then, when you make changes, you can cope
with old data fairly readily.
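
A minimal sketch of that idea (the field names are invented for illustration):

```python
import json

def save_model(params, path):
    # Write the payload alongside an explicit format version.
    with open(path, "w") as f:
        json.dump({"version": 1, "params": params}, f)

def load_model(path):
    with open(path) as f:
        blob = json.load(f)
    if blob.get("version") == 1:
        return blob["params"]
    # Future versions get their own migration branch here.
    raise ValueError("unknown format version: %r" % blob.get("version"))
```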

Pickle is still there if you want it, but you do have to be aware of
its limitations. If you edit the TrainingSession class, you may well
have to rerun the training... but maybe that's not a bad thing.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: pickle, modules, and ImportErrors

2015-01-07 Thread John Ladasky
On Wednesday, January 7, 2015 12:56:29 PM UTC-8, Devin Jeanpierre wrote:

[snip]

> If you never run model directly, and only ever import it or run it as
> my_svr.model, then you will be fine, and pickles will all serialize
> and deserialize the same way.

Thank you Devin... I re-ran TrainingSession from within ipython, and everything 
worked the way I hoped.  I obtained an SVRModel object which I could pickle to 
disk, and unpickle in a subsequent session.  I don't actually need to run the 
test code that I appended to training.py any more, so I won't.

[snip]

> P.S. don't use pickle, it is a security vulnerability equivalent in
> severity to using exec in your code, and an unversioned opaque
> schemaless blob that is very difficult to work with when circumstances
> change.

For all of its shortcomings, I can't live without pickle.  In this case, I am 
doing data mining.  My TrainingSession class commandeers seven CPU cores via 
multiprocessing.Pool.  Still, even my "toy" TrainingSessions take several 
minutes to run.  I can't afford to re-run TrainingSession every time I need my 
models.  I need a persistent object.

Besides, the opportunity for mischief is low.  My code is for my own personal 
use.  And I trust the third-party libraries that I am using.  My SVRModel 
object wraps the NuSVR object from scikit-learn, which in turn wraps the libsvm 
binary.

> > try:
> > from model import *
> > from sampling import *
> > except ImportError:
> > from .model import *
> > from .sampling import *
> >
> >
> > This bothers me.  I don't know whether it is correct usage.  I don't know 
> > whether it is causing my remaining ImportError problem.
> 
> This is a symptom of the differing ways you are importing these
> modules, as above. If you only ever run them and import them as
> my_svr.blahblah, then only the second set of imports are necessary.

OK, I will try refactoring the import code at some point.
 
> I hope that resolves all your questions!

I think so, thanks.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Comparisons and sorting of a numeric class....

2015-01-07 Thread Ethan Furman
On 01/06/2015 07:37 PM, Andrew Robinson wrote:

> Explain; How does mere subclassing of bool break the contract that bool has?
> eg: What method or data would the superclass have that my subclass would not?

bool's contract is that there are exactly two values, True and False, and 
exactly one instance of each.  If bool were subclassable, new values could be 
added with either completely different truth values (PartTrue) or more 
instances of the same value (True, ReallyTrue, AbsolutelyTrue) -- hence, a 
broken contract.
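
CPython enforces that contract by refusing to subclass bool at all:

```python
# bool's two singletons are protected: subclassing raises TypeError.
try:
    class MaybeTrue(bool):
        pass
    subclassed = True
except TypeError:
    subclassed = False

assert not subclassed
```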

--
~Ethan~



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Comparisons and sorting of a numeric class....

2015-01-07 Thread Steven D'Aprano
Marko Rauhamaa wrote:

> Steven D'Aprano :
> 
>> int 0 is a falsey object
>> NoneType None is a falsey object
>> str 'hello' is a truthy object
>> float 23.0 is a truthy object
> 
> I prefer the Scheme way:
> 
>#f is a falsey object
> 
>everything else is a truthy object


The Scheme way has no underlying model of what truthiness represents, just
an arbitrary choice to make a single value have one truthiness, and
everything else the other. It's just as meaningless and just as arbitrary
as the opposite would be:

#t is True
everything else is falsey


In both cases, you have the vast infinity of values apart from #f (or #t, as
the case may be) which are indistinguishable from each other under the
operation of "use in a boolean context". In other words, apart from #f or
#t, bool(x) maps everything to a single value. That makes it useless for
anything except distinguishing #f (or #t) from "everything else".

(I'm mixing Scheme and Python here, but I trust my meaning is clear.)

Given x of some type other than the Boolean type, bool(x) always gives the
same result. Since all non-Booleans are indistinguishable under that
operation, it is pointless to apply that operation to them.

I'd rather the Pascal way:

#t is True
#f is False
everything else is an error


That at least gives you the benefits (if any) of strongly-typed bools.

Python has a (mostly) consistent model for truthiness: truthy values
represent "something", falsey values represent "nothing" or emptiness:

Falsey values:
  None
  Numeric zeroes: 0, 0.0, 0j, Decimal(0), Fraction(0)
  Empty strings '', u''
  Empty containers [], (), {}, set(), frozenset()

Truthy values:
  Numeric non-zeroes
  Non-empty strings
  Non-empty containers
  Any other arbitrary object
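
The model is easy to spot-check:

```python
from decimal import Decimal
from fractions import Fraction

# Falsey: "nothing"-like values; truthy: anything with some content.
falsey = [None, 0, 0.0, 0j, Decimal(0), Fraction(0),
          "", [], (), {}, set(), frozenset()]
truthy = [1, -1.5, "hello", [0], (None,), {0: None}, object()]

assert not any(bool(x) for x in falsey)
assert all(bool(x) for x in truthy)
```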


The model isn't quite perfect (I don't believe any model using truthiness
can be) but the number of gotchas in the built-ins and standard library is
very small. I can only think of two:

- datetime.time(0) is falsey. Why midnight should be falsey is an 
  accident of implementation: midnight happens to be represented 
  internally by zero, and that zero leaks through to the boolean 
  conversion.

- Empty iterators are truthy. Since in general you can't tell in 
  advance whether an iterator will be empty or not until you try 
  it, this makes sense.




-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: application console with window

2015-01-07 Thread jacek pozniak
adam wrote:

> Is there maybe someone here who speaks Polish?
> 
> I would like to write an application which looks like this:
> http://linuxiarze.pl/obrazy/internet1/ceni1.png
If you mean porting it to a windowed version, then for example: tkinter.

jp

> 
> I'm looking for some libs, tutorials, or other information.
> I'm looking for this information for Python 3.
> 
> adam

-- 
https://mail.python.org/mailman/listinfo/python-list


application console with window

2015-01-07 Thread adam
Is there maybe someone here who speaks Polish?

I would like to write an application which looks like this:
http://linuxiarze.pl/obrazy/internet1/ceni1.png

I'm looking for some libs, tutorials, or other information.
I'm looking for this information for Python 3.

adam
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: pickle, modules, and ImportErrors

2015-01-07 Thread Ian Kelly
On Wed, Jan 7, 2015 at 1:12 PM, John Ladasky  wrote:
> Do I need to "import my_svr.model as model" then?  Adding that line changes 
> nothing. I get the exact same "ImportError: No module named 'model'".
>
> Likewise for "import my_svr", "from my_svr import *", or even "from 
> my_svr.model import SVRModel".
>
> It is clear that I'm failing to understand something important.
>
> I do not have any circular import dependencies; however, some of the files in 
> my package do need to import definitions from files earlier in my data 
> pipeline.  In order to make everything work inside the module, as well as 
> making a parent-folder "import my_svr" work from a iPython,  I find myself 
> needing to use statements like these inside my training.py program:
>
>
> try:
> from model import *
> from sampling import *
> except ImportError:
> from .model import *
> from .sampling import *
>
>
> This bothers me.  I don't know whether it is correct usage.  I don't know 
> whether it is causing my remaining ImportError problem.

It sounds like you have import path confusion. The first imports there
are absolute imports. If the absolute paths of the named modules are
just "model" and "sampling", then that would be correct. However, it
sounds like these modules are part of a "my_svr" package, so the
absolute paths of these modules are actually "my_svr.model" and
"my_svr.sampling". In that case, the absolute path that you're
trying to use for the import is wrong. Either fully specify the module
path, or use the relative import you have below.

Now, why might those absolute imports appear to work even though the
paths are incorrect? Well, it sounds like you may be running a file
inside the package as your main script. This causes some problems.

1) Python doesn't realize that the script it's running is part of a
package. It just calls that module '__main__', and if something else
happens to import the module later by its real name, then you'll end
up loading a second copy of the module by that name, which can lead to
confusing behavior.

2) Python implicitly adds the directory containing the script to the
front of the sys.path list. Since the directory containing the package
is presumably already on the sys.path, this means that the internal
modules now appear in sys.path *twice*, with two different names:
"my_svr.model" and "model" refer to the same source file. Python
doesn't realize this however, and so they don't refer to the same
module. As a result, importing both "my_svr.model" and "model" will
again result in two separate copies of the module being loaded, with
two different names.

It sounds like your pickle file was created from an object from the
"model" module. This works fine when the sys.path is set such that
"model" is the absolute path of a module, i.e. when you run your main
script within the package. When running an external main script,
sys.path doesn't get set that way, and so it can't find the "model"
module.

The solution to this is to avoid executing a module within a package
as your main script. Run a script outside the package and have it
import and call into the package instead. An alternative is to run the
module using the -m flag of the Python executable; this fixes problem
2 above but not problem 1.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: pickle, modules, and ImportErrors

2015-01-07 Thread Devin Jeanpierre
On Wed, Jan 7, 2015 at 2:12 PM, John Ladasky  wrote:
> If I execute "import my_svr" in an iPython interpreter, everything works as I 
> think that I should expect:
-snip-
> However, a nearly-identical program in the parent folder fails (note that all 
> I change is the relative path to the file):
> Traceback (most recent call last):
>   File "reload test different directory.py", line 6, in 
> model = load(f)
> ImportError: No module named 'model'
>
>
> Do I need to "import my_svr.model as model" then?  Adding that line changes 
> nothing. I get the exact same "ImportError: No module named 'model'".
>
> Likewise for "import my_svr", "from my_svr import *", or even "from 
> my_svr.model import SVRModel".
>
> It is clear that I'm failing to understand something important.

In the first case, the model module was available as a top-level
module, "model". Pickles referenced that module when serialized. In
the second case, the model module was available as a submodule of the
top level my_svr package. So any pickles serialized from there would
use my_svr.model to refer to the model module. There *is* no model
module in this second case, so deserializing fails.

If you never run model directly, and only ever import it or run it as
my_svr.model, then you will be fine, and pickles will all serialize
and deserialize the same way.

For example, instead of "python -i my_svr/model.py", you can use
"python -im my_svr.model" (or "ipython -im my_svr.model").
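
You can see the module path baked into the pickle stream directly (a throwaway class "Point", defined in the running script purely for illustration):

```python
import pickle

class Point:
    pass

blob = pickle.dumps(Point())

# The stream stores the class's module path and name, not its code, so
# whoever unpickles must be able to import that exact module path.
assert Point.__module__.encode() in blob
assert b"Point" in blob
```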

P.S. don't use pickle, it is a security vulnerability equivalent in
severity to using exec in your code, and an unversioned opaque
schemaless blob that is very difficult to work with when circumstances
change.

> I do not have any circular import dependencies; however, some of the files in 
> my package do need to import definitions from files earlier in my data 
> pipeline.  In order to make everything work inside the module, as well as 
> making a parent-folder "import my_svr" work from a iPython,  I find myself 
> needing to use statements like these inside my training.py program:
>
>
> try:
> from model import *
> from sampling import *
> except ImportError:
> from .model import *
> from .sampling import *
>
>
> This bothers me.  I don't know whether it is correct usage.  I don't know 
> whether it is causing my remaining ImportError problem.

This is a symptom of the differing ways you are importing these
modules, as above. If you only ever run them and import them as
my_svr.blahblah, then only the second set of imports are necessary.

P.S. don't use import *, and if you do use import *, don't use more
than one per file -- it makes it really hard to figure out where a
given global came from (was it defined here? was it defined in model?
was it defined in sampling?)

I hope that resolves all your questions!

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


pickle, modules, and ImportErrors

2015-01-07 Thread John Ladasky
I am progressing towards organizing a recent project of mine as a proper Python 
package.  It is not a huge package, about 700 lines of code so far.  But it 
breaks into logical pieces, and I don't feel like scrolling back and forth 
through a 700-line file.

I am running Python 3.4.0 on Ubuntu 14.04, if it matters.  

I want a package which I can use in an iPython session, as well as in programs 
which share the package directory.  Compatibility within the package directory 
appears to be easy.  From outside the package, I am getting ImportErrors that I 
have not been able to fix.

Although I have used pickle often, this is the first time that I have written a 
package with an __init__.py.  It reads thus:


# __init__.py for my_svr module

from .database import MasterDatabase
from .sampling import Input
from .sampling import MultiSample
from .sampling import Sample
from .sampling import Target
from .sampling import DataPoint
from .model import Prediction
from .model import SVRModel
from .model import comparison
from .training import TrainingSession

__all__ = ("MasterDatabase", "Input", "MultiSample", "Sample", "Target",
   "DataPoint", "Prediction", "SVRModel", "comparison",
   "TrainingSession")


If I execute "import my_svr" in an iPython interpreter, everything works as I 
think that I should expect:


In [8]: dir(my_svr)
Out[8]: 
['Input',
 'MasterDatabase',
 'MultiSample',
 'Prediction',
 'SVRModel',
 'Sample',
 'Target',
 'DataPoint',
 'TrainingSession',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 'comparison',
 'database',
 'model',
 'sampling',
 'training']



My training.py program produces an SVRModel object, and pickles it to a binary 
file.  The following simple program will unpickle the file saved by 
training.py, provided that the program resides in the module's own folder:


# reload_test.py

from pickle import load

with open("../models/sample model.pkl", "rb") as f:
model = load(f)
print(model)


And, I get the str representation of my SVRModel instance.

However, a nearly-identical program in the parent folder fails (note that all I 
change is the relative path to the file):


# parent_folder_reload_test.py

from pickle import load

with open("models/sample model.pkl", "rb") as f:
model = load(f)
print(model)


Traceback (most recent call last):
  File "reload test different directory.py", line 6, in 
model = load(f)
ImportError: No module named 'model'


Do I need to "import my_svr.model as model" then?  Adding that line changes 
nothing. I get the exact same "ImportError: No module named 'model'".  

Likewise for "import my_svr", "from my_svr import *", or even "from 
my_svr.model import SVRModel".

It is clear that I'm failing to understand something important.

I do not have any circular import dependencies; however, some of the files in 
my package do need to import definitions from files earlier in my data 
pipeline.  In order to make everything work inside the module, as well as 
making a parent-folder "import my_svr" work from a iPython,  I find myself 
needing to use statements like these inside my training.py program:


try:
from model import *
from sampling import *
except ImportError:
from .model import *
from .sampling import *


This bothers me.  I don't know whether it is correct usage.  I don't know 
whether it is causing my remaining ImportError problem.

Any advice is appreciated.  Thanks!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: include "icudt53.dll, icuin53.dll, icuuc53.dll" ?

2015-01-07 Thread Albert-Jan Roskam


On Wed, Jan 7, 2015 2:09 PM CET Timothy W. Grove wrote:

>Does anyone have an idea of what the following .dll's are for? Cx_freeze 
>includes them in a Python3.4-PyQt5 deployment adding about 23 Mb to my 
>application. Removing them doesn't appear to make any difference on my 
>computer, but I hesitate to distribute the application to others without them. 
>Thanks for any response.
>
>Best regards,
>Tim Grove
>
>icudt53.dll
>icuin53.dll
>icuuc53.dll
>-- 

It is unicode related, see 
http://stackoverflow.com/questions/16736579/deploying-a-qt-project-without-icu-dependencies

https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: String character encoding when converting data from one type/format to another

2015-01-07 Thread Dave Angel

On 01/07/2015 08:38 AM, Jacob Kruger wrote:

Thanks.

Makes more sense now, and yes, using 2.7 here.

Unfortunately, while could pass the binary values into blob fields well
enough, using forms of parameterised statements, the actual generation
of sql script text files is a step they want to work with at times, if
someone is handling this on site, so had to work first with generating
string values, and then handle executing those statements against a
MySQL server later on using MySQLdb.



There must be an encoding method used for describing blobs in an sql 
statement.  Use exactly that method;  don't try to get creative.


For example in sqlite, use  sqlite.encode()

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: String character encoding when converting data from one type/format to another

2015-01-07 Thread Dave Angel

On 01/07/2015 08:32 AM, Jacob Kruger wrote:

Thanks.

Please don't top-post.  Put your responses after each quoted part you're 
responding to.  And if there are parts you're not responding to, please 
delete them.


Issue with knowing encoding could just be that am pretty sure at least
some of the data capture is done via copy/paste from one MS app to
another, which could possibly result in a whole bunch of different
character sets, etc. being copied across, so it comes down to that while
can't control sources of data, need to manipulate/work with it to make
it useful on our side now.



Copy/paste to/from properly written Windows programs is done in Unicode, 
so the problem should only be one of how the data was saved.  There, 
Windows is much more sloppy.


Chances are that a given machine will use a consistent encoding, so a 
given file should be consistent, unless it was used over a network.  And 
if all the machines that generate this data are in the same company, 
they might all use the same one as well.



--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: String character encoding when converting data from one type/format to another

2015-01-07 Thread Jacob Kruger

Thanks.

Yes, sorry didn't mention 2.7, and, unfortunately in this sense, all of this 
will be running on windows machines.


Stay well

Jacob Kruger
Blind Biker
Skype: BlindZA
"Roger Wilco wants to welcome you...to the space janitor's closet..."

- Original Message - 
From: "Dave Angel" 

To: 
Sent: Wednesday, January 07, 2015 2:22 PM
Subject: Re: String character encoding when converting data from one 
type/format to another




On 01/07/2015 06:04 AM, Jacob Kruger wrote:
I'm busy using something like pyodbc to pull data out of MS access .mdb 
files, and then generate .sql script files to execute


 against MySQL databases using MySQLdb module, but, issue is forms of 
characters in string values that don't fit inside


 the 0-127 range - current one seems to be something like \xa3, and if I 
pass it through ord() function,


 it comes out as character number 163.

First question, of course is what version of Python.  Clearly, you're not 
using Python 3.x, so I'll assume 2.7.  But you really should specify it in 
your query.


Next question is what OS you're using.  You're reading .mdb files, which 
are most likely created in Windows, but that doesn't guarantee you're 
actually using Windows to do this conversion.





Now issue is, yes, could just run through the hundreds of thousands of 
characters in these resulting strings, and strip out any that are not 
within the basic 0-127 range, but, that could result in corrupting data - 
think so anyway.


Anyway, issue is, for example, if I try something like 
str('\xa3').encode('utf-8') or str('\xa3').encode('ascii'), or 
str('\xa3').encode('latin7') - that last one is actually our preferred 
encoding for the MySQL database - they all just tell me they can't work 
with a character out of range.




That's because your input data isn't ASCII.  So before you encode it, you 
have to decode it.  Any idea what encoding it's already in?  Maybe it's in 
latin1, which permits all 256 values.  Or utf-8, which permits a few 
hundred thousand values, but uses multiple bytes for any of those over 
127.  Or any of hundreds of other encodings.


Does an .mdb file have a field specifying what encoding was used?  Or do 
you have some other external knowledge?


If you don't know what encoding it's currently in, you'll have to guess, 
and the guess you're using so far is ASCII, which you know is false.


As for the encoding you should actually use in the database, that almost 
certainly ought to be utf-8, which supports far more international 
characters than latin1.  And make sure the database has a way to tell the 
future user what encoding you picked.


Any thoughts on a sort of generic method/means to handle any/all 
characters that might be out of range when having pulled them out of 
something like these MS access databases?


The only invalid characters are those which aren't valid in the encoding 
used.  Those can probably be safely converted to "?" or something similar.




Another side note is for binary values that might store binary values, I 
use something like the following to generate hex-based strings that work 
alright when then inserting said same binary values into longblob fields, 
but, don't think this would really help for what are really just most 
likely badly chosen copy/pasted strings from documents, with strange 
encoding, or something:

#sample code line for binary encoding into string output
s_values += "0x" + str(l_data[J][I]).encode("hex").replace("\\", "") 
+ ", "




Best to not pretend they're text at all.  But if your db doesn't support 
binary blobs, then use an encoding which supports all 256 values 
unambiguously, while producing printable characters.  Like uuencode, using 
module uu


You might also look into mime, where you store the encoding of the data 
with the data.  See for example mimetypes.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list



--
https://mail.python.org/mailman/listinfo/python-list


Re: String character encoding when converting data from one type/format to another

2015-01-07 Thread Jacob Kruger

Thanks.

Makes more sense now, and yes, using 2.7 here.

Unfortunately, while could pass the binary values into blob fields well 
enough, using forms of parameterised statements, the actual generation of 
sql script text files is a step they want to work with at times, if someone 
is handling this on site, so had to work first with generating string 
values, and then handle executing those statements against a MySQL server 
later on using MySQLdb.


Stay well

Jacob Kruger
Blind Biker
Skype: BlindZA
"Roger Wilco wants to welcome you...to the space janitor's closet..."

- Original Message - 
From: "Peter Otten" <__pete...@web.de>

To: 
Sent: Wednesday, January 07, 2015 2:11 PM
Subject: Re: String character encoding when converting data from one 
type/format to another




Jacob Kruger wrote:


I'm busy using something like pyodbc to pull data out of MS access .mdb
files, and then generate .sql script files to execute against MySQL
databases using MySQLdb module, but, issue is forms of characters in
string values that don't fit inside the 0-127 range - current one seems 
to

be something like \xa3, and if I pass it through ord() function, it comes
out as character number 163.

Now issue is, yes, could just run through the hundreds of thousands of
characters in these resulting strings, and strip out any that are not
within the basic 0-127 range, but, that could result in corrupting data -
think so anyway.

Anyway, issue is, for example, if I try something like
str('\xa3').encode('utf-8') or str('\xa3').encode('ascii'), or


"\xa3" already is a str; str("\xa3") is as redundant as
str(str(str("\xa3"))) ;)


str('\xa3').encode('latin7') - that last one is actually our preferred
encoding for the MySQL database - they all just tell me they can't work
with a character out of range.


encode() goes from unicode to byte; you want to convert bytes to unicode 
and

thus need decode().

In this context it is important that you tell us the Python version. In
Python 2 str.encode(encoding) is basically

str.decode("ascii").encode(encoding)

which is why you probably got a UnicodeDecodeError in the traceback:


"\xa3".encode("latin7")

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
 File "/usr/lib/python2.7/encodings/iso8859_13.py", line 12, in encode
   return codecs.charmap_encode(input,errors,encoding_table)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa3 in position 0:
ordinal not in range(128)


"\xa3".decode("latin7")

u'\xa3'

print "\xa3".decode("latin7")

£

Aside: always include the traceback in your posts -- and always read it
carefully. The fact that "latin7" is not mentioned might have given you a
hint that the problem was not what you thought it was.


Any thoughts on a sort of generic method/means to handle any/all
characters that might be out of range when having pulled them out of
something like these MS access databases?


Assuming the data in Access is not broken and that you know the encoding
decode() will work.


Another side note is for binary values that might store binary values, I
use something like the following to generate hex-based strings that work
alright when then inserting said same binary values into longblob fields,
but, don't think this would really help for what are really just most
likely badly chosen copy/pasted strings from documents, with strange
encoding, or something:
#sample code line for binary encoding into string output
s_values += "0x" + str(l_data[J][I]).encode("hex").replace("\\", "") 
+

", "


I would expect that you can feed bytestrings directly into blobs, without
any preparatory step. Try it, and if you get failures show us the failing
code and the corresponding traceback.

--
https://mail.python.org/mailman/listinfo/python-list



--
https://mail.python.org/mailman/listinfo/python-list


Re: String character encoding when converting data from one type/format to another

2015-01-07 Thread Jacob Kruger

Thanks.

Yes, using python 2.7, and all you said makes sense, so will check out the 
talk, and the byte'ing, etc. (yes, bad joke, I know)


Issue with knowing encoding could just be that am pretty sure at least some 
of the data capture is done via copy/paste from one MS app to another, which 
could possibly result in a whole bunch of different character sets, etc. 
being copied across, so it comes down to that while can't control sources of 
data, need to manipulate/work with it to make it useful on our side now.


Thanks again

Jacob Kruger
Blind Biker
Skype: BlindZA
"Roger Wilco wants to welcome you...to the space janitor's closet..."

- Original Message - 
From: "Ned Batchelder" 

To: 
Sent: Wednesday, January 07, 2015 2:02 PM
Subject: Re: String character encoding when converting data from one 
type/format to another




On 1/7/15 6:04 AM, Jacob Kruger wrote:

I'm busy using something like pyodbc to pull data out of MS access .mdb
files, and then generate .sql script files to execute against MySQL
databases using MySQLdb module, but, issue is forms of characters in
string values that don't fit inside the 0-127 range - current one seems
to be something like \xa3, and if I pass it through ord() function, it
comes out as character number 163.
Now issue is, yes, could just run through the hundreds of thousands of
characters in these resulting strings, and strip out any that are not
within the basic 0-127 range, but, that could result in corrupting data
- think so anyway.


That will definitely corrupt your data, since you will be discarding data.


Anyway, issue is, for example, if I try something like
str('\xa3').encode('utf-8') or str('\xa3').encode('ascii'), or
str('\xa3').encode('latin7') - that last one is actually our preferred
encoding for the MySQL database - they all just tell me they can't work
with a character out of range.


Are you using Python 2 or Python 3? This is one area where the two are 
very different.  I suspect you are on Python 2, in which case these all 
fail the same way because you are calling encode on a bytestring.  You 
can't encode a bytestring, you can only encode a Unicode string, so encode 
is helpfully trying to decode your bytestring first, using the default 
encoding (ascii), and '\xa3' is not an ascii character.


If that was confusing, this talk covers these fundamentals: 
http://bit.ly/unipain .



Any thoughts on a sort of generic method/means to handle any/all
characters that might be out of range when having pulled them out of
something like these MS access databases?


The best thing is to know what encoding was used to produce these byte 
values.  Then you can manipulate them as Unicode if you need to.  The 
second best thing is to simply pass them through as bytes.



Another side note is for binary values that might store binary values, I
use something like the following to generate hex-based strings that work
alright when then inserting said same binary values into longblob
fields, but, don't think this would really help for what are really just
most likely badly chosen copy/pasted strings from documents, with
strange encoding, or something:
#sample code line for binary encoding into string output
s_values += "0x" + str(l_data[J][I]).encode("hex").replace("\\", "")
+ ", "
TIA

Jacob Kruger
Blind Biker
Skype: BlindZA
"Roger Wilco wants to welcome you...to the space janitor's closet..."





--
Ned Batchelder, http://nedbatchelder.com

--
https://mail.python.org/mailman/listinfo/python-list



--
https://mail.python.org/mailman/listinfo/python-list


Re: include "icudt53.dll, icuin53.dll, icuuc53.dll" ?

2015-01-07 Thread Chris Angelico
On Thu, Jan 8, 2015 at 12:23 AM, Timothy W. Grove  wrote:
> I think my answer is probably "Yes!" Anyone else interested, see
> http://qt-project.org/wiki/Deploying-Windows-Applications.

This is one of the disadvantages to packaging Python code up into a
monolithic executable. You end up needing quite a bit of ancillary
'stuff' to make it all work. Personally, I'd rather just distribute
the .py files, and let people download Python separately; much less
load on my servers, and they don't have to download Python twice for
two programs.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: include "icudt53.dll, icuin53.dll, icuuc53.dll" ?

2015-01-07 Thread Timothy W. Grove
I think my answer is probably "Yes!" Anyone else interested, see 
http://qt-project.org/wiki/Deploying-Windows-Applications.


Tim

On 07/01/2015 13:09, Timothy W. Grove wrote:
Does anyone have an idea of what the following .dll's are for? 
Cx_freeze includes them in a Python3.4-PyQt5 deployment adding about 
23 Mb to my application. Removing them doesn't appear to make any 
difference on my computer, but I hesitate to distribute the 
application to others without them. Thanks for any response.


Best regards,
Tim Grove

icudt53.dll
icuin53.dll
icuuc53.dll


--
https://mail.python.org/mailman/listinfo/python-list


include "icudt53.dll, icuin53.dll, icuuc53.dll" ?

2015-01-07 Thread Timothy W. Grove
Does anyone have an idea of what the following .dll's are for? Cx_freeze 
includes them in a Python3.4-PyQt5 deployment adding about 23 Mb to my 
application. Removing them doesn't appear to make any difference on my 
computer, but I hesitate to distribute the application to others without 
them. Thanks for any response.


Best regards,
Tim Grove

icudt53.dll
icuin53.dll
icuuc53.dll
--
https://mail.python.org/mailman/listinfo/python-list


Re: String character encoding when converting data from one type/format to another

2015-01-07 Thread Chris Angelico
On Wed, Jan 7, 2015 at 11:02 PM, Ned Batchelder  wrote:
>> Any thoughts on a sort of generic method/means to handle any/all
>> characters that might be out of range when having pulled them out of
>> something like these MS access databases?
>
>
> The best thing is to know what encoding was used to produce these byte
> values.  Then you can manipulate them as Unicode if you need to.  The second
> best thing is to simply pass them through as bytes.

If you can't know for sure, you could hazard a guess. There's a good
chance that an eight-bit encoding from a Microsoft product is CP-1252.
In fact, when I interoperate with Unicode-unaware Windows programs, I
usually attempt a UTF-8 decode, and if that fails, I simply assume
CP-1252; this generally gives correct results for data coming from
US-English Windows users.

Jacob, have a look at your data. Contextually, would the '\xa3' be
likely to be a pound sign, £? Would '\x85' make sense as an ellipsis?
Would \x91, \x92, \x93, and \x94 seem to be used for quote marks? If
so, CP-1252 would be the encoding to use.
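
That try-UTF-8-then-fall-back approach fits in a few lines. A sketch, with a 
final latin-1 fallback added because CP-1252 leaves a handful of byte values 
(such as \x81 and \x8d) undefined:

```python
def guess_decode(raw):
    # Try UTF-8 first; a lone high byte such as b"\xa3" is invalid
    # UTF-8, so it falls through to CP-1252.  latin-1 is the last
    # resort: it maps all 256 byte values, so the loop always returns.
    for encoding in ("utf-8", "cp1252", "latin-1"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue

print(guess_decode(b"\xa3"))                 # pound sign via CP-1252
print(guess_decode("café".encode("utf-8")))  # valid UTF-8 wins
```

This is a heuristic, of course: it gives correct results only when the data 
really is UTF-8 or CP-1252, which is a good bet for US/Western-European 
Windows sources but not a guarantee.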

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


how to create a soap enveloppe with python suds for consume a bssv web service

2015-01-07 Thread brice DORA
Hi all. I am working on an app in Python and I have to call a web service 
deployed on JDE (bssv). For this I use the suds lib, which seems pretty 
friendly. The problem is that the JDE web service, which uses bssv technology, 
necessarily requires sending a SOAP envelope. So far I have been passing the 
required parameters to my suds client. My question, then, is how to work around 
this problem: is it possible to create a SOAP envelope with suds? Thank you in advance
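
As an aside for the archives: suds generates the SOAP envelope itself, and you 
can make it print the exact envelope it sends by raising its loggers to DEBUG. 
A minimal sketch of that logging setup, assuming the logger names 
("suds.client", "suds.transport") documented by suds:

```python
import logging

# suds writes the outgoing request envelope and the reply to its
# "suds.client" logger, and the raw HTTP traffic to "suds.transport"
# (names assumed from the suds documentation); DEBUG prints them all.
logging.basicConfig(level=logging.INFO)
for name in ("suds.client", "suds.transport"):
    logging.getLogger(name).setLevel(logging.DEBUG)

# After this, every client.service.SomeOperation(...) call made with
# a suds Client logs the full SOAP envelope that suds generated.
print(logging.getLogger("suds.client").getEffectiveLevel() == logging.DEBUG)
```

With the logged envelope in hand, you can compare it against what the JDE 
service expects and raise the discrepancy with whoever runs the service.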
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: String character encoding when converting data from one type/format to another

2015-01-07 Thread Dave Angel

On 01/07/2015 06:04 AM, Jacob Kruger wrote:

I'm busy using something like pyodbc to pull data out of MS access .mdb files, 
and then generate .sql script files to execute


 against MySQL databases using MySQLdb module, but, issue is forms of 
characters in string values that don't fit inside


 the 0-127 range - current one seems to be something like \xa3, and if 
I pass it through ord() function,


 it comes out as character number 163.

First question, of course is what version of Python.  Clearly, you're 
not using Python 3.x, so I'll assume 2.7.  But you really should specify 
it in your query.


Next question is what OS you're using.  You're reading .mdb files, which 
are most likely created in Windows, but that doesn't guarantee you're 
actually using Windows to do this conversion.





Now issue is, yes, could just run through the hundreds of thousands of 
characters in these resulting strings, and strip out any that are not within 
the basic 0-127 range, but, that could result in corrupting data - think so 
anyway.

Anyway, issue is, for example, if I try something like 
str('\xa3').encode('utf-8') or str('\xa3').encode('ascii'), or 
str('\xa3').encode('latin7') - that last one is actually our preferred encoding 
for the MySQL database - they all just tell me they can't work with a character 
out of range.



That's because your input data isn't ASCII.  So before you encode it, 
you have to decode it.  Any idea what encoding it's already in?  Maybe 
it's in latin1, which permits all 256 values.  Or utf-8, which permits a 
few hundred thousand values, but uses multiple bytes for any of those 
over 127.  Or any of hundreds of other encodings.


Does an .mdb file have a field specifying what encoding was used?  Or do 
you have some other external knowledge?


If you don't know what encoding it's currently in, you'll have to guess, 
and the guess you're using so far is ASCII, which you know is false.


As for the encoding you should actually use in the database, that almost 
certainly ought to be utf-8, which supports far more international 
characters than latin1.  And make sure the database has a way to tell 
the future user what encoding you picked.



Any thoughts on a sort of generic method/means to handle any/all characters 
that might be out of range when having pulled them out of something like these 
MS access databases?


The only invalid characters are those which aren't valid in the encoding 
used.  Those can probably be safely converted to "?" or something similar.




Another side note is for binary values that might store binary values, I use 
something like the following to generate hex-based strings that work alright 
when then inserting said same binary values into longblob fields, but, don't 
think this would really help for what are really just most likely badly chosen 
copy/pasted strings from documents, with strange encoding, or something:
#sample code line for binary encoding into string output
s_values += "0x" + str(l_data[J][I]).encode("hex").replace("\\", "") + ", "



Best to not pretend they're text at all.  But if your db doesn't support 
binary blobs, then use an encoding which supports all 256 values 
unambiguously, while producing printable characters.  Like uuencode, 
using module uu


You might also look into mime, where you store the encoding of the data 
with the data.  See for example mimetypes.




--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: String character encoding when converting data from one type/format to another

2015-01-07 Thread Peter Otten
Jacob Kruger wrote:

> I'm busy using something like pyodbc to pull data out of MS access .mdb
> files, and then generate .sql script files to execute against MySQL
> databases using MySQLdb module, but, issue is forms of characters in
> string values that don't fit inside the 0-127 range - current one seems to
> be something like \xa3, and if I pass it through ord() function, it comes
> out as character number 163.
> 
> Now issue is, yes, could just run through the hundreds of thousands of
> characters in these resulting strings, and strip out any that are not
> within the basic 0-127 range, but, that could result in corrupting data -
> think so anyway.
> 
> Anyway, issue is, for example, if I try something like
> str('\xa3').encode('utf-8') or str('\xa3').encode('ascii'), or

"\xa3" already is a str; str("\xa3") is as redundant as 
str(str(str("\xa3"))) ;)

> str('\xa3').encode('latin7') - that last one is actually our preferred
> encoding for the MySQL database - they all just tell me they can't work
> with a character out of range.

encode() goes from unicode to byte; you want to convert bytes to unicode and 
thus need decode().

In this context it is important that you tell us the Python version. In 
Python 2 str.encode(encoding) is basically 

str.decode("ascii").encode(encoding)

which is why you probably got a UnicodeDecodeError in the traceback:

>>> "\xa3".encode("latin7")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/encodings/iso8859_13.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_table)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa3 in position 0: 
ordinal not in range(128)

>>> "\xa3".decode("latin7")
u'\xa3'
>>> print "\xa3".decode("latin7")
£

Aside: always include the traceback in your posts -- and always read it 
carefully. The fact that "latin7" is not mentioned might have given you a 
hint that the problem was not what you thought it was.

> Any thoughts on a sort of generic method/means to handle any/all
> characters that might be out of range when having pulled them out of
> something like these MS access databases?

Assuming the data in Access is not broken and that you know the encoding
decode() will work.

> Another side note is for binary values that might store binary values, I
> use something like the following to generate hex-based strings that work
> alright when then inserting said same binary values into longblob fields,
> but, don't think this would really help for what are really just most
> likely badly chosen copy/pasted strings from documents, with strange
> encoding, or something:
> #sample code line for binary encoding into string output
> s_values += "0x" + str(l_data[J][I]).encode("hex").replace("\\", "") +
> ", "

I would expect that you can feed bytestrings directly into blobs, without 
any preparatory step. Try it, and if you get failures show us the failing 
code and the corresponding traceback.
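
For the blob case that works out to an ordinary one-placeholder parameterised 
insert. A sketch using the stdlib sqlite3 module for illustration; MySQLdb is 
the same shape, except its placeholder is %s rather than ?:

```python
import sqlite3

# In-memory database just for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (payload BLOB)")

raw = bytes(range(256))             # arbitrary binary data, all byte values
conn.execute("INSERT INTO files (payload) VALUES (?)", (raw,))

# The driver handles the binary escaping; no hex round-trip needed.
(stored,) = conn.execute("SELECT payload FROM files").fetchone()
print(stored == raw)                # True
```

Hand-building hex literals in SQL text is only needed when the output must be 
a human-editable .sql script, as in the original poster's workflow.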

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: String character encoding when converting data from one type/format to another

2015-01-07 Thread Ned Batchelder

On 1/7/15 6:04 AM, Jacob Kruger wrote:

I'm busy using something like pyodbc to pull data out of MS access .mdb
files, and then generate .sql script files to execute against MySQL
databases using MySQLdb module, but, issue is forms of characters in
string values that don't fit inside the 0-127 range - current one seems
to be something like \xa3, and if I pass it through ord() function, it
comes out as character number 163.
Now issue is, yes, could just run through the hundreds of thousands of
characters in these resulting strings, and strip out any that are not
within the basic 0-127 range, but, that could result in corrupting data
- think so anyway.


That will definitely corrupt your data, since you will be discarding data.


Anyway, issue is, for example, if I try something like
str('\xa3').encode('utf-8') or str('\xa3').encode('ascii'), or
str('\xa3').encode('latin7') - that last one is actually our preferred
encoding for the MySQL database - they all just tell me they can't work
with a character out of range.


Are you using Python 2 or Python 3? This is one area where the two are 
very different.  I suspect you are on Python 2, in which case these all 
fail the same way because you are calling encode on a bytestring.  You 
can't encode a bytestring, you can only encode a Unicode string, so 
encode is helpfully trying to decode your bytestring first, using the 
default encoding (ascii), and '\xa3' is not an ascii character.


If that was confusing, this talk covers these fundamentals: 
http://bit.ly/unipain .



Any thoughts on a sort of generic method/means to handle any/all
characters that might be out of range when having pulled them out of
something like these MS access databases?


The best thing is to know what encoding was used to produce these byte 
values.  Then you can manipulate them as Unicode if you need to.  The 
second best thing is to simply pass them through as bytes.



Another side note is for binary values that might store binary values, I
use something like the following to generate hex-based strings that work
alright when then inserting said same binary values into longblob
fields, but, don't think this would really help for what are really just
most likely badly chosen copy/pasted strings from documents, with
strange encoding, or something:
#sample code line for binary encoding into string output
s_values += "0x" + str(l_data[J][I]).encode("hex").replace("\\", "")
+ ", "
TIA

Jacob Kruger
Blind Biker
Skype: BlindZA
"Roger Wilco wants to welcome you...to the space janitor's closet..."





--
Ned Batchelder, http://nedbatchelder.com

--
https://mail.python.org/mailman/listinfo/python-list


String character encoding when converting data from one type/format to another

2015-01-07 Thread Jacob Kruger
I'm busy using something like pyodbc to pull data out of MS access .mdb files, 
and then generate .sql script files to execute against MySQL databases using 
MySQLdb module, but, issue is forms of characters in string values that don't 
fit inside the 0-127 range - current one seems to be something like \xa3, and 
if I pass it through ord() function, it comes out as character number 163.

Now issue is, yes, could just run through the hundreds of thousands of 
characters in these resulting strings, and strip out any that are not within 
the basic 0-127 range, but, that could result in corrupting data - think so 
anyway.

Anyway, issue is, for example, if I try something like 
str('\xa3').encode('utf-8') or str('\xa3').encode('ascii'), or 
str('\xa3').encode('latin7') - that last one is actually our preferred encoding 
for the MySQL database - they all just tell me they can't work with a character 
out of range.

Any thoughts on a sort of generic method/means to handle any/all characters 
that might be out of range when having pulled them out of something like these 
MS access databases?

Another side note is for binary values that might store binary values, I use 
something like the following to generate hex-based strings that work alright 
when then inserting said same binary values into longblob fields, but, don't 
think this would really help for what are really just most likely badly chosen 
copy/pasted strings from documents, with strange encoding, or something:
#sample code line for binary encoding into string output
s_values += "0x" + str(l_data[J][I]).encode("hex").replace("\\", "") + ", "

TIA

Jacob Kruger
Blind Biker
Skype: BlindZA
"Roger Wilco wants to welcome you...to the space janitor's closet..."
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Comparisons and sorting of a numeric class....

2015-01-07 Thread Marko Rauhamaa
Steven D'Aprano :

> int 0 is a falsey object
> NoneType None is a falsey object
> str 'hello' is a truthy object
> float 23.0 is a truthy object

I prefer the Scheme way:

   #f is a falsey object

   everything else is a truthy object


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Comparisons and sorting of a numeric class....

2015-01-07 Thread Chris Angelico
On Wed, Jan 7, 2015 at 7:10 PM, Steven D'Aprano  wrote:
> You can make an object which quacks like a bool
> (or list, tuple, dict, bool, str...), swims like a bool...

Huh. You mean like an Olympic swimming bool?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Comparisons and sorting of a numeric class....

2015-01-07 Thread Steven D'Aprano
On Tue, 06 Jan 2015 18:01:48 -0800, Andrew Robinson wrote:

> but if you can't subclass a built in type -- you can't duck type it --
> for I seem to recall that Python forbids duck typing any built in class
> but not subclasses.

I fear that you have completely misunderstood the nature of duck-typing.

The name comes from the phrase "if it walks like a duck and swims like a 
duck and quacks like a duck, it might as well be a duck". The idea with 
duck-typing is that you don't care whether an object *actually is* a bool 
(list, float, dict, etc.) but only whether it offers the same interface 
as a bool. Not necessarily the entire interface, but just the parts you 
need: if you need something that quacks, you shouldn't care whether or 
not it has a swim() method.

In the case of bool, literally every object in Python can "quack like a 
bool", so to speak, unless you deliberately go out of your way to prevent 
it. Here is an example of duck-typing non-bools in a boolean context:


py> values = [0, None, "hello", 23.0, TypeError, {'a': 42}, {}, len]
py> for obj in values:
...     typename = type(obj).__name__
...     if obj:
...         print "%s %r is a truthy object" % (typename, obj)
...     else:
...         print "%s %r is a falsey object" % (typename, obj)
... 
int 0 is a falsey object
NoneType None is a falsey object
str 'hello' is a truthy object
float 23.0 is a truthy object
type <type 'exceptions.TypeError'> is a truthy object
dict {'a': 42} is a truthy object
dict {} is a falsey object
builtin_function_or_method <built-in function len> is a truthy object


You can see I didn't convert obj to a bool, I just used obj in an if-
statement as if it were a bool. That's duck-typing.
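Under the hood, an object's behaviour in a boolean context is controlled by a single hook: `__nonzero__` in Python 2, renamed `__bool__` in Python 3. A minimal sketch (Python 3 spelling; `Flag` is a made-up class purely for illustration):

```python
class Flag(object):
    """Hypothetical object that quacks like a bool in truth contexts."""
    def __init__(self, state):
        self.state = state
    def __bool__(self):            # Python 3 truth hook
        return self.state
    __nonzero__ = __bool__         # Python 2 spelling of the same hook

# Instances can be used directly in an if-statement -- that's duck-typing:
results = []
for obj in (Flag(True), Flag(False)):
    if obj:
        results.append("truthy")
    else:
        results.append("falsey")
```

Objects that define neither hook (nor `__len__`) are simply truthy by default, which is why every object quacks like a bool unless you deliberately go out of your way to prevent it.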


[...]
>>> Why this is so important to Guido, I don't know ... but it's making it
>>> VERY difficult to add named aliases of False which will still be
>>> detected as False and type-checkable as a bool.

That part is absolutely wrong. Given these three named aliases of False, 
I challenge you to find any way to distinguish them from the actual False:

NOT_TRUE = False
NICHTS = False
WRONG = False


That's a safe bet, of course, because those three aliases are just names 
bound to the False object. You can't distinguish the WRONG object from 
the False object because they are the same object.

(You can, however, re-bind the WRONG *name* to a different object:

WRONG = "something else"

But that is another story.)
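The identity of the aliases, and the name-versus-object distinction, are easy to check directly; a quick sketch:

```python
NOT_TRUE = False
NICHTS = False
WRONG = False

# All three names are bound to the one and only False object:
assert NOT_TRUE is NICHTS is WRONG is False
assert type(WRONG) is bool

# Rebinding the name WRONG leaves the False object itself untouched:
WRONG = "something else"
assert WRONG is not False
assert NICHTS is False
```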

>>> If my objects don't
>>> type check right -- they will likely break some people's legacy
>>> code... 

There's nothing you can do about that. You can't control what stupid 
things people might choose to do:

if str(flag).lower() == 'false':
    print "flag is false"


All you can do is offer to support some set of operations. It's up to 
your users whether or not they will limit themselves to the operations 
you promise to support. You can make an object which quacks like a bool 
(or list, tuple, dict, bool, str...), swims like a bool and walks like a 
bool, but ultimately Python's introspection powers are too strong for you 
to fool everybody that it *actually is* a bool.
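For instance, since `bool` itself cannot be subclassed, the closest you can get is a wrapper that quacks like False -- and even a perfect quacker is unmasked by `type()`, `isinstance()` and `is`. A sketch (`FakeFalse` is a made-up name for illustration, in Python 3 spelling with the Python 2 hook aliased alongside):

```python
class FakeFalse(object):
    """Hypothetical impersonator: quacks like False, prints like False."""
    def __bool__(self):
        return False
    __nonzero__ = __bool__         # Python 2 spelling of the truth hook
    def __repr__(self):
        return "False"

fake = FakeFalse()

assert not fake                    # quacks like False...
assert repr(fake) == "False"       # ...and even prints as False,
assert type(fake) is not bool      # but introspection sees through it:
assert not isinstance(fake, bool)
assert fake is not False
```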

And you know, that's actually a good thing.


[...]
> In 'C++' I can define a subclass without ever instantiating it; and I
> can define static member functions of the subclass that operate even
> when there exists not a single instance of the class; and I can typecast
> an instance of the base class as being an instance of the subclass.

And in Python, we can do all those things too, except that what you call 
"static member function" we call "class method". But we hardly ever 
bother, because it's simply not needed or is an unnatural way to do 
things in Python. But to prove it can be done:


from abc import ABCMeta

class MyFloat(float):
    __metaclass__ = ABCMeta
    @classmethod
    def __subclasshook__(cls, C):
        if cls is MyFloat:
            if C is float: return True
        return NotImplemented
    @classmethod
    def spam(cls, n):
        return ' '.join(["spam"]*n)

MyFloat.register(float)


And in use:

py> MyFloat.spam(5)
'spam spam spam spam spam'
py> isinstance(23.0, MyFloat)
True


Should you do this? Almost certainly not.
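One concrete reason, sketched here in the Python 3 spelling (`metaclass=` keyword in the class header instead of the `__metaclass__` attribute): the `__subclasshook__` trick makes the `isinstance` check lie, because plain floats pass the check without ever gaining the class's methods.

```python
from abc import ABCMeta

class MyFloat(float, metaclass=ABCMeta):
    @classmethod
    def __subclasshook__(cls, C):
        if cls is MyFloat:
            if C is float:
                return True
        return NotImplemented
    @classmethod
    def spam(cls, n):
        return ' '.join(["spam"] * n)

assert isinstance(23.0, MyFloat)    # the hook says 23.0 "is a" MyFloat...
lied = False
try:
    (23.0).spam(2)                  # ...but plain floats never got spam()
except AttributeError:
    lied = True
assert lied
assert MyFloat(23.0).spam(2) == 'spam spam'   # only real instances have it
```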


> So
> -- (against what Guido seems to have considered) I can define a function
> anywhere which returns my new subclass object as its return value
> without ever instantiating the subclass -- because my new function can
> simply return a typecasting of a base class instance;  The user of my
> function would never need to know that the subclass itself was never
> instantiated... for they would only be allowed to call static member
> functions on the subclass anyway, but all the usual methods found in the
> superclass(es) would still be available to them.  All the benefits of
> subclassing still exist, without ever needing to violate the singleton
> character of the base class instance.

This is all very well and good, but what exactly is the point of it all? 
What is the *actual problem* this convoluted mess is supposed to solve?

An awful l