Re: Pickle based workflow - looking for advice

2015-04-14 Thread Fabien

On 14.04.2015 06:05, Chris Angelico wrote:

Not sure what you mean, here. Any given file will be written by
exactly one process? No possible problem. Multiprocessing within one
application doesn't change that.


yes that's what I meant. Thanks!
--
https://mail.python.org/mailman/listinfo/python-list


How To Convert Pdf Files To XML Spreadsheet

2015-04-14 Thread traciscrouch
A lot will be helped by an effective solution to change PDF to workplace
document. For those who have over the casual have to convert PDF to Office
records, installing and getting aPDF Converteris a fast and basic strategy.
Just a few seconds are installed in by this system. It could convert PDF to
PowerPoint, Term, HTML format. Pdf To XML transformation function is going
to be built-into this converter shortly.

When you've done picking your controls and Calibre press Alright may convert
the guide. This could have a short while, with respect Free Pdf To XML the
PDF's measurement. When the conversion seems to be getting too much time,
Show job details can click to learn more to the development.

https://sourceforge.net/projects/pdftoxmlconverter/





Re: find all multiplicands and multipliers for a number

2015-04-14 Thread Rustom Mody
On Tuesday, April 14, 2015 at 8:12:17 AM UTC+5:30, Paul Rubin wrote:
> Steven D'Aprano  writes:
> > http://code.activestate.com/recipes/577821-integer-square-root-function/
> 
> The methods there are more "mathematical" but probably slower than what
> I posted.
> 
> Just for laughs, this prints the first 20 primes using Python 3's 
> "yield from":
> 
> import itertools
> 
> def sieve(ps):
> p = ps.__next__()
> yield p
> yield from sieve(a for a in ps if a % p != 0)
> 
> primes = sieve(itertools.count(2))
> print(list(itertools.islice(primes,20)))
> 
> It's not that practical above a few hundred primes, probably.

Up to 490 it's instantaneous.
At 500 the recursion stack overflows.
[Yeah, one can set the recursion limit.]
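For the record, raising the limit does get past the overflow. A hedged sketch (the limit value 20000 is just a generous guess, not a tuned number):

```python
import sys
import itertools

def sieve(ps):
    # Paul's recursive sieve: each prime spawns another nested
    # generator, so each prime costs stack frames on every next().
    p = next(ps)
    yield p
    yield from sieve(a for a in ps if a % p != 0)

# Raising the limit lets the generator chain recurse deeper.
sys.setrecursionlimit(20000)
primes = sieve(itertools.count(2))
print(list(itertools.islice(primes, 500))[-1])  # -> 3571, the 500th prime
```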


Re: How To Convert Pdf Files To XML Spreadsheet

2015-04-14 Thread David H. Lipman

From: "traciscrouch" 


A lot will be helped by an effective solution to change PDF to workplace


Once again this project was spammed.

All users are encouraged to file a complaint with SourceForge.
ab...@sourceforge.net

They have also been caught spamming Web Forums.


--
Dave
Multi-AV Scanning Tool - http://multi-av.thespykiller.co.uk
http://www.pctipp.ch/downloads/dl/35905.asp


Is there functions like the luaL_loadfile and luaL_loadbuffer in lua source to dump the lua scripts in Python's source?

2015-04-14 Thread zhihao chen
Hi, I want to dump the Python script when some application (an Android app)
uses the Python engine; they embed Python in the app.


I can dump the script from an app that uses the Lua engine through
luaL_loadbuffer or luaL_loadfile (just hook the function and dump the Lua
script via its (char *)buffer and size_t arguments).

So I want to know:

Are there functions like luaL_loadfile or luaL_loadbuffer in Python's
source that read the Python file, so that I can hook them to get the
(char *)buffer and size and thereby obtain the *.py or *.pyc?

In other words, which C/C++ function is responsible for loading the
Python scripts or .pyc files? I want to dump the scripts there.

Thank you very much.


Re: How To Convert Pdf Files To XML Spreadsheet

2015-04-14 Thread Gene Heskett
On Tuesday 14 April 2015 07:54:58 David H. Lipman wrote:
> From: "traciscrouch" 
>
> > A lot will be helped by an effective solution to change PDF to
> > workplace
>
> Once again this project was spammed.
>
> All users are encouraged to file a complaint with SourceForge.
> ab...@sourceforge.net
>
Getting them to accept and act on the complaint was not possible the last 
time I bitched.

Get your projects off of SourceForge, and filter the jerks.

> They have also been caught spamming Web Forums.
>
>
> --
> Dave
> Multi-AV Scanning Tool - http://multi-av.thespykiller.co.uk
> http://www.pctipp.ch/downloads/dl/35905.asp

Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


Re: Is there functions like the luaL_loadfile and luaL_loadbuffer in lua source to dump the lua scripts in Python's source?

2015-04-14 Thread Dave Angel

On 04/14/2015 08:07 AM, zhihao chen wrote:

Hi, I want to dump the Python script when some application (an Android
app) uses the Python engine; they embed Python in the app.


I can dump the script from an app that uses the Lua engine through
luaL_loadbuffer or luaL_loadfile (just hook the function and dump the
Lua script via its (char *)buffer and size_t arguments).

So I want to know:

Are there functions like luaL_loadfile or luaL_loadbuffer in Python's
source that read the Python file, so that I can hook them to get the
(char *)buffer and size and thereby obtain the *.py or *.pyc?

In other words, which C/C++ function is responsible for loading the
Python scripts or .pyc files? I want to dump the scripts there.



I know nothing about lua, but the __file__ attribute on most user 
modules will tell you the filename it's loaded from.  Naturally, that 
assumes it actually came from a file, and a few other things.
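A minimal sketch of what that looks like, using a stdlib module as a stand-in (the choice of json here is purely illustrative):

```python
import inspect
import json  # an ordinary pure-Python stdlib module

# __file__ is the path the module was loaded from...
print(json.__file__)

# ...and inspect.getsource returns the source text, provided the
# .py file is still available on disk.
src = inspect.getsource(json)
print(src.splitlines()[0])
```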



--
DaveA


using DictReader() with .decode('utf-8', 'ignore')

2015-04-14 Thread Vincent Davis
I had been reading in a file like so (Python 3):

with open(dfile, 'rb') as f:
    for line in f:
        line = line.decode('utf-8', 'ignore').split(',')

How can I accomplish decode('utf-8', 'ignore') when reading with
DictReader()?


Vincent Davis
720-301-3003


Re: Pickle based workflow - looking for advice

2015-04-14 Thread Steven D'Aprano
On Tue, 14 Apr 2015 05:58 pm, Fabien wrote:

> On 14.04.2015 06:05, Chris Angelico wrote:
>> Not sure what you mean, here. Any given file will be written by
>> exactly one process? No possible problem. Multiprocessing within one
>> application doesn't change that.
> 
> yes that's what I meant. Thanks!

It's not that simple though. If you require files to be written in precisely
a certain order, then parallel processing requires synchronisation.

Suppose you write A, then B, then C, then D, each in it's own process (or
thread). So the B process has to wait for A to finish, the C process has to
wait for B to finish, and so on. Otherwise you could find yourself with C
reading the data from B before B is finished writing it.
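A minimal sketch of that ordering with the stdlib (the stage function and the queue are illustrative; in the real workflow each stage would pass its data through a file):

```python
from multiprocessing import Process, Queue

def stage(name, q):
    # In the real workflow each stage would read its predecessor's
    # file and write its own; here it just reports that it ran.
    q.put(name)

if __name__ == "__main__":
    q = Queue()
    for name in "ABCD":
        p = Process(target=stage, args=(name, q))
        p.start()
        p.join()  # B waits for A, C waits for B, and so on
    order = [q.get() for _ in "ABCD"]
    print(order)  # -> ['A', 'B', 'C', 'D'], every time
```

Joining each stage before launching the next enforces the A -> B -> C -> D order, at the price of losing all parallelism along that chain, which is exactly the trade-off at issue.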


-- 
Steven



Re: using DictReader() with .decode('utf-8', 'ignore')

2015-04-14 Thread Michiel Overtoom

> How can I accomplish decode('utf-8', 'ignore') when reading with
> DictReader()?

Have you tried using the csv module in conjunction with codecs?
There shouldn't be any need to 'ignore' characters.


import csv
import codecs

rs = csv.DictReader(codecs.open(fn, "rbU", "utf8"))
for r in rs:
print(r)

Greetings,

-- 
"You can't actually make computers run faster, you can only make them do less." 
- RiderOfGiraffes



Re: Is there functions like the luaL_loadfile and luaL_loadbuffer in lua source to dump the lua scripts in Python's source?

2015-04-14 Thread Steven D'Aprano
On Tue, 14 Apr 2015 10:07 pm, zhihao chen wrote:

> Hi, I want to dump the Python script when some application (an Android
> app) uses the Python engine; they embed Python in the app.
> 


*If* the source code is still available, which it may not be, then
inspect.getsource can find it:

https://docs.python.org/2/library/inspect.html


For the source code to be dumped, it needs to be Python code (not a built-in
or C extension object), the .py file still needs to be available where
Python can see it (not just the .pyc file), and you need to have read
permission.
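A hedged sketch of those conditions (the csv module is just a convenient pure-Python example; a C-level builtin like len has no retrievable source):

```python
import inspect
import csv

# Works for pure-Python objects whose .py file is still on disk:
print(inspect.getsourcefile(csv))              # path ending in csv.py
print(inspect.getsource(csv.DictReader)[:16])  # start of the class source

# Fails for C builtins, as noted above:
try:
    inspect.getsource(len)
except TypeError as err:
    print("no source available:", err)
```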


-- 
Steven



Re: using DictReader() with .decode('utf-8', 'ignore')

2015-04-14 Thread Steven D'Aprano
On Tue, 14 Apr 2015 10:54 pm, Vincent Davis wrote:

> I had been reading in a file like so (Python 3):
> with open(dfile, 'rb') as f:
>     for line in f:
>         line = line.decode('utf-8', 'ignore').split(',')
> 
> How can I accomplish decode('utf-8', 'ignore') when reading with
> DictReader()?


Which DictReader? Do you mean the one in the csv module? I will assume so.

I haven't tried it, but I think something like this will work:


# untested
with open(dfile, 'r', encoding='utf-8', errors='ignore', newline='') as f:
reader = csv.DictReader(f)
for row in reader:
print(row['fieldname'])



-- 
Steven



Re: find all multiplicands and multipliers for a number

2015-04-14 Thread Steven D'Aprano
On Tue, 14 Apr 2015 08:35 pm, Rustom Mody wrote:

> On Tuesday, April 14, 2015 at 8:12:17 AM UTC+5:30, Paul Rubin wrote:
>> Steven D'Aprano  writes:
>> >
http://code.activestate.com/recipes/577821-integer-square-root-function/
>> 
>> The methods there are more "mathematical" but probably slower than what
>> I posted.
>> 
>> Just for laughs, this prints the first 20 primes using Python 3's
>> "yield from":
>> 
>> import itertools
>> 
>> def sieve(ps):
>> p = ps.__next__()
>> yield p
>> yield from sieve(a for a in ps if a % p != 0)
>> 
>> primes = sieve(itertools.count(2))
>> print(list(itertools.islice(primes,20)))
>> 
>> It's not that practical above a few hundred primes, probably.
> 
> Upto 490 its instantaneous

Not really instantaneous.

py> with Stopwatch():
... x = list(itertools.islice(sieve(itertools.count(2)), 499))
...
time taken: 0.086171 seconds
py> from pyprimes import primes
py> with Stopwatch():
... y = list(itertools.islice(primes(), 499))
...
time taken: 0.002802 seconds
py> x == y
True


I have to admit, I expected it to be significantly slower than it actually
is. Just goes to show, I've been using Python for 15+ years and my
intuition as to what is fast and what is slow is still mediocre at best.

Any beginner who thinks they can optimize code without measuring it first is
deluding themselves.


-- 
Steven



Using Dictionary

2015-04-14 Thread Pippo
How can I use dictionary to save the following information?

#C[Health]
#P[Information]
#ST[genetic information]
#C[oral | (recorded in (any form | medium))]
#C[Is created or received by]
#A[health care provider | health plan | public health authority | employer | 
life insurer | school | university | or health care clearinghouse]
#C[Relates to]
#C[the past, present, or future physical | mental health | condition of an 
individual]
#C[the provision of health care to an individual]
#C[the past, present, or future payment for the provision of health care to an 
individual]
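A hedged sketch of one possible reading of "save in a dictionary" (the parse_tags name and the tag -> values layout are my invention, not from the thread): map each tag (C, P, ST, A) to the list of bracketed values that carry it.

```python
import re
from collections import defaultdict

text = """\
#C[Health]
#P[Information]
#ST[genetic information]
#C[Is created or received by]"""

def parse_tags(text):
    # Each entry looks like #TAG[value]; collect values per tag.
    result = defaultdict(list)
    for tag, value in re.findall(r"#(\w+)\[(.*?)\]", text, re.S):
        result[tag].append(value)
    return dict(result)

print(parse_tags(text))
# e.g. {'C': ['Health', 'Is created or received by'],
#       'P': ['Information'], 'ST': ['genetic information']}
```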


Re: using DictReader() with .decode('utf-8', 'ignore')

2015-04-14 Thread Vincent Davis
> Which DictReader? Do you mean the one in the csv module? I will assume so.
>
yes.


>
> # untested
> with open(dfile, 'r', encoding='utf-8', errors='ignore', newline='') as f:
> reader = csv.DictReader(f)
> for row in reader:
> print(row['fieldname'])
>

What you have seems to work; now I need to go find my strange symbols that
are not 'utf-8' and see what happens.
I thought I had to open with 'rb' to use an encoding?


Vincent Davis


creating a graph out of text

2015-04-14 Thread Pippo
Hi,

I am not sure why my code doesn't show the graph. It clearly creates the right 
nodes and edges but it doesn't show them. Any idea?

import re 
from Tkinter import * 
import tkFileDialog 
import testpattern 
import networkx as nx 
import matplotlib.pyplot as plt 

patternresult =[] 
nodes = {} 
edges = {} 

pattern = ['(#P\[\w*\])', '(#C\[\w*[\s\(\w\,\s\|\)]+\])', 
   '(#ST\[\w*[\s\(\w\,\s\|\)]+\])', '(#A\[\w*[\s\(\w\,\s\|\)]+\])'] 


patternresult = testpattern.patternfinder() 

G=nx.Graph() 


for patres in patternresult: 
G.add_node(patres) 
if patres[1] == "P": 
first_node = patres 


for patres in patternresult: 

if patres[1:3] == "ST": 
G.add_edge(first_node, patres, label="Sub-type") 
elif patres[1:2] == "C": 
G.add_edge(first_node, patres, label="Constraint_by") 
elif patres[1:3] == "A": 
G.add_edge(first_node, patres, label="Created_by") 

#graph = G.edge() 
pos = nx.shell_layout(G) 
#nx.draw(G, pos) 


nx.draw_networkx_nodes(G, pos, nodelist=None, 
   node_size=300, node_color='r', 
   node_shape='o', alpha=1.0, 
   cmap=None, vmin=None, vmax=None, 
   ax=None, linewidths=None) 
nx.draw_networkx_edges(G, pos, edgelist=None, width=1.0, edge_color='k', 
   style='solid',alpha=None, edge_cmap=None, 
   edge_vmin=None, edge_vmax=None,ax=None, arrows=True) 
nx.draw_networkx_labels(G, pos,font_size=10, font_family='sans-serif') 

plt.show() 

#plt.savefig("path.png") 

I think the problem starts from nx.draw

This is what I want to create graph for:
#C[Health]
#P[Information]
#ST[genetic information]
#C[oral | (recorded in (any form | medium))]
#C[Is created or received by]
#A[health care provider | health plan | public health authority | employer | 
life insurer | school | university | or health care clearinghouse]
#C[Relates to]
#C[the past, present, or future physical | mental health | condition of an 
individual]
#C[the provision of health care to an individual]
#C[the past, present, or future payment for the provision of health care to an 
individual]

This also shows all of the edges that are created...

{'#C[the past, present, or future payment for the provision of health care to 
an individual]': 1, '#A[health care provider | health plan | public health 
authority | employer | life insurer | school | university | or health care 
clearinghouse]': 0, '#C[Relates to]': 1, '#C[the past, present, or future 
physical | mental health | condition of an individual]': 1, '#C[Is created or 
received by]': 1, '#C[the provision of health care to an individual]': 1, 
'#C[Health]': 1, '#C[oral | (recorded in (any form | medium))]': 1, 
'#ST[genetic information]': 1, '#P[Information]': 8}


Re: find all multiplicands and multipliers for a number

2015-04-14 Thread Steven D'Aprano
On Tue, 14 Apr 2015 12:42 pm, Paul Rubin wrote:

> Steven D'Aprano  writes:
>> http://code.activestate.com/recipes/577821-integer-square-root-function/
> 
> The methods there are more "mathematical" but probably slower than what
> I posted.
> 
> Just for laughs, this prints the first 20 primes using Python 3's
> "yield from":
> 
> import itertools
> 
> def sieve(ps):
> p = ps.__next__()
> yield p
> yield from sieve(a for a in ps if a % p != 0)
> 
> primes = sieve(itertools.count(2))
> print(list(itertools.islice(primes,20)))
> 
> It's not that practical above a few hundred primes, probably.

Oh! I thought that looked familiar... that's a recursive version of David
Turner's implementation of Euler's sieve. It is very popular in Haskell
circles, usually written as:

primes = sieve [2..]
sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p > 0]


In her paper http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf, Melissa
O'Neill calls this the "Sleight on Eratosthenes".

According to O'Neill, it has asymptotic performance of O(N**2/(log N)**2),
which means that it will perform very poorly beyond a few hundred primes.

Here is an iterative version:

def turner():
nums = itertools.count(2)
while True:
prime = next(nums)
yield prime
nums = filter(lambda v, p=prime: (v % p) != 0, nums)


On my computer, your recursive version is about 35% slower than the
iterative version over the first 499 primes.



-- 
Steven



Re: Is there functions like the luaL_loadfile and luaL_loadbuffer in lua source to dump the lua scripts in Python's source?

2015-04-14 Thread Chris Angelico
On Tue, Apr 14, 2015 at 11:14 PM, Steven D'Aprano
 wrote:
> For the source code to be dumped, it needs to be Python code (not a built-in
> or C extension object), the .py file still needs to be available where
> Python can see it (not just the .pyc file), and you need to have read
> permission.

And it has to have not been edited since the program started. I've had
some VERY strange-looking backtraces when I've been in the middle of
editing something, and maybe moved some code around, or just
added/deleted a block higher up in the file...

But if all of that, then yes, Python does help you figure out which
file to go look for. Love that feature.

ChrisA


Re: Using Dictionary

2015-04-14 Thread Michiel Overtoom

On Apr 14, 2015, at 15:34, Pippo wrote:

> How can I use dictionary to save the following information?

What a curious question. The purpose of a dictionary is not to save 
information, but to store data as a key -> value mapping:

telbook = {}
telbook["jan"] = "0627832873"
telbook["mary"] = "050-932390"

Or do you mean 'store' when you mention 'save'?

...and to store a bunch of lines you don't need a dictionary either. A list 
would do:

info = [
"#C[Health]",
"#P[Information]",
"#ST[genetic information]",
]

Greetings,

-- 
"You can't actually make computers run faster, you can only make them do less." 
- RiderOfGiraffes



Re: Using Dictionary

2015-04-14 Thread Steven D'Aprano
On Tue, 14 Apr 2015 11:34 pm, Pippo wrote:

> How can I use dictionary to save the following information?
> 
> #C[Health]
> #P[Information]
> #ST[genetic information]
> #C[oral | (recorded in (any form | medium))]
> #C[Is created or received by]
> #A[health care provider | health plan | public health authority | employer
> #| life insurer | school | university | or health care clearinghouse]
> #C[Relates to] C[the past, present, or future physical | mental health |
> #condition of an individual] C[the provision of health care to an
> #individual] C[the past, present, or future payment for the provision of
> #health care to an individual]


I don't understand the question. Can you explain what you mean by "save"?
Which part of the information? All of it?

text = """
#C[Health]
#P[Information]
#ST[genetic information]
#C[oral | (recorded in (any form | medium))]
#C[Is created or received by]
#A[health care provider | health plan | public health authority | employer |
life insurer | school | university | or health care clearinghouse]
#C[Relates to]
#C[the past, present, or future physical | mental health | condition of an
individual]
#C[the provision of health care to an individual]
#C[the past, present, or future payment for the provision of health care to
an individual]
"""
adict = {None: text}

Now you have the information inside a dict. Is that what you want? If not,
can you show an example of what result you hope to get at the end.



-- 
Steven



Re: Pickle based workflow - looking for advice

2015-04-14 Thread Chris Angelico
On Tue, Apr 14, 2015 at 11:08 PM, Steven D'Aprano
 wrote:
> On Tue, 14 Apr 2015 05:58 pm, Fabien wrote:
>
>> On 14.04.2015 06:05, Chris Angelico wrote:
>>> Not sure what you mean, here. Any given file will be written by
>>> exactly one process? No possible problem. Multiprocessing within one
>>> application doesn't change that.
>>
>> yes that's what I meant. Thanks!
>
> It's not that simple though. If you require files to be written in precisely
> a certain order, then parallel processing requires synchronisation.
>
> Suppose you write A, then B, then C, then D, each in it's own process (or
> thread). So the B process has to wait for A to finish, the C process has to
> wait for B to finish, and so on. Otherwise you could find yourself with C
> reading the data from B before B is finished writing it.

Sure, which is a matter of writer/reader conflicts on a single file -
nothing to do with "writing multiple files simultaneously" which was
the question raised.

ChrisA


Re: using DictReader() with .decode('utf-8', 'ignore')

2015-04-14 Thread Steven D'Aprano
On Tue, 14 Apr 2015 11:37 pm, Vincent Davis wrote:

>> Which DictReader? Do you mean the one in the csv module? I will assume
>> so.
>>
> yes.
> 
> 
>>
>> # untested
>> with open(dfile, 'r', encoding='utf-8', errors='ignore', newline='') as
>> f:
>> reader = csv.DictReader(f)
>> for row in reader:
>> print(row['fieldname'])
>>
> 
> What you have seems to work; now I need to go find my strange symbols that
> are not 'utf-8' and see what happens.
> I thought I had to open with 'rb' to use an encoding?

No, in Python 3 the rules are:

'rb' reads in binary mode, returns raw bytes without doing any decoding;

'r' reads in text mode, returns Unicode text, using the codec/encoding
specified. If no encoding is given, the platform default is used (what
locale.getpreferredencoding(False) returns), which is UTF-8 on many but
not all systems.


If you are getting decoding errors when reading the file, it is possible
that the file isn't actually UTF-8. One test you can do:

with open(dfile, 'rb') as f:
for line in f:
try:
s = line.decode('utf-8', 'strict')
except UnicodeDecodeError as err:
print(err)

If you need help deciphering the errors, please copy and paste them here and
we'll see what we can do.



-- 
Steven



Re: Pickle based workflow - looking for advice

2015-04-14 Thread Steven D'Aprano
On Tue, 14 Apr 2015 11:45 pm, Chris Angelico wrote:

> On Tue, Apr 14, 2015 at 11:08 PM, Steven D'Aprano
>  wrote:
>> On Tue, 14 Apr 2015 05:58 pm, Fabien wrote:
>>
>>> On 14.04.2015 06:05, Chris Angelico wrote:
 Not sure what you mean, here. Any given file will be written by
 exactly one process? No possible problem. Multiprocessing within one
 application doesn't change that.
>>>
>>> yes that's what I meant. Thanks!
>>
>> It's not that simple though. If you require files to be written in
>> precisely a certain order, then parallel processing requires
>> synchronisation.
>>
>> Suppose you write A, then B, then C, then D, each in it's own process (or
>> thread). So the B process has to wait for A to finish, the C process has
>> to wait for B to finish, and so on. Otherwise you could find yourself
>> with C reading the data from B before B is finished writing it.
> 
> Sure, which is a matter of writer/reader conflicts on a single file -
> nothing to do with "writing multiple files simultaneously" which was
> the question raised.

Fabien: "So I'm trying to crack open an old grenade I found, and I was
wondering if I need a ball-peen hammer or whether a regular hammer will be
okay."

You: "Oh, a regular hammer will be fine."

Me: "Just a minute. You're hitting a grenade with a hammer hard enough to
crack the case. That could be bad. It might explode."

You: "Sure, but the OP never asked about that. He just asked if the kind of
hammer makes a difference."

:-P


Seriously though, the OP did specify in his first post that there is at
least one dependency of the "B depends on A finishing first" kind. I
understood that A writes to a file, B reads that file and writes to a new
file, C reads that file and writes to yet another file, and so on. In which
case, *writing* the files is the least of his problems, it's the exploding
grenade, er, synchronisation problems that will get him.

:-)


Apart from "embarrassingly parallel" problems, thread- and
multiprocessing-based workflows are often trickier than they may seen ahead
of time, and may even be slower than a purely sequential algorithm:

http://en.wikipedia.org/wiki/Parallel_slowdown
http://en.wikipedia.org/wiki/Embarrassingly_parallel



-- 
Steven



Re: Using Dictionary

2015-04-14 Thread Pippo
On Tuesday, 14 April 2015 09:54:46 UTC-4, Michiel Overtoom  wrote:
> On Apr 14, 2015, at 15:34, Pippo wrote:
> 
> > How can I use dictionary to save the following information?
> 
> What a curious question. The purpose of a dictionary is not to save 
> information, but to store data as a key -> value mapping:
> 
> telbook = {}
> telbook["jan"] = "0627832873"
> telbook["mary"] = "050-932390"
> 
> Or do you mean 'store' when you mention 'save'?
> 
> ...and to store a bunch of lines you don't need a dictionary either. A list 
> would do:
> 
> info = [
> "#C[Health]",
> "#P[Information]",
> "#ST[genetic information]",
> ]
> 
> Greetings,
> 
> -- 
> "You can't actually make computers run faster, you can only make them do 
> less." - RiderOfGiraffes

Thanks. I wanted to store them. 


Re: Pickle based workflow - looking for advice

2015-04-14 Thread Chris Angelico
On Wed, Apr 15, 2015 at 12:14 AM, Steven D'Aprano
 wrote:
> On Tue, 14 Apr 2015 11:45 pm, Chris Angelico wrote:
>
>> On Tue, Apr 14, 2015 at 11:08 PM, Steven D'Aprano
>>  wrote:
>>> On Tue, 14 Apr 2015 05:58 pm, Fabien wrote:
>>>
 On 14.04.2015 06:05, Chris Angelico wrote:
> Not sure what you mean, here. Any given file will be written by
> exactly one process? No possible problem. Multiprocessing within one
> application doesn't change that.

 yes that's what I meant. Thanks!
>>>
>>> It's not that simple though. If you require files to be written in
>>> precisely a certain order, then parallel processing requires
>>> synchronisation.
>>>
>>> Suppose you write A, then B, then C, then D, each in it's own process (or
>>> thread). So the B process has to wait for A to finish, the C process has
>>> to wait for B to finish, and so on. Otherwise you could find yourself
>>> with C reading the data from B before B is finished writing it.
>>
>> Sure, which is a matter of writer/reader conflicts on a single file -
>> nothing to do with "writing multiple files simultaneously" which was
>> the question raised.
>
> Fabien: "So I'm trying to crack open an old grenade I found, and I was
> wondering if I need a ball-peen hammer or whether a regular hammer will be
> okay."
>
> You: "Oh, a regular hammer will be fine."
>
> Me: "Just a minute. You're hitting a grenade with a hammer hard enough to
> crack the case. That could be bad. It might explode."
>
> You: "Sure, but the OP never asked about that. He just asked if the kind of
> hammer makes a difference."
>
> :-P

Heh, fair point. This list is superb at answering the questions people
never even knew to ask.

> Seriously though, the OP did specify in his first post that there is at
> least one dependency of the "B depends on A finishing first" kind. I
> understood that A writes to a file, B reads that file and writes to a new
> file, C reads that file and writes to yet another file, and so on. In which
> case, *writing* the files is the least of his problems, it's the exploding
> grenade, er, synchronisation problems that will get him.
>
> :-)
>
> Apart from "embarrassingly parallel" problems, thread- and
> multiprocessing-based workflows are often trickier than they may seen ahead
> of time, and may even be slower than a purely sequential algorithm:

Yep. The way I read the OP's problem, it's easiest thought of as a
generic request-response system - same as most internet servers.
Basically, you have a piece of code that reacts to an incoming
request, and produces some form of response, then goes back and looks
for the next request. Whether you actually code along those lines or
not, it's a reasonable way to get your head around it.

If you _do_ code it that way, one big benefit is that you effectively
have a multiprocessable state machine; you can fork out to N processes
to take advantage of your CPU cores, or run in a single process for
debugging, and none of the code cares in the slightest.
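That "same code, one process or N" idea can be sketched like this (handle and run are illustrative names, and the squaring stands in for real per-request work):

```python
from multiprocessing import Pool

def handle(request):
    # The request -> response body: a pure function of its input,
    # so it cannot tell one process from many.
    return request * request

def run(requests, processes=None):
    if processes:
        with Pool(processes) as pool:       # fan out to N workers
            return pool.map(handle, requests)
    return [handle(r) for r in requests]    # single-process debug path

if __name__ == "__main__":
    print(run(range(5)))                 # sequential
    print(run(range(5), processes=2))    # parallel, same results
```

Only run() knows about processes; handle() is identical on both paths, which is the "none of the code cares in the slightest" part.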

ChrisA


Re: using DictReader() with .decode('utf-8', 'ignore')

2015-04-14 Thread Vincent Davis
On Tue, Apr 14, 2015 at 7:48 AM, Steven D'Aprano <
steve+comp.lang.pyt...@pearwood.info> wrote:

> with open(dfile, 'rb') as f:
> for line in f:
> try:
> s = line.decode('utf-8', 'strict')
> except UnicodeDecodeError as err:
> print(err)
>
> If you need help deciphering the errors, please copy and paste them here
> and
> we'll see what we can do.


Below are the errors. I knew about these, and I think the correct encoding
is windows-1252. I will paste some code and output at the end of this email
that prints the offending column in the line. These are very likely errors,
and so I want to remove them. I am reading this csv into a django sqlite3
db. What is strange to me is that using
"with open(dfile, 'r', encoding='utf-8', errors='ignore', newline='')"
does not seem to remove these; it seems to correctly save them to the db,
which I don't understand.

'utf-8' codec can't decode byte 0xa6 in position 368: invalid start byte
'utf-8' codec can't decode byte 0xac in position 223: invalid start byte
'utf-8' codec can't decode byte 0xa6 in position 1203: invalid start byte
'utf-8' codec can't decode byte 0xa2 in position 44: invalid start byte
'utf-8' codec can't decode byte 0xac in position 396: invalid start byte

import chardet

with open("DATA/ATSDTA_ATSP600.csv", 'rb') as f:
    for line in f:
        code = chardet.detect(line)
        # if code == {'confidence': 0.5, 'encoding': 'windows-1252'}:
        if code != {'encoding': 'ascii', 'confidence': 1.0}:
            print(code)
            win = line.decode('windows-1252').split(',')
            norm = line.decode('utf-8', 'ignore').split(',')
            ascii = line.decode('ascii', 'ignore').split(',')
            ascii2 = line.decode('ISO-8859-1').split(',')

            for w, n, a, a2 in zip(win, norm, ascii, ascii2):
                if w != n:
                    print(w, n, a, a2)
            print(win[0])

## Output

{'encoding': 'windows-1252', 'confidence': 0.5}
"¦   " "   " "   " "¦   "
"040543"
{'encoding': 'windows-1252', 'confidence': 0.5}
"LEASE GREGPRU D ¬ETERSPM " "LEASE GREGPRU D ETERSPM
  " "LEASE GREGPRU D ETERSPM " "LEASE
GREGPRU D ¬ETERSPM "
"979643"
{'encoding': 'windows-1252', 'confidence': 0.5}
"¦   " "   " "   " "¦   "
"986979"
{'encoding': 'windows-1252', 'confidence': 0.5}
"WELLS FARGO &¢ COMPANY   " "WELLS FARGO & COMPANY
  " "WELLS FARGO & COMPANY   " "WELLS
FARGO &¢ COMPANY   "
"994946"
{'encoding': 'windows-1252', 'confidence': 0.5}
OSSOSSO¬¬O " OSSOSSOO " OSSOSSOO " OSSOSSO¬¬O "
"996535"



Vincent Davis
720-301-3003


EuroPython 2015: Call for Proposals has been extended

2015-04-14 Thread M.-A. Lemburg
Since we had Easter holidays and a very busy PyCon US 2015 during the
Call for Proposal (CFP) period, the Program work group (WG) has
decided to extend the date for the talk submission deadline until:

   Thursday, 2015-04-28, 23:59:59 CEST

Please submit your proposals through the EuroPython website:

   https://ep2015.europython.eu/en/call-for-proposals/


Need some inspiration ?
---

We are especially interested in the following topics in the context of
using and developing for and with Python:

 * Core Python
 * Alternative Python implementations: e.g. Jython, IronPython, PyPy,
   Stackless, etc.
 * Python libraries and extensions
 * Python 2 to 3 migration
 * Databases
 * Documentation
 * GUI Programming
 * Game Programming
 * Network Programming
 * Open Source Python projects
 * Packaging Issues
 * Programming Tools
 * Project Best Practices
 * Embedding and Extending
 * Education, Science and Math
 * Web-based Systems

Have a look at the talks presented at previous conferences to get an
idea of what we’re looking for:

 * EuroPython 2014 Talks - https://ep2014.europython.eu/en/schedule/schedule/
 * EuroPython 2013 Talks - https://ep2013.europython.eu/p3/schedule/ep2013/
 * EuroPython 2012 Talks - https://ep2013.europython.eu/p3/schedule/ep2012/
 * EuroPython Talk Videos - http://europython.tv/

This year at EuroPython we will have plenty of presentation slots
available for everyone:

 * Talks: 170 slots available (80x 30min, 85x 45min, 5x 60min)
 * Trainings: 20 slots
 * Posters: 25 slots
 * Help desks: 5 slots

More details are available on our CFP page:

https://ep2015.europython.eu/en/call-for-proposals/


Now is your chance to become a speaker at EuroPython !
--

EuroPython is the largest Python conference in Europe, providing you
with an ideal platform to let the Python community know about your
ideas, projects and thoughts.

 * Don’t be shy and submit your talks. We’d especially like to
   encourage first time speakers to submit their talks.

 * If you’ve already held a talk at another Python conference,
   perfect: send in your talk submission.

 * If your talk did not get accepted at one of the other Python
   conferences: try again at EuroPython.

See you in Bilbao !

Enjoy,
--
EuroPython 2015 Team
http://ep2015.europython.eu/
http://www.europython-society.org/
-- 
https://mail.python.org/mailman/listinfo/python-list


Working with Access tables and python scripts

2015-04-14 Thread accessnewbie
I have an existing extensive python script that I would like to modify slightly 
to run a different variation on a process.

I also have all the variables I need to run this script (about 20 in 
total)stored in an Access 2010 64 bit database. 

Is it possible to create a button on an Access form (or other GUI) to pass the 
information that is stored in the various fields in the database to the python 
script? Not all the values are in a single table. A query joining related 
tables would need to be done. Ideally I would like to execute the script from 
the Access data entry form immediately after entering the required data into 
the database.

If so, how might I go about accomplishing this? I am at a loss as to where to 
even start such a task.

Any and all help greatly appreciated.
-- 
https://mail.python.org/mailman/listinfo/python-list


A question about numpy

2015-04-14 Thread Paulo da Silva
I am new to numpy ...

Supposing I have 2 vectors v1 and v2 and a value (constant) k.
I want to build a vector r with all values of v1 greater than k and the
others from v2.

I found 2 ways, but not sure if they are the best solution:

1.
r1=v1.copy()
r2=v2.copy()
r1[r1<k]=0
r2[r2>=k]=0
r=r1+r2

2.
r=v1*(v1>=k)+v2*(v2<k)
--
https://mail.python.org/mailman/listinfo/python-list


Re: Working with Access tables and python scripts

2015-04-14 Thread hey . ho . fredl
I don't know how to start a Python script from Access, but you could definitely 
do it the other way around, reading the Access database from Python.

An example:
---
import pyodbc
ODBC_DRIVER = '{Microsoft Access Driver (*.mdb)}'

connstr = 'DRIVER={0};DBQ={1}'.format(ODBC_DRIVER, 'filename.mdb')
conn = pyodbc.connect(connstr)

cur = conn.cursor()
cur.execute('SELECT * FROM table')

---
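To get back to launching the existing script with the joined parameters, a hedged sketch building on the above — every table, column, and file name here is made up, and the `*.accdb` form of the driver string is what an Access 2010 database needs:

```python
import subprocess


def build_command(script, row):
    # Turn a result row into a command line for the existing script
    return ['python', script] + [str(v) for v in row]


def run_from_access(db_path, run_id, script='existing_script.py'):
    import pyodbc  # third-party: pip install pyodbc
    connstr = ('DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};'
               'DBQ=' + db_path)
    conn = pyodbc.connect(connstr)
    cur = conn.cursor()
    # Hypothetical join collecting one run's parameters from two tables
    cur.execute(
        "SELECT a.param1, a.param2, b.param3 "
        "FROM tableA AS a INNER JOIN tableB AS b ON a.run_id = b.run_id "
        "WHERE a.run_id = ?", run_id)
    row = cur.fetchone()
    conn.close()
    # Hand the ~20 values to the existing script as argv entries
    subprocess.check_call(build_command(script, row))
```

The Access button would then only need to shell out to a one-line launcher that calls `run_from_access` with the current record's ID.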

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: A question about numpy

2015-04-14 Thread Rob Gaddi
On Tue, 14 Apr 2015 23:41:56 +0100, Paulo da Silva wrote:

> Supposing I have 2 vectors v1 and v2 and a value (constant) k.
> I want to build a vector r with all values of v1 greater than k and the
> others from v2.
> 

You're looking for numpy.where() .
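A minimal sketch with made-up data:

```python
import numpy as np

v1 = np.array([5, 1, 8, 3])
v2 = np.array([10, 20, 30, 40])
k = 4

# elementwise: where v1 >= k take v1, otherwise take v2
r = np.where(v1 >= k, v1, v2)
print(r)  # [ 5 20  8 40]
```

Unlike the mask-and-add approach, this needs no copies and reads as a single expression.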

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: A question about numpy

2015-04-14 Thread Paulo da Silva
On 14-04-2015 23:49, Rob Gaddi wrote:
> On Tue, 14 Apr 2015 23:41:56 +0100, Paulo da Silva wrote:
> 
>> Supposing I have 2 vectors v1 and v2 and a value (constant) k.
>> I want to build a vector r with all values of v1 greater than k and the
>> others from v2.
>>
> 
> You're looking for numpy.where() .
> 
That's it! Thank you.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Working with Access tables and python scripts

2015-04-14 Thread Emile van Sebille

On 4/14/2015 3:20 PM, accessnew...@gmail.com wrote:

> I have an existing extensive python script that I would like
> to modify slightly to run a different variation on a process.
>
> I also have all the variables I need to run this script (about
> 20 in total)stored in an Access 2010 64 bit database.
>
> Is it possible to create a button on an Access form (or other GUI)
> to pass the information that is stored in the various fields in the
> database to the python script? Not all the values are in a single
> table.
> A query joining related tables would need to be done. Ideally I would
> like to execute the script from the Access data entry form
> immediately after entering the required data into the database.

Yes -- I did so in excel using VB some 15 years ago.  Download and 
install Mark Hammond's win32 extensions.  I don't off the top of my head 
remember the details, but if no one else chimes in I could probably
dig out that code.

Emile




--
https://mail.python.org/mailman/listinfo/python-list


Re: Working with Access tables and python scripts

2015-04-14 Thread Mark Lawrence

On 15/04/2015 01:47, Emile van Sebille wrote:

On 4/14/2015 3:20 PM, accessnew...@gmail.com wrote:

 > I have an existing extensive python script that I would like
 > to modify slightly to run a different variation on a process.
 >
 > I also have all the variables I need to run this script (about
 > 20 in total)stored in an Access 2010 64 bit database.
 >
 > Is it possible to create a button on an Access form (or other GUI)
 > to pass the information that is stored in the various fields in the
 > database to the python script? Not all the values are in a single
 > table.
 > A query joining related tables would need to be done. Ideally I would
 > like to execute the script from the Access data entry form
 > immediately after entering the required data into the database.

Yes -- I did so in excel using VB some 15 years ago.  Download and
install Mark Hammond's win32 extensions.  I don't off the top of my head
remember the details, but if no one else chimes in I could probably
dig out that code.

Emile


People can always ask here 
https://mail.python.org/mailman/listinfo/python-win32 also available as 
gmane.comp.python.windows


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: find all multiplicands and multipliers for a number

2015-04-14 Thread Paul Rubin
Steven D'Aprano  writes:
> primes = sieve [2..]
> sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p > 0]
> In her paper http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf, Melissa
> O'Neill calls this the "Sleight on Eratosthenes".

Oh cool, I wrote very similar Haskell code and converted it to Python.
I probably saw it before though, so it looks like a case of
not-exactly-independent re-invention.

> def turner():
>     nums = itertools.count(2)
>     while True:
>         prime = next(nums)
>         yield prime
>         nums = filter(lambda v, p=prime: (v % p) != 0, nums)

This is nice, though it will still hit the nesting limit about equally
soon, because of the nested filters.  I like the faster versions in the
O'Neill paper.
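One of the faster versions from that paper keeps a table mapping each upcoming composite to the primes that will strike it. A rough Python rendering with a dict (a sketch in the paper's spirit, not O'Neill's exact heap-based code):

```python
import itertools


def incremental_sieve():
    """Incremental sieve: map each known composite to its prime witnesses."""
    composites = {}
    for n in itertools.count(2):
        if n not in composites:
            yield n                      # no recorded divisor: n is prime
            composites[n * n] = [n]      # n*n is the first composite only n strikes
        else:
            for p in composites.pop(n):  # advance each witness to its next multiple
                composites.setdefault(n + p, []).append(p)
```

Because this never builds a chain of nested generators or filters, it avoids the nesting-depth problem entirely.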

> On my computer, your recursive version is about 35% slower than the
> iterative version over the first 499 primes.

Interesting.  I wonder why it's slower.

A Hamming (or "5-smooth") number is a number with no prime factors
larger than 5.  So 20 (= 2*2*5) is a Hamming number but 21 (= 3*7) is
not.  The first few of them are:
  [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24]

Here's an ugly computation of the millionth Hamming number, converted
from Haskell code that you've probably seen.  I'd be interested in
seeing a cleaner implementation.  


import itertools, time

def merge(a0,b0):
    def advance(m): m[0] = next(m[1])
    a = [next(a0), a0]
    b = [next(b0), b0]
    while True:
        if a[0] == b[0]:
            yield a[0]
            advance(a)
            advance(b)
        elif a[0] < b[0]:
            yield a[0]
            advance(a)
        else:
            yield b[0]
            advance(b)

def hamming(n):
    hh = [1]
    def hmap(k):
        for i in itertools.count():
            yield k * hh[i]

    m = merge(hmap(2),merge(hmap(3), hmap(5)))
    for i in range(n):
        hh.append(next(m))
    return hh[-1]

# this takes about 2.4 seconds on my i5 laptop with python 2.7,
# about 2.8 sec with python 3.3.2

t0=time.time()
print (hamming(10**6-1))
print (time.time()-t0)
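Not from the thread, but here is a shorter heap-and-set variant for comparison; it trades the three lazy merged streams for a single priority queue (its `seen` set is never pruned, so memory grows with n):

```python
import heapq


def hamming_heap(n):
    """Return the n-th (1-based) 5-smooth number via a min-heap."""
    heap, seen = [1], {1}
    for _ in range(n):
        h = heapq.heappop(heap)       # smallest Hamming number not yet emitted
        for f in (2, 3, 5):
            if h * f not in seen:     # dedupe, e.g. 6 = 2*3 = 3*2
                seen.add(h * f)
                heapq.heappush(heap, h * f)
    return h
```

For example, `hamming_heap(15)` gives 24, matching the list of the first few Hamming numbers above.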

-- 
https://mail.python.org/mailman/listinfo/python-list