Re: [Tutor] find second occurance of string in line

2015-09-09 Thread Peter Otten
richard kappler wrote:

> On Tue, Sep 8, 2015 at 1:40 PM, Peter Otten <__pete...@web.de> wrote:

>> I'm inferring from the above that you do not really want the "second"
>> timestamp in the line -- there is no line left intact anyway ;) -- but
>> rather the one in the ... part.
>>
>> Here's a way to get these (all of them) with lxml:
>>
>> import lxml.etree
>>
>> tree = lxml.etree.parse("example.xml")
>> print tree.xpath("//objectdata/general/timestamp/text()")

> No no, I'm not sure how I can be much more clear, that is one (1) line of
> xml that I provided, not many, and I really do want what I said in the
> very beginning, the second instance of <timestamp> for each of those
> lines.

It looks like I was not clear enough: XML doesn't have the concept of 
lines. When you process XML "by line" you have buggy code.

> Got it figured out with guidance from Alan's response though:
> 
> #!/usr/bin/env python
> 
> with open("example.xml", 'r') as f:
>     for line in f:
>         if "objectdata" in line:
>             if "<timestamp>" in line:
>                 x = "<timestamp>"
>                 a = "</timestamp>"
>                 first = line.index(x)
>                 second = line.index(x, first+1)
>                 b = line.index(a)
>                 c = line.index(a, b+1)
>                 y = second + 11  # 11 == len("<timestamp>")
>                 timestamp = line[y:c]
>                 print timestamp

Just for fun take the five minutes to install lxml and compare the output of 
the two scripts. If it's the same now there's no harm switching to lxml, and 
you are making future failures less likely.



Re: [Tutor] Creating lists with definite (n) items without repetitions

2015-09-09 Thread Francesco Loffredo via Tutor

Oscar Benjamin wrote:

The problem is that there are 26 people and they are divided into
groups of 3 each day. We would like to know if it is possible to
arrange it so that each player plays each other player exactly once
over some period of days.

It is not possible to do this exactly with 26 people in groups of 3.
Think about it from the perspective of 1 person. They must play
against all 25 other people in pairs with neither of the other people
repeated: the set of pairs they play against must partition the set of
other players. Clearly it can only work if the number of other players
is even.

I'm not sure but I think that maybe for an exact solution you need to
have n = 1 (mod 6) or n = 3 (mod 6) which gives:
n = 1, 3, 7, 9, 13, 15, 19, 21, 25, 27, ...

The formula for the number of triples if the exact solution exists is
n*(n-1)/6, which comes out as 26*25/6 = 108.33... (the formula doesn't
give an integer because the exact solution doesn't exist).


A quick solution is to add one "dummy" letter to the pool of the OP's 
golfers.
I used "!" as the dummy one. This way, you end up with 101 triples, 11 
of which contain the dummy player.
But this is better than the 25-item pool, which resulted in an incomplete 
set of triples (for example, A would never play with Z).
So, in your real-world problem, you will have 11 groups of 2 people 
instead of 3. Is this a problem?



import pprint, itertools
pool = "abcdefghijklmnopqrstuvwxyz!"

def maketriples(tuplelist):
    final = []
    used = set()
    for a, b, c in tuplelist:
        if (((a, b) in used) or ((b, c) in used) or ((a, c) in used) or
                ((b, a) in used) or ((c, b) in used) or ((c, a) in used)):
            continue
        else:
            final.append((a, b, c))
            used.add((a, b))
            used.add((a, c))
            used.add((b, c))
            used.add((b, a))
            used.add((c, a))
            used.add((c, b))
    return final

combos = list(itertools.combinations(pool, 3))
print("combos contains %s triples." % len(combos))

triples = maketriples(combos)

print("maketriples(combos) contains %s triples." % len(triples))
pprint.pprint(triples)



Re: [Tutor] More Pythonic?

2015-09-09 Thread richard kappler
> It's not clear why you need the try...except: pass. Please provide some
> more background information.

I don't need the try; this was more of a "are there different ways to do
this, which is better and why?" experiment. I am learning, so I tend to
write scripts that are more brute force than elegant and Pythonic, and I
wish to write better code. I do okay, but there are many nuances to Python
that I just haven't run across. For example:

> with open(sourcefile) as instream:
>     with open(destfile, "a") as outstream:
>         outstream.writelines(process_lines(instream))

I had no idea I could nest with statements like that. It seems obvious now,
but I didn't know.

For the record, I have made a couple of other posts this morning that
explain the script constraints far better than I did here. For the sake of
brevity I shan't repeat the info here, other than to say it's not reading
from stdin, but from a log file to simulate stdin in a test environment.

regards, Richard

On Wed, Sep 9, 2015 at 9:37 AM, Peter Otten <__pete...@web.de> wrote:

> richard kappler wrote:
>
> > Would either or both of these work, if both, which is the better or more
> > Pythonic way to do it, and why?
> >
> > ###
> >
> > import whatIsNeeded
> >
> > writefile = open("writefile", 'a')
> >
> > with open(readfile, 'r') as f:
> >     for line in f:
> >         if keyword in line:
> >             do stuff
> >             f1.write(line)
> >         else:
> >             f1.write(line)
>
> Why do you invoke f1.write() twice?
>
> > writefile.close()
> >
> > ##
> >
> > import whatIsNeeded
> >
> > with open(readfile, 'r') as f:
> >     for line in f:
> >         try:
> >             if keyword in line:
> >                 do stuff
> >         except:
>
> What exceptions are you expecting here? Be explicit. You probably don't
> want
> to swallow a KeyboardInterrupt. And if something unexpected goes wrong a
> noisy complaint gives you the chance to either fix an underlying bug or
> explicitly handle the exception in future runs of the script.
>
> >             do nothing
>
> That's spelt
>   pass
>
> > with open(writefile, 'a') as f1:
> >     f1.write(line)
>
> Opening the file once per line written seems over-the-top to me.
>
> > ##
> >
> > or something else altogether?
>
> I tend to put the processing into a generator. That makes it easy to
> replace the source or the consumer:
>
> def process_lines(instream):
>     for line in instream:
>         if keyword in line:
>             do stuff
>         yield line
>
> with open(sourcefile) as instream:
>     with open(destfile, "a") as outstream:
>         outstream.writelines(process_lines(instream))
>
> Now if you want to read from stdin and print to stdout:
>
> sys.stdout.writelines(process_lines(sys.stdin))
>
> > I'm thinking the first way is better as it only opens the files once
> > whereas it seems to me the second script would open and close the
> > writefile once per iteration, and the do nothing in the except seems just
> > wrong to me.
>
> It's not clear why you need the try...except: pass.
> Please provide some more background information.
>
>



-- 

All internal models of the world are approximate. ~ Sebastian Thrun


[Tutor] A further question about opening and closing files

2015-09-09 Thread richard kappler
Under a different subject line (More Pythonic?) Steven D'Aprano commented:

> And this will repeatedly open the file, append one line, then close it
> again. Almost certainly not what you want -- it's wasteful and
> potentially expensive.

And I get that. It does bring up another question though. When using

with open(somefile, 'r') as f:
    with open(filename, 'a') as f1:
        for line in f:

the file being appended is opened and stays open while the loop iterates,
then the file closes when exiting the loop, yes? Does this not have the
potential to be expensive as well if you are writing a lot of data to the
file?

I did a little experiment:

>>> f1 = open("output/test.log", 'a')
>>> f1.write("this is a test")
>>> f1.write("this is a test")
>>> f1.write('why isn\'t this writing')
>>> f1.close()

monitoring test.log as I went. Nothing was written to the file until I
closed it, or at least that's the way it appeared to the text editor in
which I had test.log open (gedit). In gedit, when a file changes it tells
you and gives you the option to reload the file. This didn't happen until I
closed the file. So I'm presuming all the writes sat in a buffer in memory
until the file was closed, at which time they were written to the file.

Is that actually how it happens, and if so does that not also have the
potential to cause problems if memory is a concern?

regards, Richard
-- 

All internal models of the world are approximate. ~ Sebastian Thrun


Re: [Tutor] iterating through a directory

2015-09-09 Thread Albert-Jan Roskam
> Date: Wed, 9 Sep 2015 09:32:34 -0400
> From: richkapp...@gmail.com
> To: tutor@python.org
> Subject: [Tutor] iterating through a directory
> 
> Yes, many questions today. I'm working on a data feed script that feeds
> 'events' into our test environment. In production, we monitor a camera that
> captures an image as product passes by, gathers information such as
> barcodes and package ID from the image, and then sends out the data as a
> line of xml to one place for further processing and sends the image to
> another place for storage. Here is where our software takes over, receiving
> the xml data and images from vendor equipment. Our software then processes
> the xml data and allows retrieval of specific images associated with each
> line of xml data. Our test environment must simulate the feed from the
> vendor equipment, so I'm writing a script that feeds the xml from an actual
> log to one place for processing, and pulls images from a pool of 20, which
> we recycle, to associate with each event and sends them to another dir for
> saving and searching.
> 
>  As my script iterates through each line of xml data (discussed yesterday)
> to simulate the feed from the vendor camera equipment, it parses ID
> information about the event then sends the line of data on. As it does so,
> it needs to pull the next image in line from the image pool directory,
> rename it, send it to a different directory for saving. I'm pretty solid on
> all of this except iterating through the image pool. My idea is to just
> keep looping through the image pool, as each line of xmldata is parsed, the
> next image in line gets pulled out, renamed with the identifying
> information from the xml data, and both are sent on to different places.
> 
> I only have one idea for doing this iterating through the image pool, and
> frankly I think the idea is pretty weak. The only thing I can come up with
> is to change the name of each image in the pool from something
> like 0219PS01CT1_2029_04_00044979.jpg to 1.jpg, 2.jpg, 3.jpg etc.,
> then doing something like:
> 
> i = 0
>
> here begins my for line in file loop:
>     if i == 20:
>         i = 1
>     else:
>         i += 1
>     do stuff with the xml file including get ID info
>     rename i.jpg to IDinfo.jpg
>     send it all on
> 
> That seems pretty barbaric, any thoughts on where to look for better ideas?
> I'm presuming there are modules out there that I am unaware of or
> capabilities I am unaware of in modules I do know a little about that I am
> missing.

I do not really understand what you intend to do, but the following modules 
might come in handy:
- os (os.rename, os.listdir)
- glob (glob.iglob or glob.glob)
- shutil (shutil.copy)
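
For instance, a minimal sketch of the copy-and-rename step with those
modules (the directory names and new_name are made-up placeholders):

import glob
import os
import shutil

# Walk the pool in a stable order and copy each image out under a new name.
for path in sorted(glob.glob("image_pool/*.jpg")):
    new_name = "IDinfo.jpg"  # placeholder; the real name comes from the xml
    shutil.copy(path, os.path.join("outgoing", new_name))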



  


Re: [Tutor] More Pythonic?

2015-09-09 Thread Steven D'Aprano
On Wed, Sep 09, 2015 at 09:05:23AM -0400, richard kappler wrote:
> Would either or both of these work, if both, which is the better or more
> Pythonic way to do it, and why?

The first works, but isn't really the best way to do it:

>  ###
> 
> import whatIsNeeded
> 
> writefile = open("writefile", 'a')
> 
> with open(readfile, 'r') as f:
>     for line in f:
>         if keyword in line:
>             do stuff
>             f1.write(line)
>         else:
>             f1.write(line)
> 
> writefile.close()
> 
> ##

Better would be this:

with open("writefile", 'a') as outfile:
    with open("readfile", 'r') as infile:
        for line in infile:
            if keyword in line:
                do stuff
            outfile.write(line)

(I think your intention is to always write the lines into the output 
file, but there are enough typos in your version that I can't be 
completely sure.)



This, on the other hand, is certainly not what you want:

> import whatIsNeeded
> 
> with open(readfile, 'r') as f:
>     for line in f:
>         try:
>             if keyword in line:
>                 do stuff
>         except:
>             do nothing

Why are you ignoring *all errors*? That will make it impossible (or at 
least very hard) to cancel the script with Ctrl-C, and it will cover 
up programming errors. Apart from a very few expert uses, you should 
never use a bare except. If you really want to "ignore all errors", use 
`except Exception`, but even that is not good practice. You should list 
and catch only the precise errors that you know you wish to ignore and 
can safely handle. Everything else indicates a bug that needs fixing.

By the way, "do nothing" in Python is spelled "pass".
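
A sketch of what a narrow handler could look like in the quoted loop
(do_stuff and ValueError are illustrative stand-ins, not names from the
original script):

for line in f:
    try:
        if keyword in line:
            do_stuff(line)
    except ValueError:  # only the specific error do_stuff() might raise
        pass            # explicitly "do nothing"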


> with open(writefile, 'a') as f1:
>     f1.write(line)

And this will repeatedly open the file, append one line, then close it 
again. Almost certainly not what you want -- it's wasteful and 
potentially expensive.


> ##
> 
> or something else altogether?
> 
> I'm thinking the first way is better as it only opens the files once
> whereas it seems to me the second script would open and close the writefile
> once per iteration, and the do nothing in the except seems just wrong to
> me. Is my thinking on target here?

Spot on target.



-- 
Steve


[Tutor] iterating through a directory

2015-09-09 Thread richard kappler
Yes, many questions today. I'm working on a data feed script that feeds
'events' into our test environment. In production, we monitor a camera that
captures an image as product passes by, gathers information such as
barcodes and package ID from the image, and then sends out the data as a
line of xml to one place for further processing and sends the image to
another place for storage. Here is where our software takes over, receiving
the xml data and images from vendor equipment. Our software then processes
the xml data and allows retrieval of specific images associated with each
line of xml data. Our test environment must simulate the feed from the
vendor equipment, so I'm writing a script that feeds the xml from an actual
log to one place for processing, and pulls images from a pool of 20, which
we recycle, to associate with each event and sends them to another dir for
saving and searching.

 As my script iterates through each line of xml data (discussed yesterday)
to simulate the feed from the vendor camera equipment, it parses ID
information about the event then sends the line of data on. As it does so,
it needs to pull the next image in line from the image pool directory,
rename it, send it to a different directory for saving. I'm pretty solid on
all of this except iterating through the image pool. My idea is to just
keep looping through the image pool, as each line of xmldata is parsed, the
next image in line gets pulled out, renamed with the identifying
information from the xml data, and both are sent on to different places.

I only have one idea for doing this iterating through the image pool, and
frankly I think the idea is pretty weak. The only thing I can come up with
is to change the name of each image in the pool from something
like 0219PS01CT1_2029_04_00044979.jpg to 1.jpg, 2.jpg, 3.jpg etc.,
then doing something like:

i = 0

here begins my for line in file loop:
    if i == 20:
        i = 1
    else:
        i += 1
    do stuff with the xml file including get ID info
    rename i.jpg to IDinfo.jpg
    send it all on

That seems pretty barbaric, any thoughts on where to look for better ideas?
I'm presuming there are modules out there that I am unaware of or
capabilities I am unaware of in modules I do know a little about that I am
missing.

-- 

All internal models of the world are approximate. ~ Sebastian Thrun


Re: [Tutor] iterating through a directory

2015-09-09 Thread richard kappler
Albert-Jan, thanks for the response. shutil.copyfile does seem to be one of
the tools I need to make the copying, renaming the copy and saving it
elsewhere in one line instead of three or more.

Still not sure how to efficiently get the script to keep moving to the next
file in the directory though, in other words, for each iteration in the
loop, I want it to fetch, rename and send/save the next image in line. Hope
that brings better understanding.

Thanks for the tip!

regards, Richard

On Wed, Sep 9, 2015 at 9:46 AM, Albert-Jan Roskam 
wrote:

> > Date: Wed, 9 Sep 2015 09:32:34 -0400
> > From: richkapp...@gmail.com
> > To: tutor@python.org
> > Subject: [Tutor] iterating through a directory
>
> >
> > Yes, many questions today. I'm working on a data feed script that feeds
> > 'events' into our test environment. In production, we monitor a camera
> that
> > captures an image as product passes by, gathers information such as
> > barcodes and package ID from the image, and then sends out the data as a
> > line of xml to one place for further processing and sends the image to
> > another place for storage. Here is where our software takes over,
> receiving
> > the xml data and images from vendor equipment. Our software then
> processes
> > the xml data and allows retrieval of specific images associated with each
> > line of xml data. Our test environment must simulate the feed from the
> > vendor equipment, so I'm writing a script that feeds the xml from an
> actual
> > log to one place for processing, and pulls images from a pool of 20,
> which
> > we recycle, to associate with each event and sends them to another dir
> for
> > saving and searching.
> >
> > As my script iterates through each line of xml data (discussed yesterday)
> > to simulate the feed from the vendor camera equipment, it parses ID
> > information about the event then sends the line of data on. As it does
> so,
> > it needs to pull the next image in line from the image pool directory,
> > rename it, send it to a different directory for saving. I'm pretty solid
> on
> > all of this except iterating through the image pool. My idea is to just
> > keep looping through the image pool, as each line of xmldata is parsed,
> the
> > next image in line gets pulled out, renamed with the identifying
> > information from the xml data, and both are sent on to different places.
> >
> > I only have one idea for doing this iterating through the image pool, and
> > frankly I think the idea is pretty weak. The only thing I can come up
> with
> > is to change the name of each image in the pool from something
> > like 0219PS01CT1_2029_04_00044979.jpg to 1.jpg, 2.jpg, 3.jpg
> etc.,
> > then doing something like:
> >
> > i = 0
> >
> > here begins my for line in file loop:
> >     if i == 20:
> >         i = 1
> >     else:
> >         i += 1
> >     do stuff with the xml file including get ID info
> >     rename i.jpg to IDinfo.jpg
> >     send it all on
> >
> > That seems pretty barbaric, any thoughts on where to look for better
> ideas?
> > I'm presuming there are modules out there that I am unaware of or
> > capabilities I am unaware of in modules I do know a little about that I
> am
> > missing.
>
> I do not really understand what you intend to do, but the following
> modules might come in handy:
> - os (os.rename, os.listdir)
> - glob (glob.iglob or glob.glob)
> - shutil (shutil.copy)
>
>
>
>


-- 

All internal models of the world are approximate. ~ Sebastian Thrun


Re: [Tutor] A further question about opening and closing files

2015-09-09 Thread Laura Creighton
>I did a little experiment:
>
> >>> f1 = open("output/test.log", 'a')
> >>> f1.write("this is a test")
> >>> f1.write("this is a test")
> >>> f1.write('why isn\'t this writing')
> >>> f1.close()

If you want the thing written out, use f1.flush() whenever you want to
make sure this happens.
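
For example, a sketch of the same session with a flush added:

>>> f1 = open("output/test.log", 'a')
>>> f1.write("this is a test")
>>> f1.flush()   # push the buffered text out to the OS now
>>> f1.close()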

If you want completely unbuffered writing, then you can open your file
this way, with f1 = open("output/test.log", 'a', 0). I think if you are
on Windows you can only get unbuffered writing if you open your file
in binary mode.

Laura



Re: [Tutor] Fwd: find second occurance of string in line

2015-09-09 Thread Laura Creighton
Peter Otten wrote:
>Those who regularly need different configurations probably use virtualenv, 
>or virtual machines when the differences are not limited to Python.

Use tox for this.
https://testrun.org/tox/latest/

However for development purposes it often helps to have a
--force the_one_that_I_want option (for command lines) or
a global variable, or a config file for modules.

How badly you want this depends on your own personal development
style, and how happy you are popping in and out of virtualenvs.  Many
people prefer to write their whole new thing for one library (say
elementtree) and then test it/port it against the other 2, one at a
time, making a complete set of patches for one adaptation at a time.
Other people prefer to write their code so that, feature by feature
they first get it to work with one library, and then with another, and
then with the third, and then they write the next new bit of code, so
that they never have to do a real port.

Life is messy enough that you often do a bit of this and a bit of the
other thing, even if you would prefer to not need to, especially if
hungry customers are demanding exactly what they need (and we don't
care about the other ways it will eventually work for other people).

Laura


Re: [Tutor] A further question about opening and closing files

2015-09-09 Thread richard kappler
Thanks, tried them both, both work great on Linux. Now I understand better.

regards, Richard

On Wed, Sep 9, 2015 at 11:28 AM, Laura Creighton  wrote:

> >I did a little experiment:
> >
> > >>> f1 = open("output/test.log", 'a')
> > >>> f1.write("this is a test")
> > >>> f1.write("this is a test")
> > >>> f1.write('why isn\'t this writing')
> > >>> f1.close()
>
> If you want the thing written out, use f1.flush() whenever you want to
> make sure this happens.
>
> If you want completely unbuffered writing, then you can open your file
> this way, with f1 = open("output/test.log", 'a', 0) I think if you are
> on windows you can only get unbuffered writing if you open your file
> in binary mode.
>
> Laura
>



-- 

All internal models of the world are approximate. ~ Sebastian Thrun


Re: [Tutor] A further question about opening and closing files

2015-09-09 Thread Alan Gauld

On 09/09/15 20:42, Laura Creighton wrote:

In a message of Wed, 09 Sep 2015 20:25:06 +0100, Alan Gauld writes:

On 09/09/15 19:20, Laura Creighton wrote:
If you are working on a small platform - think mobile device - and it has
a single channel bus to the storage area then one of the worst things
you can do is write lots of small chunks of data to it. The overhead
(in hardware) of opening and locking the bus is almost as much as
the data transit time and so can choke the bus for a significant amount
of time (I'm talking milliseconds here but in real-time that's significant).

But if I shoot you with my laser cannon, I want you to get the
message that you are dead _now_ and not when some buffer fills up ...


There are two things about that:
1) human reaction time is measured in 100s of milliseconds so the delay
   is not likely to be meaningful. If you do the flushes every 10ms
   instead of every write (assuming you are writing frequently) nobody
   is likely to notice.
2) Gamers tend not to be doing other things while playing, so you can
   pretty much monopolize the bus if you want to.

So if you know that you're the only game in town(sic) then go ahead and
flush everything to disk. It won't do much harm. But...

..., if your game engine is running on a server shared by other users and
some of them are running critical apps (think a business's billing or
accounting suite that must complete its run within a 1-hour window, say)
then you become very unpopular quickly. In practice that means the sys
admin will see who is flattening the bus and nice that process down till
it stops hurting the others. That means your game now runs at 10% the
CPU power it had a while ago...

As programmers we very rarely have the control over our environment that
we like to think we do.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



Re: [Tutor] A further question about opening and closing files

2015-09-09 Thread Steven D'Aprano
On Wed, Sep 09, 2015 at 08:20:44PM +0200, Laura Creighton wrote:
> In a message of Wed, 09 Sep 2015 17:42:05 +0100, Alan Gauld writes:
> >You can force the writes (I see Laura has shown how) but
> >mostly you should just let the OS do it's thing. Otherwise
> >you risk cluttering up the IO bus and preventing other
> >programs from writing their files.
> 
> Is this something we have to worry about these days?  I haven't
> worried about it for a long time, and write real time multiplayer
> games which demand unbuffered writes. Of course, things
> would be different if I were sending gigabytes of video down the
> pipe, but for the sort of small writes I am doing, I don't think
> there is any performance problem at all.
> 
> Anybody got some benchmarks so we can find out?

Good question!

There's definitely a performance hit, but it's not as big as I expected:

py> with Stopwatch():
...     with open("/tmp/junk", "w") as f:
...         for i in range(10):
...             f.write("a")
...
time taken: 0.129952 seconds

py> with Stopwatch():
...     with open("/tmp/junk", "w") as f:
...         for i in range(10):
...             f.write("a")
...             f.flush()
...
time taken: 0.579273 seconds


What really gets expensive is doing a sync.

py> with Stopwatch():
...     with open("/tmp/junk", "w") as f:
...         fid = f.fileno()
...         for i in range(10):
...             f.write("a")
...             f.flush()
...             os.fsync(fid)
...
time taken: 123.283973 seconds


Yes, that's right. From half a second to two minutes.
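
Stopwatch is not in the standard library; a minimal sketch of such a
timing context manager, assuming timing is all it does:

import time

class Stopwatch(object):
    def __enter__(self):
        self.start = time.time()
        return self
    def __exit__(self, *exc_info):
        print("time taken: %f seconds" % (time.time() - self.start))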


-- 
Steve


Re: [Tutor] More Pythonic?

2015-09-09 Thread Peter Otten
Timo wrote:

> Op 09-09-15 om 15:41 schreef Steven D'Aprano:
>> On Wed, Sep 09, 2015 at 09:05:23AM -0400, richard kappler wrote:
>>> Would either or both of these work, if both, which is the better or more
>>> Pythonic way to do it, and why?
>> The first works, but isn't really the best way to do it:
>>
>>>   ###
>>>
>>> import whatIsNeeded
>>>
>>> writefile = open("writefile", 'a')
>>>
>>> with open(readfile, 'r') as f:
>>>     for line in f:
>>>         if keyword in line:
>>>             do stuff
>>>             f1.write(line)
>>>         else:
>>>             f1.write(line)
>>>
>>> writefile.close()
>>>
>>> ##
>> Better would be this:
>>
>> with open("writefile", 'a') as outfile:
>>     with open("readfile", 'r') as infile:
>>         for line in infile:
>>             if keyword in line:
>>                 do stuff
>>             outfile.write(line)
>>  
> It's also possible to use multiple with statements on the same line. Can
> someone with more expert Python knowledge than me comment on whether
> it's different from using them separately, as mentioned by Steven?
> 
> This is what I had in mind:
> 
> with open("writefile", 'a') as outfile, open("readfile", 'r') as infile:
>     pass  # Rest of the code here

This requires Python 2.7 or higher. Other than that the choice is merely a 
matter of taste. Both versions even produce the same bytecode:

$ cat nested_with.py
def f():
    with open("writefile", 'a') as outfile, open("readfile", 'r') as infile:
        pass  # Rest of the code here

def g():
    with open("writefile", 'a') as outfile:
        with open("readfile", 'r') as infile:
            pass  # Rest of the code here

print(f.__code__.co_code == g.__code__.co_code)
$ python nested_with.py 
True

Personally I find one item per with statement more readable and don't care 
about the extra indentation level.



Re: [Tutor] More Pythonic?

2015-09-09 Thread Timo

Op 09-09-15 om 15:41 schreef Steven D'Aprano:

On Wed, Sep 09, 2015 at 09:05:23AM -0400, richard kappler wrote:

Would either or both of these work, if both, which is the better or more
Pythonic way to do it, and why?

The first works, but isn't really the best way to do it:


  ###

import whatIsNeeded

writefile = open("writefile", 'a')

with open(readfile, 'r') as f:
    for line in f:
        if keyword in line:
            do stuff
            f1.write(line)
        else:
            f1.write(line)

writefile.close()

##

Better would be this:

with open("writefile", 'a') as outfile:
    with open("readfile", 'r') as infile:
        for line in infile:
            if keyword in line:
                do stuff
            outfile.write(line)
 
It's also possible to use multiple with statements on the same line. Can 
someone with more expert Python knowledge than me comment on whether 
it's different from using them separately, as mentioned by Steven?


This is what I had in mind:

with open("writefile", 'a') as outfile, open("readfile", 'r') as infile:
    pass  # Rest of the code here


Timo




Re: [Tutor] A further question about opening and closing files

2015-09-09 Thread Steven D'Aprano
On Wed, Sep 09, 2015 at 10:24:57AM -0400, richard kappler wrote:
> Under a different subject line (More Pythonic?) Steven D'Aprano commented:
> 
> > And this will repeatedly open the file, append one line, then close it
> > again. Almost certainly not what you want -- it's wasteful and
> > potentially expensive.
> 
> And I get that. It does bring up another question though. When using
> 
> with open(somefile, 'r') as f:
>     with open(filename, 'a') as f1:
>         for line in f:
> 
> the file being appended is opened and stays open while the loop iterates,
> then the file closes when exiting the loop, yes? 

The file closes when exiting the *with block*, not necessarily the loop. 
Consider:

with open(blah blah blah) as f:
    for line in f:
        pass
    time.sleep(120)
# file isn't closed until we get here

Even if the file is empty, and there are no lines, it will be held open 
for two minutes.


> Does this not have the
> potential to be expensive as well if you are writing a lot of data to the
> file?

Er, expensive in what way?

Yes, I suppose it is more expensive to write 1 gigabyte of data to a 
file than to write 1 byte. What's your point? If you want to write 1 GB, 
then you have to write 1 GB, and it will take as long as it takes.

Look at it this way: suppose you have to hammer 1000 nails into a fence. 
You can grab your hammer out of your tool box, hammer one nail, put the 
hammer back in the tool box and close the lid, open the lid, take the 
hammer out again, hammer one nail, put the hammer back in the tool box, 
close the lid, open the lid again, take out the hammer...

Or you take the hammer out, hammer 1000 nails, then put the hammer away. 
Sure, while you are hammering those 1000 nails, you're not mowing the 
lawn, painting the porch, walking the dog or any of the dozen other jobs 
you have to do, but you have to hammer those nails eventually.

> I did a little experiment:
> 
> >>> f1 = open("output/test.log", 'a')
> >>> f1.write("this is a test")
> >>> f1.write("this is a test")
> >>> f1.write('why isn\'t this writing')
> >>> f1.close()
> 
> monitoring test.log as I went. Nothing was written to the file until I
> closed it, or at least that's the way it appeared to the text editor in
> which I had test.log open (gedit). In gedit, when a file changes it tells
> you and gives you the option to reload the file. This didn't happen until I
> closed the file. So I'm presuming all the writes sat in a buffer in memory
> until the file was closed, at which time they were written to the file.

Correct. All modern operating systems do that. Writing to disk is slow, 
*hundreds of thousands of times slower* than writing to memory, so the 
operating system will queue up a reasonable amount of data before 
actually forcing it to the disk drive.

 
> Is that actually how it happens, and if so does that not also have the
> potential to cause problems if memory is a concern?

No. The operating system is not stupid enough to queue up gigabytes of 
data. Typically the buffer is something like 128 KB of data (I think), 
or maybe a MB or so. Writing a couple of short lines of text won't fill 
it, which is why you don't see any change until you actually close the 
file. Try writing a million lines, and you'll see something different. 
The OS will flush the buffer when it is full, or when you close the 
file, whichever happens first.

If you know that you're going to take a long time to fill the buffer, 
say you're performing a really slow calculation, and your data is 
trickling in really slowly, then you might do a file.flush() every few 
seconds or so. Or if you're writing an ACID database. But for normal 
use, don't try to out-smart the OS, because you will fail. This is 
really specialised know-how.
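
A sketch of that "flush every few seconds" pattern (produce_lines() is a
hypothetical stand-in for whatever slowly yields your data):

import time

last_flush = time.time()
with open("slow.log", "a") as f:
    for line in produce_lines():  # hypothetical slow data source
        f.write(line)
        if time.time() - last_flush > 5:  # flush at most every 5 seconds
            f.flush()
            last_flush = time.time()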

Have you noticed how slow gedit is to save files? That's because the 
gedit programmers thought they were smarter than the OS, so every time 
they write a file, they call flush() and sync(). Possibly multiple 
times. All that happens is that they slow the writing down greatly. 
Other text editors let the OS manage this process, and saving is 
effectively instantaneous. With gedit, there's a visible pause when it 
saves. (At least in all the versions of gedit I've used.)

And the data is not any more safe than the other text editors, 
because when the OS has written to the hard drive, there is no guarantee 
that the data has hit the platter yet. Hard drives themselves contain 
buffers, and they won't actually write data to the platter until they 
are good and ready.

-- 
Steve


Re: [Tutor] Creating lists with definite (n) items without repetitions

2015-09-09 Thread Francesco Loffredo via Tutor

On 09/09/2015 18:59, Oscar Benjamin wrote:

On 9 September 2015 at 12:05, Francesco Loffredo via Tutor
 wrote:

A quick solution is to add one "dummy" letter to the pool of the OP's
golfers.
I used "!" as the dummy one. This way, you end up with 101 triples, 11 of
which contain the dummy player.
But this is better than the 25-item pool, which resulted in an incomplete set
of triples (for example, A would never play with Z)
So, in your real-world problem, you will have 11 groups of 2 people instead
of 3. Is this a problem?


import pprint, itertools
pool = "abcdefghijklmnopqrstuvwxyz!"

def maketriples(tuplelist):
    final = []
    used = set()
    for a, b, c in tuplelist:
        if (((a, b) in used) or ((b, c) in used) or ((a, c) in used) or
                ((b, a) in used) or ((c, b) in used) or ((c, a) in used)):
            continue
        else:
            final.append((a, b, c))
            used.add((a, b))
            used.add((a, c))
            used.add((b, c))
            used.add((b, a))
            used.add((c, a))
            used.add((c, b))
    return final

combos = list(itertools.combinations(pool, 3))
print("combos contains %s triples." % len(combos))

triples = maketriples(combos)

print("maketriples(combos) contains %s triples." % len(triples))
pprint.pprint(triples)

I don't think the code above works. For n=27 it should count 117
(according to the formula I showed) but instead it comes up with 101.

I tried it with a smaller n by setting pool to range(1, 9+1) meaning
that n=9. The output is:

combos contains 84 triples.
maketriples(combos) contains 8 triples.
[(1, 2, 3),
  (1, 4, 5),
  (1, 6, 7),
  (1, 8, 9),
  (2, 4, 6),
  (2, 5, 7),
  (3, 4, 7),
  (3, 5, 6)]

However I can construct a set of 12 triples containing each number
exactly 4 times which is the exact Steiner triple system:

1 6 8
1 2 3
1 5 9
1 4 7
2 6 7
2 4 9
2 5 8
3 5 7
3 6 9
3 8 4
4 5 6
7 8 9

This is the number of triples predicted by the formula: 9*(9-1)/6 = 12

--
Oscar


That's very interesting! This takes me to my question to Tutors:
what's wrong with the above code?

Francesco


Re: [Tutor] iterating through a directory

2015-09-09 Thread Alan Gauld

On 09/09/15 15:29, richard kappler wrote:


Still not sure how to efficiently get the script to keep moving to the next
file in the directory though, in other words, for each iteration in the
loop, I want it to fetch, rename and send/save the next image in line. Hope
that brings better understanding.


Sounds like you want a circular list.

The traditional way to generate a circular
index into a list is to use the modulo (%) operator.

But the itertools module gives you a better option with
the cycle function:

import itertools as it
import os  # needed for os.listdir

for img in it.cycle(os.listdir(my_img_path)):
    process(img)
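
For comparison, a sketch of the modulo version mentioned above
(my_img_path, xml_lines and process are assumed names, as in the
snippet):

import os

names = os.listdir(my_img_path)
for i, line in enumerate(xml_lines):  # xml_lines: the event feed
    img = names[i % len(names)]       # the index wraps around the pool
    process(img)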

HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos




Re: [Tutor] A further question about opening and closing files

2015-09-09 Thread Alan Gauld

On 09/09/15 15:24, richard kappler wrote:


>>> f1 = open("output/test.log", 'a')
>>> f1.write("this is a test")
>>> f1.write("this is a test")
>>> f1.write('why isn\'t this writing')
>>> f1.close()


monitoring test.log as I went. Nothing was written to the file until I
closed it, or at least that's the way it appeared to the text editor


For a short example like this it's true; for a bigger example the
buffer will be flushed periodically, as it fills up.
This is not a Python thing, it's an OS feature; the same is true
for any program. It's a much more efficient use of the IO bus.
(It's also why you should always explicitly close a file opened
for writing - unless using with, which does it for you.)

You can force the writes (I see Laura has shown how) but
mostly you should just let the OS do its thing. Otherwise
you risk cluttering up the IO bus and preventing other
programs from writing their files.

HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos




Re: [Tutor] Creating lists with definite (n) items without repetitions

2015-09-09 Thread Oscar Benjamin
On 9 September 2015 at 12:05, Francesco Loffredo via Tutor
 wrote:
> Oscar Benjamin wrote:
>
> The problem is that there are 26 people and they are divided into
> groups of 3 each day. We would like to know if it is possible to
> arrange it so that each player plays each other player exactly once
> over some period of days.
>
> It is not possible to do this exactly with 26 people in groups of 3.
> Think about it from the perspective of 1 person. They must play
> against all 25 other people in pairs with neither of the other people
> repeated: the set of pairs they play against must partition the set of
> other players. Clearly it can only work if the number of other players
> is even.
>
> I'm not sure but I think that maybe for an exact solution you need to
> have n = 1 (mod 6) or n = 3 (mod 6) which gives:
> n = 1, 3, 7, 9, 13, 15, 19, 21, 25, 27, ...
>
> The formula for the number of triples if the exact solution exists is
> n*(n-1)/6, which comes out as 26*25/6 = 108.33... (the formula doesn't
> give an integer because the exact solution doesn't exist).
> 
>
> A quick solution is to add one "dummy" letter to the pool of the OP's
> golfers.
> I used "!" as the dummy one. This way, you end up with 101 triples, 11 of
> which contain the dummy player.
> But this is better than the 25-item pool, which resulted in an incomplete set
> of triples (for example, A would never play with Z)
> So, in your real-world problem, you will have 11 groups of 2 people instead
> of 3. Is this a problem?
>
>
> import pprint, itertools
> pool = "abcdefghijklmnopqrstuvwxyz!"
>
> def maketriples(tuplelist):
>     final = []
>     used = set()
>     for a, b, c in tuplelist:
>         if (((a, b) in used) or ((b, c) in used) or ((a, c) in used) or
>                 ((b, a) in used) or ((c, b) in used) or ((c, a) in used)):
>             continue
>         else:
>             final.append((a, b, c))
>             used.add((a, b))
>             used.add((a, c))
>             used.add((b, c))
>             used.add((b, a))
>             used.add((c, a))
>             used.add((c, b))
>     return final
>
> combos = list(itertools.combinations(pool, 3))
> print("combos contains %s triples." % len(combos))
>
> triples = maketriples(combos)
>
> print("maketriples(combos) contains %s triples." % len(triples))
> pprint.pprint(triples)

I don't think the code above works. For n=27 it should count 117
(according to the formula I showed) but instead it comes up with 101.

I tried it with a smaller n by setting pool to range(1, 9+1) meaning
that n=9. The output is:

combos contains 84 triples.
maketriples(combos) contains 8 triples.
[(1, 2, 3),
 (1, 4, 5),
 (1, 6, 7),
 (1, 8, 9),
 (2, 4, 6),
 (2, 5, 7),
 (3, 4, 7),
 (3, 5, 6)]

However I can construct a set of 12 triples containing each number
exactly 4 times which is the exact Steiner triple system:

1 6 8
1 2 3
1 5 9
1 4 7
2 6 7
2 4 9
2 5 8
3 5 7
3 6 9
3 8 4
4 5 6
7 8 9

This is the number of triples predicted by the formula: 9*(9-1)/6 = 12
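
A quick sanity check (a sketch) that these 12 triples cover each of the
C(9,2) = 36 pairs exactly once:

from itertools import combinations

triples = [(1,6,8), (1,2,3), (1,5,9), (1,4,7), (2,6,7), (2,4,9),
           (2,5,8), (3,5,7), (3,6,9), (3,8,4), (4,5,6), (7,8,9)]
pairs = [frozenset(p) for t in triples for p in combinations(t, 2)]
print(len(pairs) == len(set(pairs)) == 36)  # True: an exact cover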

--
Oscar


Re: [Tutor] A further question about opening and closing files

2015-09-09 Thread Laura Creighton
In a message of Wed, 09 Sep 2015 17:42:05 +0100, Alan Gauld writes:
>You can force the writes (I see Laura has shown how) but
>mostly you should just let the OS do its thing. Otherwise
>you risk cluttering up the IO bus and preventing other
>programs from writing their files.

Is this something we have to worry about these days?  I haven't
worried about it for a long time, and write real time multiplayer
games which demand unbuffered writes. Of course, things
would be different if I were sending gigabytes of video down the
pipe, but for the sort of small writes I am doing, I don't think
there is any performance problem at all.

Anybody got some benchmarks so we can find out?

Laura



Re: [Tutor] A further question about opening and closing files

2015-09-09 Thread Alan Gauld

On 09/09/15 19:20, Laura Creighton wrote:

In a message of Wed, 09 Sep 2015 17:42:05 +0100, Alan Gauld writes:

You can force the writes (I see Laura has shown how) but
mostly you should just let the OS do its thing. Otherwise
you risk cluttering up the IO bus and preventing other
programs from writing their files.

Is this something we have to worry about these days?  I haven't
worried about it for a long time, and write real time multiplayer
games which demand unbuffered writes. Of course, things
would be different if I were sending gigabytes of video down the
pipe, but for the sort of small writes I am doing, I don't think
there is any performance problem at all.

Anybody got some benchmarks so we can find out?

Laura

If you are working on a small platform - think mobile device - and it has
a single channel bus to the storage area then one of the worst things
you can do is write lots of small chunks of data to it. The overhead
(in hardware) of opening and locking the bus is almost as much as
the data transit time and so can choke the bus for a significant amount
of time (I'm talking milliseconds here but in real-time that's significant).

But even on a major OS platform bus contention does occasionally rear
its head. I've seen multi-processor web servers "lock up" due to too many
threads dumping data at once. Managing the data bus is (part of) what
the OS is there to do; it's best to let it do its job, second guessing
it is rarely the right thing.

Remember, the impact is never on your own program, it's on all the
other processes running on the same platform. There are usually tools
to monitor the IO bus performance though, so it's fairly easy to
diagnose/check.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



Re: [Tutor] iterating through a directory

2015-09-09 Thread Mark Lawrence

On 09/09/2015 14:32, richard kappler wrote:

Yes, many questions today. I'm working on a data feed script that feeds
'events' into our test environment. In production, we monitor a camera that
captures an image as product passes by, gathers information such as
barcodes and package ID from the image, and then sends out the data as a
line of xml to one place for further processing and sends the image to
another place for storage. Here is where our software takes over, receiving
the xml data and images from vendor equipment. Our software then processes
the xml data and allows retrieval of specific images associated with each
line of xml data. Our test environment must simulate the feed from the
vendor equipment, so I'm writing a script that feeds the xml from an actual
log to one place for processing, and pulls images from a pool of 20, which
we recycle, to associate with each event and sends them to another dir for
saving and searching.

  As my script iterates through each line of xml data (discussed yesterday)
to simulate the feed from the vendor camera equipment, it parses ID
information about the event then sends the line of data on. As it does so,
it needs to pull the next image in line from the image pool directory,
rename it, send it to a different directory for saving. I'm pretty solid on
all of this except iterating through the image pool. My idea is to just
keep looping through the image pool, as each line of xmldata is parsed, the
next image in line gets pulled out, renamed with the identifying
information from the xml data, and both are sent on to different places.

I only have one idea for doing this iterating through the image pool, and
frankly I think the idea is pretty weak. The only thing I can come up with
is to change the name of each image in the pool from something
like 0219PS01CT1_2029_04_00044979.jpg to 1.jpg, 2.jpg, 3.jpg etc.,
then doing something like:

i = 0

here begins my for line in file loop:
    if i == 20:
        i = 1
    else:
        i += 1
    do stuff with the xml file including get ID info
    rename i.jpg to IDinfo.jpg
    send it all on

That seems pretty barbaric, any thoughts on where to look for better ideas?
I'm presuming there are modules out there that I am unaware of or
capabilities I am unaware of in modules I do know a little about that I am
missing.



I'm not sure what you're trying to do, so maybe
https://docs.python.org/3/library/collections.html#collections.deque or
https://docs.python.org/3/library/itertools.html#itertools.cycle will help.
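
For instance, a deque can be rotated to cycle through the pool endlessly
(a sketch; the file names are made up):

from collections import deque

pool = deque(["1.jpg", "2.jpg", "3.jpg"])
for _ in range(5):
    img = pool[0]
    pool.rotate(-1)  # move the head to the back: an endless cycle
    print(img)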


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: [Tutor] A further question about opening and closing files

2015-09-09 Thread Laura Creighton
In a message of Wed, 09 Sep 2015 20:25:06 +0100, Alan Gauld writes:
>On 09/09/15 19:20, Laura Creighton wrote:
>If you are working on a small platform - think mobile device - and it has
>a single channel bus to the storage area then one of the worst things
>you can do is write lots of small chunks of data to it. The overhead
>(in hardware) of opening and locking the bus is almost as much as
>the data transit time and so can choke the bus for a significant amount
>of time (I'm talking milliseconds here but in real-time that's significant).

But if I shoot you with my laser cannon, I want you to get the
message that you are dead _now_ and not when some buffer fills up ...

Laura
