Re: [Tutor] Moving from snippits to large projects?

2012-01-08 Thread Hugo Arts
On Mon, Jan 9, 2012 at 3:09 AM, Leam Hall wrote:
> I'm taking the O'Reilly Python 2 course on-line, and enjoying it. Well, when
> Eclipse works, anyway. I'm still getting the hang of that.
>
> While my coding over the years has been small snippets in shell, PHP, and a
> little C, Python, and Perl, I've never made the transition from dozens of
> lines to hundreds or thousands. I'd like to start working on that transition
> but the projects I know about are much larger than my brain can handle and
> there are a lot of other corollary tools and practices to learn.
>
> How does one go from small to medium, to large, as a coder? Large projects,
> that is. I've gotten the "large as in too much pizza" thing down.  ;)
>

Well, the best advice I could offer is to get in over your head. Pick
a large project, think a bit about how you'd structure it, then jump
right in! This is what I did, and the result was that I learned so
much that I abandoned it about halfway through and started over,
saying "I went about this totally wrong, let's get it right this
time!"

That process repeated itself a lot of times, and each time I came out
with new lessons learned about how to structure large projects.
Honestly, learning by doing is the best. You'll be unhappy about a ton
of your projects, abandon some, finish others (honestly, just
finishing something should be enough to be proud of by my standards).
The important thing is to just code and realize it's ok to not know
what you're doing most of the time (well, as long as you're not
getting paid for it anyway).

A few things are invaluable when working with larger projects:

- The Python debugger, pdb. Debugging with print statements is fine
for smaller stuff, but for complicated software a debugger is a nice
tool to have.
- Version control. Crucial for working in a team, but even when coding
solo on something big it's nice to have branches and rollbacks.
You'd be best off just getting used to this and using it for all
projects. I work with git, but anything is better than nothing. Pick
up a popular one and go with it.
- Unit testing. Some people consider this optional (I never actually
got into it myself), but it's worth taking a look at.

I won't go into detail concerning any of these. None of them are
python specific anyway. I suggest you google them yourself and learn
gradually, by doing. It's the best way.
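
For a flavour of the unit testing and pdb items in the list, here is a minimal sketch that is not part of the original message. The names `word_count` and `WordCountTest` are hypothetical, made up purely for illustration; only the `unittest` and `pdb` standard-library modules are real.

```python
import unittest


def word_count(text):
    """Count whitespace-separated words (a stand-in for real project code)."""
    return len(text.split())


class WordCountTest(unittest.TestCase):
    def test_counts_words(self):
        self.assertEqual(word_count("learn by doing"), 3)

    def test_empty_string(self):
        self.assertEqual(word_count(""), 0)


# Run the tests explicitly so the result can be inspected afterwards.
# To step through a failure, put `import pdb; pdb.set_trace()` on the line
# of interest, or start the script with `python -m pdb script.py`.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(WordCountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping a test file like this next to each module is one common way to make "abandon and start over" cheaper, since the tests survive the rewrite.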

HTH,
Hugo
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


[Tutor] Moving from snippits to large projects?

2012-01-08 Thread Leam Hall
I'm taking the O'Reilly Python 2 course on-line, and enjoying it. Well, 
when Eclipse works, anyway. I'm still getting the hang of that.


While my coding over the years has been small snippets in shell, PHP, 
and a little C, Python, and Perl, I've never made the transition from 
dozens of lines to hundreds or thousands. I'd like to start working on 
that transition but the projects I know about are much larger than my 
brain can handle and there are a lot of other corollary tools and 
practices to learn.


How does one go from small to medium, to large, as a coder? Large 
projects, that is. I've gotten the "large as in too much pizza" thing 
down.  ;)


Leam


Re: [Tutor] making a custom file parser?

2012-01-08 Thread Hugo Arts
On Mon, Jan 9, 2012 at 2:19 AM, Devin Jeanpierre wrote:
>> Parsing XML with regular expressions is generally a very bad idea. In
>> the general case, it's actually impossible. XML is not what is called
>> a regular language, and therefore cannot be parsed with regular
>> expressions. You can use regular expressions to grab a limited amount
>> of data from a limited set of XML files, but this is dangerous, hard,
>> and error-prone.
>
> Python regexes aren't regular, and this isn't XML.
>
> A working XML parser has been written using .NET regexes (sorry, no
> citation -- can't find it), and they only have one extra feature
> (recursion, of course). And it was dreadfully ugly and nasty and
> probably terrible to maintain -- that's the real cost of regexes.
>

IIRC, Python's only non-regular feature is backreferences though; I'm
pretty sure that isn't enough to parse XML. It does not make it
powerful enough to parse context-free languages. I really would like
that citation though, tried googling for it but not much turned up.
I'm not calling bs or anything, I don't know anything about .NET
regexes and I'll readily believe it can be done (I just want to see
the code for myself). But really I still wouldn't dare try without a
feature set like Perl 6's regexes. And even then...

You're technically correct (it's the best kind), but I feel like it
doesn't really take away the general correctness of my advice ;)

> In particular, his data actually does look regular.
>

Quite right. We haven't seen enough of it to be sure, but that little
bite seems parseable enough with some basic string methods and one or
two regexes. That's really all you need, and trying to do the whole
thing with pure regex is just needlessly overcomplicating things (I'm
pretty sure we all actually agree on that).

>> I'll assume that said "\<unit\>(.*)\</unit\>". There's still a few
>> problems: < and > shouldn't be escaped, which is why you're not
>> getting any matches.
>> Also you shouldn't use * because it is greedy, matching as much as
>> possible. So it would match everything in between the first <unit> and
>> the last </unit> tag in the file, including other <unit> tags
>> that might show up.
>
> On the "can you do work with this with regexes" angle: if units can be
> nested, then neither greedy nor non-greedy matching will work. That's
> a particular case where regular expressions can't work for your data.
>
>> Test it carefully, ditch elementtree, use as little regexes as
>> possible (string functions are your friends! startswith, split, strip,
>> et cetera) and you might end up with something that is only slightly
>> ugly and mostly works. That said, I'd still advise against it. turning
>> the files into valid XML and then using whatever XML parser you fancy
>> will probably be easier.
>
> He'd probably do that using regexes.
>

Yeah, that's what I was thinking when I said it too. Something like,
one regex to quote attributes, and one that adds close tags at the
earliest opportunity. Like right before a newline? It looks okay based
on just that sample, but it's really hard to say. The viability of
regexes depends so much on the dataset you have. If you can make the
dataset valid XML with just three regexes (quotes, end tags, comments)
then just parse it that way, that sounds like the simplest possible
option.
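
A rough sketch of that "repair, then parse" idea follows. The sample line and the attribute-quoting pattern are assumptions, since the real data was not posted in full, and the sketch covers only the quoting step, not closing tags or comments.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input: attribute values lack quotes, so it is not valid XML.
raw = "<unit name=archer hp=10></unit>"

# Wrap bare attribute values in double quotes. This assumes values contain
# no whitespace, quotes, or angle brackets.
quoted = re.sub(r'(\w+)=([^\s"<>]+)', r'\1="\2"', raw)

# Once the text is valid XML, any XML parser can take over.
unit = ET.fromstring(quoted)
print(quoted)
print(unit.attrib)
```

Whether a single substitution like this is enough depends entirely on how regular the real files are, which is the point made above.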

> Easiest way is probably to write a real parser using some PEG or CFG
> thingy. Less error-prone.
>

You mean like flex/bison? May be overkill, but then again, maybe not.
So much depends on the data.

> Overall agree with advice, though. Just being picky. Sorry.
>
> -- Devin
>
>

I love being picky myself, so I don't mind, as long as there is a
disclaimer somewhere ;) Cheers,
Hugo


Re: [Tutor] making a custom file parser?

2012-01-08 Thread Devin Jeanpierre
> Parsing XML with regular expressions is generally a very bad idea. In
> the general case, it's actually impossible. XML is not what is called
> a regular language, and therefore cannot be parsed with regular
> expressions. You can use regular expressions to grab a limited amount
> of data from a limited set of XML files, but this is dangerous, hard,
> and error-prone.

Python regexes aren't regular, and this isn't XML.

A working XML parser has been written using .NET regexes (sorry, no
citation -- can't find it), and they only have one extra feature
(recursion, of course). And it was dreadfully ugly and nasty and
probably terrible to maintain -- that's the real cost of regexes.

In particular, his data actually does look regular.

> I'll assume that said "\<unit\>(.*)\</unit\>". There's still a few
> problems: < and > shouldn't be escaped, which is why you're not
> getting any matches.
> Also you shouldn't use * because it is greedy, matching as much as
> possible. So it would match everything in between the first <unit> and
> the last </unit> tag in the file, including other <unit> tags
> that might show up.

On the "can you do work with this with regexes" angle: if units can be
nested, then neither greedy nor non-greedy matching will work. That's
a particular case where regular expressions can't work for your data.
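
To make the nesting problem concrete, here is a small sketch. The tag name `unit` and the sample strings are assumptions for illustration; the actual file contents were not shown.

```python
import re

# Hypothetical samples, not taken from the real data file.
flat = "<unit>a</unit><unit>b</unit>"
nested = "<unit>a<unit>b</unit>c</unit><unit>d</unit>"

greedy = re.compile(r"<unit>(.*)</unit>", re.DOTALL)
lazy = re.compile(r"<unit>(.*?)</unit>", re.DOTALL)

# Flat data: greedy swallows both units in one match, lazy finds each one.
flat_greedy = greedy.findall(flat)    # ['a</unit><unit>b']
flat_lazy = lazy.findall(flat)        # ['a', 'b']

# Nested data: greedy still swallows everything, and lazy stops at the
# first closing tag, cutting the outer unit short.
nested_greedy = greedy.findall(nested)  # ['a<unit>b</unit>c</unit><unit>d']
nested_lazy = lazy.findall(nested)      # ['a<unit>b', 'd']
```

Neither result for the nested sample corresponds to the two top-level units, which is why a pattern alone cannot handle nesting here.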

> Test it carefully, ditch elementtree, use as little regexes as
> possible (string functions are your friends! startswith, split, strip,
> et cetera) and you might end up with something that is only slightly
> ugly and mostly works. That said, I'd still advise against it. turning
> the files into valid XML and then using whatever XML parser you fancy
> will probably be easier.

He'd probably do that using regexes.

Easiest way is probably to write a real parser using some PEG or CFG
thingy. Less error-prone.
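
As a rough illustration of what a "real parser" buys you, here is a hand-written, stack-based sketch rather than an actual PEG or CFG tool. The `unit` tag and the function name `parse_units` are hypothetical; the real format may need more token types.

```python
import re

# Tokens: an opening tag, a closing tag, or a run of plain text.
# Note: a stray "<" that is not part of a tag is silently skipped here.
TOKEN = re.compile(r"<unit>|</unit>|[^<]+")


def parse_units(text):
    """Return nested lists: each unit is a list of strings and sub-units."""
    root = []
    stack = [root]  # the innermost open unit is always stack[-1]
    for token in TOKEN.findall(text):
        if token == "<unit>":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif token == "</unit>":
            if len(stack) == 1:
                raise ValueError("unexpected closing tag")
            stack.pop()
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("missing closing tag")
    return root


tree = parse_units("<unit>a<unit>b</unit>c</unit><unit>d</unit>")
print(tree)
```

The stack is what tracks nesting depth, which is the piece a plain greedy or non-greedy pattern cannot do.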

Overall agree with advice, though. Just being picky. Sorry.

-- Devin


On Sat, Jan 7, 2012 at 3:15 PM, Hugo Arts wrote:
> On Sat, Jan 7, 2012 at 8:22 PM, Alex Hall wrote:
>> I had planned to parse myself, but am not sure how to go about it. I
>> assume regular expressions, but I couldn't even find the amount of
>> units in the file by using:
>> unitReg=re.compile(r"\<unit\>(*)\</unit\>")
>> unitCount=unitReg.search(fileContents)
>> print "number of units: "+unitCount.len(groups())
>>
>> I just get an exception that "None type object has no attribute
>> groups", meaning that the search was unsuccessful. What I was hoping
>> to do was to grab everything between the opening and closing unit
>> tags, then read it one at a time and parse further. There is a tag
>> inside a unit tag called AttackTable which also terminates, so I would
>> need to pull that out and work with it separately. I probably just
>> have misunderstood how regular expressions and groups work...
>>
>
> Parsing XML with regular expressions is generally a very bad idea. In
> the general case, it's actually impossible. XML is not what is called
> a regular language, and therefore cannot be parsed with regular
> expressions. You can use regular expressions to grab a limited amount
> of data from a limited set of XML files, but this is dangerous, hard,
> and error-prone.
>
> As long as you realize this, though, you could possibly give it a shot
> (here be dragons, you have been warned).
>
>> unitReg=re.compile(r"\<unit\>(*)\</unit\>")
>
> This is probably not what you actually did, because it fails with a
> different error:
>
> >>> a = re.compile(r"\<unit\>(*)\</unit\>")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/re.py", line 188, in compile
>   File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/re.py", line 243, in _compile
> sre_constants.error: nothing to repeat
>
> I'll assume that said "\<unit\>(.*)\</unit\>". There's still a few
> problems: < and > shouldn't be escaped, which is why you're not
> getting any matches.
> Also you shouldn't use * because it is greedy, matching as much as
> possible. So it would match everything in between the first <unit> and
> the last </unit> tag in the file, including other <unit> tags
> that might show up. What you want is more like this:
>
> unit_reg = re.compile(r"<unit>(.*?)</unit>")
>
> Test it carefully, ditch elementtree, use as little regexes as
> possible (string functions are your friends! startswith, split, strip,
> et cetera) and you might end up with something that is only slightly
> ugly and mostly works. That said, I'd still advise against it. turning
> the files into valid XML and then using whatever XML parser you fancy
> will probably be easier. Adding quotes and closing tags and removing
> comments with regexes is still bad, but easier than parsing the whole
> thing with regexes.
>
> HTH,
> Hugo

Re: [Tutor] different behaviour in Idle shell vs Mac terminal

2012-01-08 Thread Alan Gauld

On 08/01/12 23:34, Adam Gold wrote:

> I have short piece of code I'm using to print a string to
> the terminal one letter at a time.  It works fine when
> I invoke the script from within Idle; each letter appears
> after the preceding one according to the designated time
> interval.
> However if I run it in the Mac terminal
> ('python3 ./script.py'),
> there's a pause and then the whole string prints in one go.

That's because you are writing to stdout rather than using print.
The output is buffered and the terminal prints the output after the
buffer is flushed, which happens at the end of the program
(probably when the file object is auto closed). If you use print
that shouldn't happen.

The alternative is to explicitly flush() the file after each write.


> import sys
> import time
>
> text = "this text is printing one letter at a time..."
> for char in text:
>     sys.stdout.write(char)

either use

    print char,  # comma suppresses \n

or

    sys.stdout.write(char)
    sys.stdout.flush()

HTH

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/



Re: [Tutor] different behaviour in Idle shell vs Mac terminal

2012-01-08 Thread Steven D'Aprano

Adam Gold wrote:

> I have short piece of code I'm using to print a string to the terminal one
> letter at a time.  It works fine when I invoke the script from within Idle;
> each letter appears after the preceding one according to the designated time
> interval.  However if I run it in the Mac terminal ('python3 ./script.py'),
> there's a pause and then the whole string prints in one go.  Here's the
> relevant code:
>
> import sys
> import time
>
> text = "this text is printing one letter at a time..."
> for char in text:
>     sys.stdout.write(char)
>     time.sleep(0.03)
>
> I'm thinking this may be a tty issue (is stdout going to the right terminal?)

It's a buffering issue.

[...]

> P.S. if it's relevant, this is part of a simple financial maths program and
> it's used to display the results after certain inputs have been gathered.


To annoy your users? I'm not sure why you think it's a good idea to pretend 
that the computer has to type the letters one at a time. This isn't some 
stupid imaginary program in a Hollywood movie, I assume it is meant to 
actually be useful and usable, and trust me on this, waiting while the program 
pretends to type gets old *really* fast.


(What are you trying to emulate? A stock ticker or something? Do those things 
still even exist? I haven't seen one since the last time I watched the Addams 
Family TV series. The old one, in black & white.)


But if you must, after writing each character, call sys.stdout.flush() to 
flush the buffer.




--
Steven



[Tutor] different behaviour in Idle shell vs Mac terminal

2012-01-08 Thread Adam Gold

I have short piece of code I'm using to print a string to the terminal one 
letter at a time.  It works fine when I invoke the script from within Idle; 
each letter appears after the preceding one according to the designated time 
interval.  However if I run it in the Mac terminal ('python3 ./script.py'), 
there's a pause and then the whole string prints in one go.  Here's the 
relevant code:

import sys
import time

text = "this text is printing one letter at a time..."
for char in text:
    sys.stdout.write(char)
    time.sleep(0.03)

I'm thinking this may be a tty issue (is stdout going to the right terminal?) 
but I'm still finding my way and would therefore appreciate any guidance.  Of 
course if there's a better way of printing out one letter at a time, I'm also 
interested to know that.  Thanks.

P.S. if it's relevant, this is part of a simple financial maths program and 
it's used to display the results after certain inputs have been gathered.
  


Re: [Tutor] Zip, tar, and file handling

2012-01-08 Thread Alexander Etter

On Jan 6, 2012, at 22:57, daedae11 wrote:

> I was asked to write a program to move files between ZIP(.zip) and 
> TAR/GZIP(.tgz/.tar.gz) or TAR/BZIP2(.tbz/.tar.bz2) archive.
>  
> my code is:
>  
>  
> import zipfile;
> import tarfile;
> import os;
> from os import path ;
>
> def showAllFiles(fileObj):
>     if fileObj.filename.endswith("zip"):
>         if isinstance(fileObj, zipfile.ZipFile):
>             print "j"*20;
>         for name in fileObj.namelist():
>             print name;
>     else:
>         for name in fileObj.getnames():
>             print name;
>
> def moveFile(srcObj, dstObj):
>     fileName = raw_input("input the name of the file to move: ");
>     srcObj.extract(fileName);
>     if isinstance(dstObj, zipfile.ZipFile):
>         dstObj.write(fileName);
>     else:
>         dstObj.addfile(tarfile.TarInfo(fileName));
>     os.remove(fileName);
>
> def main():
>     intro = """
> enter a choice
> (M)ove file from source file to destination file
> (S)how all the files in source file
> (Q)uit
> your choice is: """
>     srcFile = raw_input("input the source file name: ");
>     dstFile = raw_input("input the destination file name: ");
>     while True:
>         with ( zipfile.ZipFile(srcFile, "r") if srcFile.endswith("zip") else
>                tarfile.open(srcFile, "r"+":"+path.splitext(srcFile)[1][1:]) ) as srcObj, \
>              ( zipfile.ZipFile(dstFile, "r") if dstFile.endswith("zip") else
>                tarfile.open(dstFile, "w"+":"+path.splitext(dstFile)[1][1:]) ) as dstObj:
>             choice = raw_input(intro)[0].lower();
>             if choice == "s":
>                 showAllFiles(srcObj);
>             elif choice == "m":
>                 moveFile(srcObj, dstObj);
>             elif choice == "q":
>                 break;
>             else:
>                 print "invalid command!"
>
> if __name__ == '__main__':
>     main();
>  
> But there are some problems.
> 1. It could extract file successfully, but can't add files to .tar.gz file.
> 2. I think it's a little tedious, but I don't know how to improve it.
>  
> Please  give me some help , thank you!
>  
> daedae11

Hi there. I would start by handling file extensions other than ZIP in your
first two functions. Why not check whether the file is a tgz or tbz within the
functions? Also, I don't see the purpose of the first function, showAllFiles.
It prints out twenty "j"s?
Looking forward to your response.
Alexander