Re: [Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-02 Thread Nick W
I'd suggest opening newfile after outfile is defined, and also adding a close
statement for newfile, or using it with 'with', such as:

... and replace the last line like so:
with open(outfile, 'w') as newfile:
    main(mystring, infile, newfile)

(and looking muchly improved, well done)
Nick


On Fri, Feb 1, 2013 at 8:57 PM, Scurvy Scott  wrote:

> And just for the record's sake, this is what I've gotten and you guys
> should see obviously that you helped a lot and I learned a thing or
> two so I won't have to ask the same silly questions next time:
>
>
>
>
> def main(mystring, infile, outfile):
>     with open('infile', 'r') as inF:
>         for index, line in enumerate(inF):
>             if myString in line:
>                 newfile.write("string %s found on line #%d" % (line, index))
> print "complete."
>
>
> if __name__ == '__main__':
>     import sys
>     newfile = open('outfile', 'w')
>     help_text = "usage: python scanfile.py STRINGTOSEARCH IMPORTFILENAME OUTPUTFILENAME"
>     if '-h' in sys.argv or '--help' in sys.argv or len(sys.argv) == 0:
>         print (help_text)
>         sys.exit()
>     myString = sys.argv[1]
>     infile = sys.argv[2]
>     outfile = sys.argv[3]
>     main(mystring, infile, outfile)
>
> Look right to you? Looks okay to me, except maybe the three ORs in the
> information line, is there a more pythonic way to accomplish that
> task?
>
> Scott
>
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-02 Thread Alan Gauld

On 02/02/13 04:57, Scurvy Scott wrote:

It may just be an email thing but...


def main(mystring, infile, outfile):
    with open('infile', 'r') as inF:
        for index, line in enumerate(inF):
            if myString in line:
                newfile.write("string %s found on line #%d" % (line, index))
print "complete."


The print should be inside the function not outside.

And main is probably not the best name. You could call it
printFoundString or somesuch...

'main' is usually used to collect all the program driver code that you 
currently have under the if name... test. The stuff you wouldn't ever 
use if importing as a module. Your way works fine too, a main is not 
obligatory. :-)



if __name__ == '__main__':
    import sys
    newfile = open('outfile', 'w')
    help_text = "usage: python scanfile.py STRINGTOSEARCH IMPORTFILENAME OUTPUTFILENAME"
    if '-h' in sys.argv or '--help' in sys.argv or len(sys.argv) == 0:


The last test should be == 1, since the program name will always be
there. But in fact you need the string and file args too, so it
should really be:

    len(sys.argv) < 4

With fewer than 4 items in sys.argv (the script name plus the three
arguments) your code breaks...


    myString = sys.argv[1]
    infile = sys.argv[2]
    outfile = sys.argv[3]
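
Putting the points above together, a minimal sketch might look like the following. The names print_found_string and run are illustrative and do not come from the thread, and starting enumerate at 1 is an assumption that human-style line numbers are wanted.

```python
import sys

HELP_TEXT = "usage: python scanfile.py STRINGTOSEARCH IMPORTFILENAME OUTPUTFILENAME"


def print_found_string(mystring, infile, outfile):
    """Write every line of infile that contains mystring to outfile.

    Returns the number of matching lines.
    """
    matches = 0
    with open(infile, 'r') as in_f, open(outfile, 'w') as out_f:
        # start=1 is an assumption: it gives 1-based line numbers.
        for index, line in enumerate(in_f, 1):
            if mystring in line:
                out_f.write("string %s found on line #%d\n"
                            % (line.rstrip("\n"), index))
                matches += 1
    return matches


def run(argv):
    """Driver code; argv is expected to look like sys.argv."""
    # argv[0] is the script name, so three arguments means len(argv) == 4.
    if '-h' in argv or '--help' in argv or len(argv) < 4:
        print(HELP_TEXT)
        return 1
    print_found_string(argv[1], argv[2], argv[3])
    print("complete.")
    return 0

# Under the "if __name__ == '__main__':" test, the call would be:
#     sys.exit(run(sys.argv))
```

Because the files are opened inside the function with 'with', they are closed automatically and nothing is opened before the arguments have been checked.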


HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/



Re: [Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-01 Thread Steven D'Aprano

On 02/02/13 15:31, Scurvy Scott wrote:

One question related to the instruction aspect- does this make sense to you?

If len(sys.argv) == 0:
 print "usage: etc etc etc"



Right idea, but not quite correct.

sys.argv always includes at least one item, the name of the script being 
called. This is why you will often see code like this, to extract the actual 
arguments:


argv = sys.argv[1:]  # slice excluding the first item
if len(argv) == 0:
    usage()
else:
    main(argv)


or similar.


--
Steven


Re: [Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-01 Thread Scurvy Scott
And just for the record's sake, this is what I've gotten and you guys
should see obviously that you helped a lot and I learned a thing or
two so I won't have to ask the same silly questions next time:




def main(mystring, infile, outfile):
    with open('infile', 'r') as inF:
        for index, line in enumerate(inF):
            if myString in line:
                newfile.write("string %s found on line #%d" % (line, index))
print "complete."


if __name__ == '__main__':
    import sys
    newfile = open('outfile', 'w')
    help_text = "usage: python scanfile.py STRINGTOSEARCH IMPORTFILENAME OUTPUTFILENAME"
    if '-h' in sys.argv or '--help' in sys.argv or len(sys.argv) == 0:
        print (help_text)
        sys.exit()
    myString = sys.argv[1]
    infile = sys.argv[2]
    outfile = sys.argv[3]
    main(mystring, infile, outfile)

Look right to you? Looks okay to me, except maybe the three ORs in the
information line, is there a more pythonic way to accomplish that
task?

Scott



Re: [Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-01 Thread Mitya Sirenef

On 02/01/2013 11:31 PM, Scurvy Scott wrote:



> Steve-
> thanks a lot for showing me the if __name__ = main part
> I've often wondered how it was used and it didn't make sense until I
> saw it in my own code if that makes any sense.
> Also appreciate the help on the "instructional" side of things.
>
> One question related to the instruction aspect- does this make sense to you?
>
> If len(sys.argv) == 0:
>     print "usage: etc etc etc"


"If" should not be capitalized; sys.argv always has the name
of the script as the first item.

So, with three arguments expected, you need to do something like:

if len(sys.argv) != 4:
    print "usage: ..."

HTH!  -m


--
Lark's Tongue Guide to Python: http://lightbird.net/larks/

Emotion is primarily about nothing and much of it remains about nothing to
the end.  George Santayana



Re: [Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-01 Thread Scurvy Scott

Steve-
 thanks a lot for showing me the if __name__ = main part
I've often wondered how it was used and it didn't make sense until I
saw it in my own code if that makes any sense.
Also appreciate the help on the "instructional" side of things.

One question related to the instruction aspect- does this make sense to you?

If len(sys.argv) == 0:
    print "usage: etc etc etc"



Nick, Dave, and Steve, again, you guys are awesome. Thanks for all your help.

Scott


Re: [Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-01 Thread Steven D'Aprano

On 02/02/13 08:24, Scurvy Scott wrote:

One last question on this topic..

I'd like to call the files and the string from the command line like

Python whatever.py STRINGTOSEARCH NEWFILE FILETOOPEN

My understanding is that it would be accomplished as such

import sys

myString = sys.argv[1]
filetoopen = sys.argv[2]
newfile = sys.argv[3]


ETC ETC CODE HERE

Is this correct/pythonic? Is there a more recommended way? Am I retarded?




Best practice is to check if your program is being run as a script before doing 
anything. That way you can still import the module for testing or similar:


def main(mystring, infile, outfile):
   # do stuff here


if __name__ == '__main__':
   # Running as a script.
   import sys
   mystring = sys.argv[1]
   infile = sys.argv[2]
   outfile = sys.argv[3]
   main(mystring, infile, outfile)



Best practice for scripts (not just Python scripts, but *any* script) is to provide help 
when asked. Insert this after the "import sys" line, before you start 
processing:

   if '-h' in sys.argv or '--help' in sys.argv:
       print(help_text)
       sys.exit()



If your argument processing is more complicated than the above, you should use one 
of the three argument parsing modules that Python provides:

http://docs.python.org/2/library/getopt.html
http://docs.python.org/2/library/optparse.html (deprecated -- do not use this 
for new code)
http://docs.python.org/2/library/argparse.html


getopt is (in my opinion) the simplest to get started, but the weakest.

There are also third-party argument parsers that you could use. Here's one 
which I have never used but am intrigued by:

http://docopt.org/
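
As an illustration of the argparse option listed above, here is a minimal sketch for the three-argument case in this thread. The function name build_parser and the help strings are made up for the example; only the argparse calls themselves are standard.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        prog="scanfile.py",
        description="Report the lines of a file that contain a string.")
    # Positional arguments are required: argparse prints the usage line and
    # exits if one is missing, and it handles -h/--help by itself.
    parser.add_argument("mystring", help="string to search for")
    parser.add_argument("infile", help="file to read")
    parser.add_argument("outfile", help="file to write matches to")
    return parser


# In a script this would normally be parse_args() with no arguments, which
# reads sys.argv[1:]. An explicit list is passed here for illustration.
args = build_parser().parse_args(["The String", "largeFile", "thenewfile"])
print(args.mystring)
print(args.infile)
print(args.outfile)
```

With this in place the hand-written '-h' test and the len(sys.argv) check are no longer needed.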



--
Steven


Re: [Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-01 Thread Dave Angel

On 02/01/2013 04:24 PM, Scurvy Scott wrote:

One last question on this topic..

I'd like to call the files and the string from the command line like

Python whatever.py STRINGTOSEARCH NEWFILE FILETOOPEN

My understanding is that it would be accomplished as such

import sys

myString = sys.argv[1]
filetoopen = sys.argv[2]
newfile = sys.argv[3]


Notice that you have 2 and 3 reversed from your description above. 
Neither one is wrong, but you have to pick one of them.


If it were my utility, I'd not specify any outfile on the commandline, 
but use redirection as is normal in Linux/Unix/Windows.  In other words, 
just use print as your output, and don't have an argument for the output 
file.
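
A rough sketch of that approach, assuming a hypothetical function named scan and an output format of "line number: line" (neither is from the thread):

```python
import sys


def scan(mystring, lines, out=sys.stdout):
    """Write each line containing mystring, with its 1-based number, to out."""
    for linenum, line in enumerate(lines, 1):
        if mystring in line:
            out.write("%d: %s\n" % (linenum, line.rstrip("\n")))


# A list of strings stands in for an open file here; a file object can be
# passed the same way, since iterating over it also yields lines.
scan("needle", ["hay\n", "a needle here\n", "more hay\n"])
```

The shell then decides where the output goes, for example: python scanfile.py STRINGTOSEARCH IMPORTFILENAME > OUTPUTFILENAME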






ETC ETC CODE HERE



If it were my code, I'd be first checking the number of arguments ( 
len(sys.argv) ).  If the number isn't exactly right, print a usage 
string and exit the program.


If your commandline gets any more complex, consider using the argparse 
module.


http://docs.python.org/2/library/argparse.html?highlight=argparse#argparse

--
DaveA


Re: [Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-01 Thread Scurvy Scott
One last question on this topic..

I'd like to call the files and the string from the command line like

Python whatever.py STRINGTOSEARCH NEWFILE FILETOOPEN

My understanding is that it would be accomplished as such

import sys

myString = sys.argv[1]
filetoopen = sys.argv[2]
newfile = sys.argv[3]


ETC ETC CODE HERE

Is this correct/pythonic? Is there a more recommended way? Am I retarded?

Thanks again in advance,
Scott


Re: [Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-01 Thread Scurvy Scott
>
> Why not just use grep ?
>
Honestly, this seemed like a good excuse to write some code and learn
some stuff. Secondly, it just didn't dawn on me.

Also, the code y'all both showed me was exactly what I was looking for
and I'm planning on using Nick's code as a template to improve upon.

As always, thank you guys a lot, it usually only takes a couple of
emails from y'all to get my brain working correctly.

Be safe,
Scott


Re: [Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-01 Thread Dave Angel

On 02/01/2013 03:09 PM, Scurvy Scott wrote:

Hey all how're things?

I'm hoping for some guidance on a problem I'm trying to work through.
I know this has been previously covered on this list but I'm hoping it
won't bother you guys to run through it again.

My basic program I'm attempting to create is like this..

I want to read from a large, very large file.
I want to find a certain string
If it finds the string, I would like to select the 15-20
characters preceding and following the string and then output that new
string to a new file along with the line the string was located on
within the file.


Why not just use grep ?



It seems fairly straight forward but I'm wondering if y'all can point
me to a direction that would help me accomplish this..

Firstly I know I can read a file and search for the string with (a
portion of this code was found on stackoverflow and is not mine and
some of it is my own)



First, you probably want to do something to quit when you get your first 
match.  If you do want to continue finding matches, then you'd have to 
change the location of that open() on the newfile.  Currently, it'll 
throw out any earlier contents, and just write the match.


The linenum is easy, using enumerate.


with open('largeFile', 'r') as inF:
    for line in inF:


    for linenum, line in enumerate(inF):



        myString = "The String"


This should be moved to a location before the loop;  it's a waste 
reassigning it every time through the loop.



        if 'myString' in line:
            f = open('thenewfile', 'w')
            f.write(myString)
            f.close()


            break #quit upon first match



I guess what I'm looking for then is tips on A)My stated goal of also
writing the 15-20 characters before and after myString to the new file
and
B)finding the line number and writing that to the file as well.

Any information you can give me or pointers would be awesome, thanks in advance.

I'm on Ubuntu 12.10 running LXDE and working with Python 2.7



About giving the 15 characters before and after the match:

Is it sufficient to truncate that spec at the line boundaries?  What I 
mean is that if the match occurs at column 10, do you really need the 
last 5 characters of the previous line?   Likewise, if it occurs near 
the end of the line, do you need some from the next line(s) ?



If you never need to show more than the current line, then you can parse 
the line (write a separate function).  If you have to go 15 characters 
earlier in the file, then consider using file.seek


http://docs.python.org/2/library/stdtypes.html?highlight=seek#file.seek

The catch to that is that it messes up the position in the file, so if 
you do want multiple matches, you'll need to use file.tell to save and 
restore the location to continue reading lines.
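
For what it's worth, the seek/tell idea could be sketched roughly as below. This is only one way to do it, under several assumptions: the function name contexts_across_lines is invented, the file is opened in binary mode so that offsets are plain byte counts, and only the first match on each line is reported.

```python
def contexts_across_lines(path, needle, width=15):
    """Yield (line_number, snippet) for the first match on each matching line.

    needle must be bytes, because the file is opened in binary mode so that
    seek() and tell() work with plain byte offsets. The snippet may include
    bytes from the neighbouring lines, newlines included.
    """
    with open(path, 'rb') as f:
        linenum = 0
        while True:
            line_start = f.tell()   # offset of the line about to be read
            line = f.readline()     # readline (not a for loop) keeps tell() reliable
            if not line:
                break
            linenum += 1
            col = line.find(needle)
            if col == -1:
                continue
            resume = f.tell()       # remember where to continue reading
            start = max(0, line_start + col - width)
            f.seek(start)
            snippet = f.read((line_start + col - start) + len(needle) + width)
            f.seek(resume)          # restore the position for the next readline
            yield linenum, snippet
```

The file is never read into memory as a whole, so this should still be usable on a very large file.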


Lots of other options, but it all depends on what you REALLY want.



--
DaveA


Re: [Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-01 Thread Nick W
Quite easy to do; just use enumerate, like so:
myString = "The String"
with open('largeFile', 'r') as inF, open('thenewfile', 'w') as f:
    for index, line in enumerate(inF):
        # myString is assigned once above, not here; otherwise it would be
        # reassigned for every single line of the large file (a nasty waste
        # of resources).
        if myString in line:
            f.write("Line #%d has string: %s" % (index, line))
That will write the whole line into the new file. If you only want the
characters around the match, use find and take a slice of the line instead.
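
A small sketch of the find-and-slice idea, assuming a helper called context (a made-up name) and a window that stops at the line boundaries:

```python
def context(line, mystring, width=15):
    """Return mystring plus up to `width` characters on either side of it.

    Returns None if mystring does not occur in line. Only the first
    occurrence is considered, and the result never crosses the line boundary.
    """
    pos = line.find(mystring)
    if pos == -1:
        return None
    start = max(0, pos - width)         # clamp so a negative index cannot wrap around
    end = pos + len(mystring) + width   # slicing past the end of a string is safe
    return line[start:end]


print(context("0123456789 The String abcdefghij", "The String", width=5))
# prints: 6789 The String abcd
```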
HTH
Nick




[Tutor] Read from large text file, parse, find string, print string + line number to second text file.

2013-02-01 Thread Scurvy Scott
Hey all how're things?

I'm hoping for some guidance on a problem I'm trying to work through.
I know this has been previously covered on this list but I'm hoping it
won't bother you guys to run through it again.

My basic program I'm attempting to create is like this..

I want to read from a large, very large file.
I want to find a certain string
If it finds the string, I would like to select the 15-20
characters preceding and following the string and then output that new
string to a new file along with the line the string was located on
within the file.

It seems fairly straight forward but I'm wondering if y'all can point
me to a direction that would help me accomplish this..

Firstly I know I can read a file and search for the string with (a
portion of this code was found on stackoverflow and is not mine and
some of it is my own)

with open('largeFile', 'r') as inF:
    for line in inF:
        myString = "The String"
        if 'myString' in line:
            f = open('thenewfile', 'w')
            f.write(myString)
            f.close()

I guess what I'm looking for then is tips on A)My stated goal of also
writing the 15-20 characters before and after myString to the new file
and
B)finding the line number and writing that to the file as well.

Any information you can give me or pointers would be awesome, thanks in advance.

I'm on Ubuntu 12.10 running LXDE and working with Python 2.7

Scott