[Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread Afonso Duarte
Dear All,

 

I'm new to Python and started to use it to search text strings in big
(500 MB) txt files. 

I have a list in a text file (e.g. A.txt) that I want to use as a key to
search another file (e.g. B.txt), organized in the following way:

 

A.txt:

 

Aaa

Bbb

Ccc

Ddd

.

.

.

 

B.txt

 

Bbb

1234

Xxx

234

 

 

I want to use A.txt to search in B.txt and have as output the original
search entry (e.g. Bbb) followed by the line that follows it in the B.txt
(e.g.  Bbb / 1234).

I wrote the following script:

 

 

object = open('B.txt', 'r')

lista = open('A.txt', 'r')

searches = lista.readlines()

for line in object.readlines():

 for word in searches:

  if word in line: 

   print line+'\n'

 

 

 

But with this I only get the search entry and not the line after it. I
tried to google it, but I got lost and didn't manage to do it.

Any ideas? I guess that this is basic scripting, but I just started.

 

Best 

 

Afonso

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread BRAGA, Bruno
On Thursday, May 10, 2012, Afonso Duarte adua...@itqb.unl.pt wrote:
 SNIP

Not sure I understood the question... But:
- are you trying to grep the text file? (simpler than programming in
Python, IMO)
- if you have multiple matches of any of the keys from the A file in a single
line of the B file, the script above will print it multiple times
- you need not add a new line (\n) in the print statement, unless you want it
to print a blank line between results

Based on the example you gave, the matching Bbb value in B and A are the
same, so the line is actually being printed, but it is just the same as word...
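
The last two bullet points can be illustrated with a minimal sketch. It is written in Python 3 syntax (the script in this thread is Python 2), and the helper name matching_lines and the sample data are made up for the example. It strips the trailing newline from each line and reports a line once even if several keys occur in it.

```python
def matching_lines(lines, keys):
    """Return each line (without its newline) that contains at least one key."""
    found = []
    for line in lines:
        text = line.rstrip("\n")
        # any() stops at the first key that matches, so a line holding
        # several keys is still reported only once.
        if any(key in text for key in keys):
            found.append(text)
    return found


if __name__ == "__main__":
    sample_b = ["Bbb Aaa\n", "1234\n", "Xxx\n", "234\n"]  # hypothetical data
    sample_keys = ["Aaa", "Bbb"]
    for text in matching_lines(sample_b, sample_keys):
        print(text)  # print() adds exactly one newline of its own
```

Because the newline is stripped before printing, no blank line appears between results.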






-- 
Sent from Gmail Mobile


Re: [Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread aduarte

On 2012-05-09 15:22, BRAGA, Bruno wrote:

On Thursday, May 10, 2012, Afonso Duarte adua...@itqb.unl.pt wrote:

SNIP


Not sure I understood the question... But:
- are you trying to grep the text file? (simpler than programming
in python, IMO)

- if you have multiple matches of any of the keys from A file in a
single line of B file, the script above will print it multiple times

True. I did not mention it, but each entry in A.txt appears only once
in B.txt.

- you need not add new line (\n) in the print statement, unless you
want it to print a blank line between results

True.

Based on the example you gave, the matching Bbb value in B and A are
the same, so actually line is being printed, but it is just the same
as word...

Exactly! But what I want is that plus the line that follows it
in B.txt, i.e.

Bbb
1234

Best

Afonso


Re: [Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread Dave Angel
On 05/09/2012 10:00 AM, Afonso Duarte wrote:
 SNIP

Please post your messages as plain-text.   The double-spacing I get is
very annoying.

There's a lot you don't say, which is implied in your code.

Are the lines in file B.txt really alternating:
 
key1
data for key1
key2
data for key2
...

Are the key lines in file B.txt exact matches, or do they just
contain the key somewhere in the line?   Your code assumes the latter,
but the whole thing could be much simpler if it were always an exact match.

Are the keys in A.txt unique?  If so, you could store them in a set, and
make lookup basically instantaneous.
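
As a rough illustration of the set idea, here is a sketch in Python 3 syntax. It only applies if the key lines in B.txt are exact matches; load_keys and is_key_line are hypothetical helper names, and the sample lines stand in for the contents of A.txt.

```python
def load_keys(lines):
    """Build a set of keys from the lines of A.txt, dropping newlines and blanks."""
    return {line.strip() for line in lines if line.strip()}


def is_key_line(line, keys):
    """Exact-match test: the whole line (minus surrounding whitespace) must be a key."""
    return line.strip() in keys


if __name__ == "__main__":
    # A duplicate key collapses into a single set entry.
    keys = load_keys(["Aaa\n", "Bbb\n", "Ccc\n", "Ddd\n", "Bbb\n"])
    print(sorted(keys))
    print(is_key_line("Bbb\n", keys))
    print(is_key_line("1234\n", keys))
```

A set membership test does not scan the keys one by one, which is why the lookup stays fast however many keys A.txt holds.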

I think the real question you had was how to access the line following
the key, once you matched the key.

Something like this should do it (untested)

lines = iter( object )           # object is the open B.txt file
for key in lines:
    linedata = lines.next()      # the line that follows the key line
    if key in mydictionary:
        print key, "--", linedata



Main caveat I can see is the file had better have an even number of lines.
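
For readers on Python 3, the same pairing idea might look like the sketch below. It assumes exact-match keys held in a set; key_data_pairs is a made-up name and the sample lines are hypothetical. The built-in next() replaces the Python 2 lines.next() call.

```python
def key_data_pairs(b_lines, keys):
    """Return (key, data) for every key line whose stripped text is in keys."""
    results = []
    lines = iter(b_lines)
    for key_line in lines:
        # next() consumes the following line from the same iterator, so the
        # for loop itself only ever sees the key lines.
        data_line = next(lines)
        key = key_line.strip()
        if key in keys:
            results.append((key, data_line.strip()))
    return results


if __name__ == "__main__":
    sample_b = ["Bbb\n", "1234\n", "Xxx\n", "234\n"]
    print(key_data_pairs(sample_b, {"Aaa", "Bbb"}))
```

With an odd number of lines the final next() call has nothing left to return and raises StopIteration, which is the caveat mentioned above.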

-- 

DaveA


Re: [Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread Afonso Duarte


-Original Message-
From: Dave Angel [mailto:d...@davea.name] 
Sent: woensdag 9 mei 2012 15:52
To: Afonso Duarte
Cc: tutor@python.org
Subject: Re: [Tutor] Script to search in string of values from file A in
file B

On 05/09/2012 10:00 AM, Afonso Duarte wrote:
 SNIP

Please post your messages as plain-text.   The double-spacing I get is
very annoying.

Sorry about that, my Outlook messes it up.

There's a lot you don't say, which is implied in your code.
Are the lines in file B.txt really alternating:
 
key1
data for key1
key2
data for key2
...

Sure, that's why I described them like that in the email and didn't say that
they weren't.

Are the key lines in file B.txt exact matches, or do they just
contain the key somewhere in the line? 
Your code assumes the latter,
but the whole thing could be much simpler if it were always an exact match.

The entry in B has text before and after it (the size of that text changes from
entry to entry).


Are the keys in A.txt unique?  If so, you could store them in a set, and
make lookup basically instantaneous.

I indeed didn't mention that: the entries from A are unique in B.


I think the real question you had was how to access the line following the
key, once you matched the key.

True, that is my real question (the code above works just for the title
line; I basically want to print the next line of B.txt for each entry).

Something like this should do it (untested)

lines = iter( object )
for key in lines:
    linedata = lines.next()
    if key in mydictionary:
        print key, "--", linedata


Main caveat I can see is the file had better have an even number of lines.


That changes from file to file, and it's unlikely that all of them have an
even number.

Thanks


Afonso




Re: [Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread Dave Angel
On 05/09/2012 11:04 AM, Afonso Duarte wrote:
 
 
 -Original Message-
 From: Dave Angel [mailto:d...@davea.name] 
 SNIP

 Please post your messages as plain-text.   The double-spacing I get is
 very annoying.
 
 Sorry for that my outlook mess-it-up

I'm sure there's a setting to say "use plain-text".  In Thunderbird, I
tell it that any message to forums is to be plain-text.

 
 There's a lot you don't say, which is implied in your code.
 Are the lines in file B.txt really alternating:

 key1
 data for key1
 key2
 data for key2
 ...
 
 Sure, that's why I describe them in the email like that and didn't say that
 they weren't
 
 Are the key lines in file B.txt exact messages, or do they just
 contain the key somewhere in the line? 
  Your code assumes the latter,
 but the whole thing could be much simpler if it were always an exact match.
 
 The entry in B has text before and after (the size of that text changes from
 entry to entry.

In other words, the line pairs are not like your sample, but more like:

trash  key1 more trash
Useful associated data for the previous key
trash2 key2 more trash
Useful associated data for the previous key


 
 
 Are the keys in A.txt unique?  If so, you could store them in a set, and
 make lookup basically instantaneous.
 
 That indeed I didn't refer, the entries from A are unique in B

Not what I asked.  Are the keys in A.txt ever present more than once in
A.txt ?  But then again, if the key line can contain garbage before
and/or after the key, then the set idea is moot anyway.

 
 
 I think the real question you had was how to access the line following the
 key, once you matched the key.
 
 True that is my real question (as the code above works just for the title
 line, I basically want to print the next line of the B.txt for each entry)
 
 Something like this should do it (untested)

 lines = iter( object )
 for key in lines:
     linedata = lines.next()
     if key in mydictionary:
         print key, "--", linedata
 
 
 Main caveat I can see is the file had better have an even number of lines.
 
 
 That changes from file to file, and its unlikely i have all even number.

In that case, what do you use for data of the last key?


If you really have to handle the case where there is a final key with no
data, then you'll have to detect that case, and make up the data
separately.  That could be done with a try block, but this is probably
clearer:

rawlines = object.readlines()
if len(rawlines) % 2 != 0:
    rawlines += ""   #add an extra line
lines = iter(rawlines)

for keyline in lines:
    linedata = lines.next()
    for word in searches:
        if word in keyline:
            print word, "--", linedata
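
A possible Python 3 variant of the same loop is sketched below. Instead of padding the list, it passes a default value to next(), which is one way to cope with a final key line that has no data line. The function name find_following_lines and the sample data are illustrative only. The sketch also assumes the search words have had their trailing newlines stripped: a word read with readlines() still ends in "\n", so it would only match at the very end of a key line.

```python
def find_following_lines(b_lines, searches):
    """Return (word, next_line) for every key line that contains a search word."""
    results = []
    lines = iter(b_lines)
    for keyline in lines:
        # The second argument to next() is returned when the iterator is
        # exhausted, i.e. when the last key line has no data line after it.
        linedata = next(lines, "")
        for word in searches:
            if word in keyline:
                results.append((word, linedata.rstrip("\n")))
    return results


if __name__ == "__main__":
    sample_b = ["trash Bbb more trash\n", "1234\n", "trash2 Ccc more trash\n"]
    searches = [word.strip() for word in ["Bbb\n", "Ccc\n"]]
    for word, data in find_following_lines(sample_b, searches):
        print(word, "--", data)
```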


 
 Thanks
 
 
 Afonso
 
 


-- 

DaveA


Re: [Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread Dave Angel
SNIP


 If you really have to handle the case where there is a final key with no
 data, then you'll have to detect that case, and make up the data
 separately.  That could be done with a try block, but this is probably
 clearer:

 rawlines = object.readlines()
 if len(rawlines) % 2 != 0:
     rawlines += ""   #add an extra line

Oops, that should have been
   rawlines.append("")  or maybe  rawlines.append("\n")

 lines = iter(rawlines)

 for keyline in lines:
     linedata = lines.next()
     for word in searches:
         if word in keyline:
             print word, "--", linedata




-- 

DaveA



Re: [Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread Mark Lawrence

On 09/05/2012 15:00, Afonso Duarte wrote:


object = open('B.txt', 'r')



You've already received some sound advice, so I'd just like to point out 
that the name object in your script will shadow the built-in object; apologies if 
somebody has already said this and I've missed it.
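
A small sketch of what the shadowing looks like, in Python 3 syntax. The function names are invented for the example, and io.StringIO stands in for the real open('B.txt', 'r') call so the snippet runs without any files.

```python
import builtins
import io


def shadow_demo():
    """Bind a file-like value to the name `object`, hiding the built-in."""
    object = io.StringIO("Bbb\n1234\n")  # stands in for open('B.txt', 'r')
    # Inside this function, `object` no longer refers to the built-in class.
    return object is builtins.object


def descriptive_name_demo():
    """Use a descriptive name instead, leaving the built-in untouched."""
    b_file = io.StringIO("Bbb\n1234\n")
    b_file.close()
    return object is builtins.object


if __name__ == "__main__":
    print(shadow_demo())            # False
    print(descriptive_name_demo())  # True
```

A name such as b_file also says more about what the variable holds.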




--
Cheers.

Mark Lawrence.



[Tutor] odd behavior when renaming a file

2012-05-09 Thread Joel Goldstick
import os
def pre_process():
    if os.path.isfile('revelex.csv'):
        os.rename('revelex.csv', 'revelex.tmp')
        print "Renamed ok"
    else:
        print "Exiting, no revelex.csv file available"
        exit()
    out_file = open('revelex.csv', 'w')
    # etc.

if __name__ == '__main__':
    pre_process()


When I run the code above it works fine if run from the file.  But
when I import it and run it from another file it renames the file but
then prints "Exiting, no revelex.csv file available".



-- 
Joel Goldstick


Re: [Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread aduarte


Dear All,

Sorry it seems that I got the wrong mailing list to subscribe ...


I got the idea that this list was open to newbies ... by the answers I 
got I see that I was wrong





In that case, what do you use for data of the last key?


If you really have to handle the case where there is a final key with no
data, then you'll have to detect that case, and make up the data
separately.  That could be done with a try block, but this is probably
clearer:

rawlines = object.readlines()
if len(rawlines) % 2 != 0:
    rawlines += ""   #add an extra line
lines = iter(rawlines)

for keyline in lines:
    linedata = lines.next()
    for word in searches:
        if word in keyline:
            print word, "--", linedata



After chatting on other mailing lists about other languages I realized 
that this mailing list is not in my league for Python ...
Interestingly, I did get one strange piece of advice from this list: try awk ... 
or Perl for the job, as it is kind of tricky in Python to print the line after 
the one you selected (yes, that was my question, and I still don't understand 
how people can advise me to insert new lines in 500 MB files and so on to do 
it...)


Once again, sorry for taking your time.

Cheers

Afonso




On 2012-05-09 16:16, Dave Angel wrote:

SNIP



Re: [Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread Joel Goldstick
On Wed, May 9, 2012 at 10:00 AM, Afonso Duarte adua...@itqb.unl.pt wrote:
 SNIP

 I wrote the following script:





 object = open('B.txt', 'r')

 lista = open('A.txt', 'r')

 searches = lista.readlines()

 for line in object.readlines():
  for word in searches:
   if word in line:
    print line+'\n'




Don't give up on this group so quickly.  You will get lots of help here.

As to your problem:  Do you know about enumerate?  Learn about it
here: http://docs.python.org/library/functions.html#enumerate

if you change your code above to:
   for index, word in enumerate line:
   print line, word[index+1]

I think you will get what you are looking for






-- 
Joel Goldstick


Re: [Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread Joel Goldstick
On Wed, May 9, 2012 at 3:40 PM, Joel Goldstick joel.goldst...@gmail.com wrote:
 SNIP

My mistake:  I meant this:

my_lines = object.readlines()  # note: not a good thing to name something
                               # object. It's a built-in class
for index, line in enumerate(my_lines):
    for word in searches:
        if word in line:
            print line
            print my_lines[index+1]

Sorry for the crazy earlier post
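
For reference, a Python 3 sketch of the same enumerate approach follows. The name entries_with_next_line and the sample data are made up for illustration. It adds one guard that the snippet above does not have: if the match is on the very last line, there is no index + 1 to read, so an empty string is used instead.

```python
def entries_with_next_line(b_lines, searches):
    """Return (matching_line, following_line) pairs found via enumerate()."""
    results = []
    for index, line in enumerate(b_lines):
        for word in searches:
            if word in line:
                # Guard against a match on the last line, where
                # index + 1 would fall outside the list.
                if index + 1 < len(b_lines):
                    following = b_lines[index + 1].rstrip("\n")
                else:
                    following = ""
                results.append((line.rstrip("\n"), following))
    return results


if __name__ == "__main__":
    sample_b = ["a Bbb b\n", "1234\n", "Xxx\n", "234\n"]
    for key_line, data_line in entries_with_next_line(sample_b, ["Aaa", "Bbb"]):
        print(key_line)
        print(data_line)
```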


-- 
Joel Goldstick


Re: [Tutor] odd behavior when renaming a file

2012-05-09 Thread Walter Prins
Hi,

On 9 May 2012 20:26, Joel Goldstick joel.goldst...@gmail.com wrote:

 import os
 def pre_process():
     if os.path.isfile('revelex.csv'):
         os.rename('revelex.csv', 'revelex.tmp')
         print "Renamed ok"
     else:
         print "Exiting, no revelex.csv file available"
         exit()
     out_file = open('revelex.csv', 'w')
     # etc.

 if __name__ == '__main__':
     pre_process()


 When I run the code above it works file if run from the file.  But
 when I import it and run it from another file it renames the file but
 then prints "Exiting, no revelex.csv file available"



Can you post where/how you call this from another file? Anyway, it sounds
like the pre_process() routine is being called twice, somehow.  On the
first call the file is renamed.  Then on the second call, of course the
file is not there anymore (as it's been renamed) and thus it prints the
"Exiting" message.

Best,

Walter


Re: [Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread Walter Prins
Hi Afonso,

I see you've already had some responses -- I've not read them all, and am just
posting the following suggestion you might want to look at:

# read lines with keys into a list
selected_keys = open('A.txt', 'r').readlines()
# read all data records into another list
records = open('B.txt', 'r').readlines()

# Now use a list comprehension to return the required entries: the i+1th
# entries for all i indexes in the records list that correspond to a key
# in the keys list:
selected_values = [(records[i], records[i+1])
                   for i, row in enumerate(records) if row in selected_keys]

# The above returns both the key and the value, in a tuple. If you just
# want the value rows only then the above becomes:
#selected_values = [records[i+1] for i, row in enumerate(records)
#                   if row in selected_keys]

# Finally print the result.
print selected_values

You'll note I read both files into memory, even though you say your files
are largish.  I don't consider 500MB to be very large in this day and age
of 4+GB PCs, which is why I've basically ignored the "large" issue.  If
this is not true in your case then you'll have to post back.

Good luck,

Walter


Re: [Tutor] odd behavior when renaming a file

2012-05-09 Thread Peter Otten
Joel Goldstick wrote:

 import os
 def pre_process():
     if os.path.isfile('revelex.csv'):
         os.rename('revelex.csv', 'revelex.tmp')
         print "Renamed ok"
     else:
         print "Exiting, no revelex.csv file available"
         exit()
     out_file = open('revelex.csv', 'w')
     # etc.
 
 if __name__ == '__main__':
     pre_process()
 
 
 When I run the code above it works file if run from the file.  But
 when I import it and run it from another file it renames the file but
 then prints "Exiting, no revelex.csv file available"

Add 

print os.getcwd() 

to your code, you are probably in the wrong directory.
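
A slightly fuller version of that check, sketched in Python 3 syntax; describe_location is a made-up helper name. It reports the current working directory, the absolute path that a relative file name resolves to, and whether a file exists there.

```python
import os


def describe_location(filename="revelex.csv"):
    """Report where a relative file name would be looked up right now."""
    cwd = os.getcwd()
    # Relative names are resolved against the current working directory,
    # not against the directory that holds the module being imported.
    full_path = os.path.abspath(filename)
    return cwd, full_path, os.path.isfile(full_path)


if __name__ == "__main__":
    cwd, full_path, exists = describe_location()
    print("current directory:", cwd)
    print("looking for:", full_path)
    print("exists:", exists)
```

Calling this both from the module itself and from the importing script shows whether the two runs start in different directories.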



Re: [Tutor] Script to search in string of values from file A in file B

2012-05-09 Thread Alan Gauld

On 09/05/12 20:28, aduarte wrote:


Sorry it seems that I got the wrong mailing list to subscribe ...
I got the idea that this list was open to newbies ... by the answers I
got I see that I was wrong


I'm not sure what you mean. The answers you got seem to have provided 
the answers to your questions. What more were you expecting?



after chatting in other mailing lists about other languages I realized
that this mailing list is not in my league for python ...


Which league is that? You said you were a beginner so you got answers 
appropriate to a beginner. If you said you were an experienced data 
processing professional looking for a smart/efficient way to process 
large files using Python you would likely have gotten different answers.
If the answers were too advanced then by all means ask for 
clarification. We can only guess your level based on what you post.



Interestingly I did got a strange advice from this list: try awk ... of
Perl for the job, as Python is kind of tricky to print the next line


I didn't see that suggestion and I disagree with it.
Python is just as capable of processing files as awk or Perl as I hope 
the other answers have demonstrated. But where another tool is more 
appropriate there is no harm in suggesting it. Just because this is a 
Python list doesn't mean the answer needs to be Python.



that you selected (yes that was my question and I still don't understand
how ppl advise me to insert new lines in 500Mb files and so on to do it...)


Again I'm not sure that anyone is actually suggesting you insert new 
lines into your file. It's certainly not the general advice being given.


But this is a list for beginners and the people giving the advice
range from complete novices themselves to working pros. The answers 
reflect that diversity.


In your case the majority of the answers have come from experienced
programmers giving you sound advice and probing your requirements to
ensure that all your use cases are covered. The only slightly
radical suggestion I can see is to read the files into memory - and on a 
modern PC that's not too radical for a 500M file even though I'd 
probably not do it myself...
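
If reading the whole file into memory is a concern, one alternative is to walk the file one line at a time. The sketch below is in Python 3 syntax; stream_matches is a hypothetical name, the search words are assumed to be stripped of newlines already, and io.StringIO stands in for the real B.txt so the example runs on its own.

```python
import io


def stream_matches(b_file, searches):
    """Yield (key_line, next_line) while reading b_file one line at a time."""
    for line in b_file:  # file objects hand out lines lazily
        if any(word in line for word in searches):
            # Take the following line from the same iterator; the default ""
            # covers a match on the last line of the file.
            following = next(b_file, "")
            yield line.rstrip("\n"), following.rstrip("\n")


if __name__ == "__main__":
    fake_b = io.StringIO("trash Bbb more trash\n1234\nXxx\n234\n")
    for key_line, data_line in stream_matches(fake_b, ["Aaa", "Bbb"]):
        print(key_line, "/", data_line)
```

With a real file, the same function could be given the object returned by open('B.txt') inside a with block; only the current line and the one after it are held in memory at any time.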


--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
