Re: [Tutor] Extracting columns from many files to different files
They are separated by tabs.

> Date: Mon, 16 Jul 2012 12:16:22 -0400
> Subject: Re: [Tutor] Extracting columns from many files to different files
> From: joel.goldst...@gmail.com
> To: taser...@gmail.com
> CC: susana...@hotmail.com; tutor@python.org
>
> On Mon, Jul 16, 2012 at 12:11 PM, taserian wrote:
> > On Mon, Jul 16, 2012 at 10:58 AM, susana moreno colomer wrote:
> >>
> >> Hi!
> >> I have a folder, with the following text files with columns:
> >>
> >> bb_1
> >> bb_2
> >> ww_1
> >> ww_2
> >> ff_1
> >> ff_2
> >>
> >> What I want to do is:
> >>
> >> Extract columns 5, 6, 8 from files bb_
> >> Extract columns 3, 4 from files ww_
> >> Get 5 files, corresponding to different columns:
> >> Files (excel files): 'ro' with column number 5, 'bf' with column number
> >> 6, 'sm' with column 8, 'se' with column number 3 and 'dse' with column
> >> number 4
> >
> > How are these columns separated? Blank spaces, tabs, commas?
> >
> > I'm mostly worried about:
> >
> >     for b in line:
> >         A.append(b[5].strip())
> >         B.append(b[6].strip())
> >         C.append(b[8].strip())
> >
> > For the A List, this will take the 5th character from the line, not the 5th
> > column. You may need to split the line based on the separators.
> >
> > AR
>
> To make your code show correctly you must set your email program to
> write text and not html. You should also set your text editor to turn
> tabs into 4 spaces. Tabs work in python, but you can't mix tabs and
> spaces, so it is less of a problem if you only use spaces for
> indenting
>
> --
> Joel Goldstick

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
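The difference taserian describes, indexing the raw line versus indexing the split fields, can be sketched as follows. This is a minimal illustration written for Python 3 (the thread itself uses Python 2), and the sample line is made up for the example rather than taken from the poster's files. Note that indices are zero-based, so `fields[5]` is the sixth field.

```python
# Hypothetical sample line with nine tab-separated fields (not from the thread).
line = "f0\tf1\tf2\tf3\tf4\tf5\tf6\tf7\tf8\n"

# Indexing the string itself returns a single character of the raw text.
char_at_5 = line[5]

# Splitting on the separator first gives a list of fields (columns).
fields = line.rstrip("\n").split("\t")
column_5 = fields[5]

# Looping over a string visits characters, not columns.
first_items = [item for item in line][:3]

print(repr(char_at_5), column_5, first_items)
```

Since the files are tab-separated, splitting on "\t" (or using csv.reader with delimiter="\t") is the step that turns each line into columns.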
[Tutor] Extracting columns from many files to different files
Hi!
I have a folder, with the following text files with columns:

bb_1
bb_2
ww_1
ww_2
ff_1
ff_2

What I want to do is:

Extract columns 5, 6, 8 from files bb_
Extract columns 3, 4 from files ww_
Get 5 files, corresponding to different columns:
Files (excel files): 'ro' with column number 5, 'bf' with column number 6, 'sm' with column 8, 'se' with column number 3 and 'dse' with column number 4

I was following the example from:
http://mail.python.org/pipermail/tutor/2009-February/067391.html

    import os
    import fnmatch
    import csv

    path = '//(my path)/'
    files=os.listdir(path)
    csv_out=csv.writer(open('out1.csv', 'wb'),delimiter=' ')

    ro=[]; bf=[]; sm=[]; se=[]; dse=[]
    listofcolumns=[]

    def column(tofile):
        for infile in files:
            filename= os.path.join(path,infile)
            f=open(filename, 'r')
            for line in f.readlines():
                b=line.split('\t')
                if fnmatch.fnmatch(filename, 'bb_*'):
                    A=[]; B=[]; C=[];
                    for b in line:
                        A.append(b[5].strip())
                        B.append(b[6].strip())
                        C.append(b[8].strip())
                    ro.append(A)
                    bf.append(B)
                    sm.append(C)
                elif fnmatch.fnmatch(filename, 'ww_*'):
                    D=[]; E=[]
                    for b in line:
                        D.append(b[3])
                        E.append(b[4])
                    se.append(D)
                    dse.append(E)
            f.close()

    #list of pairs of (value list, name)
    listofcolumns=[(ro, 'ro'),(bf, 'bf'),(sm, 'sm'),(se, 'se'),(dse,'dse')]
    for values, name in listofcolumns:
        output=open(name + '.csv', 'w')
        for value in values:
            csv_out.writerows(rows)
            csv_out.writerows('\n')
    column(listofcolumns)

It doesn't give me any error, but the output documents are blank.

Another question: is there any way to write the code properly? I am sorry about the spaces, I can send a text file.

Many thanks! :D
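One way the task described above could be approached is sketched below. It is written for Python 3, whereas the thread uses Python 2, and the names `SPEC`, `read_column` and `extract_columns` are hypothetical helpers introduced only for this illustration. It assumes tab-separated input and keeps the zero-based indices used in the posted code (`b[5]`, `b[6]`, `b[8]`, `b[3]`, `b[4]`). Each output file gets one column per matching input file.

```python
import csv
import fnmatch
import os
from itertools import zip_longest

# (file-name pattern, zero-based column index, output name): an assumed
# reading of the request; adjust the indices if the columns are 1-based.
SPEC = [
    ("bb_*", 5, "ro"),
    ("bb_*", 6, "bf"),
    ("bb_*", 8, "sm"),
    ("ww_*", 3, "se"),
    ("ww_*", 4, "dse"),
]


def read_column(filename, index):
    """Return one tab-separated column of a text file as a list of strings."""
    values = []
    with open(filename, newline="") as f:
        for fields in csv.reader(f, delimiter="\t"):
            if len(fields) > index:  # skip blank or short lines
                values.append(fields[index].strip())
    return values


def extract_columns(in_dir, out_dir):
    """Write one CSV per SPEC entry, one column per matching input file."""
    names = sorted(os.listdir(in_dir))
    for pattern, index, out_name in SPEC:
        # Match the bare file name; a pattern like 'bb_*' does not match
        # a full path such as '/some/dir/bb_1'.
        columns = [
            read_column(os.path.join(in_dir, name), index)
            for name in names
            if fnmatch.fnmatch(name, pattern)
        ]
        # Turn the list of columns into rows, padding short columns.
        rows = zip_longest(*columns, fillvalue="")
        out_path = os.path.join(out_dir, out_name + ".csv")
        with open(out_path, "w", newline="") as out:
            csv.writer(out, dialect="excel").writerows(rows)
```

A call such as `extract_columns("my_working_folder", ".")` would then produce ro.csv, bf.csv, sm.csv, se.csv and dse.csv. In the posted code the pattern is tested against the joined path rather than the file name, which may be one reason the lists stay empty.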
Re: [Tutor] get columns from txt file
Hi!
I am sorry, but still I don't get it!
I am trying with this (the code is attached):

    csv_out=csv.writer(open('out1.csv', 'wb'), delimiter=' ', quoting=csv.QUOTE_ALL, dialect='excel')

and

    csv_out=csv.writer(open('out1.csv', 'wb'), dialect='excel-tab')

When I open out1 with a text editor, I get 6 columns, but when I open it with excel I get one column (6 numbers in each cell, how can I separate them???)

The files from which I get the columns are txt files.

Many thanks!!

Date: Fri, 13 Jul 2012 00:28:07 -0700
Subject: Re: [Tutor] get columns from txt file
From: marc.tompk...@gmail.com
To: susana...@hotmail.com
CC: dfjenni...@gmail.com; tutor@python.org

On Thu, Jul 12, 2012 at 11:36 PM, susana moreno colomer wrote:

Hi!
I am trying this, but still I get 6 numbers per cell. The only difference is that I get a comma between numbers instead of a space.
I am opening the document also with excel
Many thanks,
Susana

CSV stands for Comma Separated Values. Those commas are the separators between cells. Your current problem is that a CSV file is just a regular text file (that happens to have a lot of commas in it), and Excel is trying to read it as a normal text file.

CSV is about the simplest way ever invented to store tabular data in a file, but there are some complicating factors. The most obvious is: what happens if the data you want to store in your file actually contains commas (e.g. address fields with "city, state zip", or numbers with thousands separators, etc.) One way around the problem is to put quotes around fields, and then separate the fields with commas (but then, what if your data contains quotes?); another is to separate the fields with tab characters instead of commas (technically this isn't really a CSV file anymore, but the acronym TSV never caught on.)
Excel's native flavor* of CSV is the oldest, simplest, and stupidest of all - just commas between fields, and newlines between records:

    1, 2, 3, 4, 5
    a, b, c, d, e

Quotes-and-commas style:

    "1", "2", "3", "4,000,000", 5
    "a", "b", "c", "Dammit, Janet", "e"

Tab-separated (well, you'll just have to imagine; I don't feel like reconfiguring my text editor):

    1 2 3 4 5
    a b cde fghij

and a bunch of others I can't think of right now.

* Note: Excel will happily import a quotes-and-commas CSV file and display it normally - but if you export it to CSV, it will revert to the dumb bare-commas format.

From the Python csv module docs:

    To make it easier to specify the format of input and output records, specific formatting parameters are grouped together into dialects. A dialect is a subclass of the Dialect class having a set of specific methods and a single validate() method.

So you can specify which dialect you want to read or write, and/or you can specify which delimiter(s) you want to use.

Hope that helps...
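The three flavors described above can be produced with the csv module's built-in dialects and quoting options. The sketch below is written for Python 3 and writes to an in-memory buffer so the exact text is visible; the `render` helper is a hypothetical name used only here, and the sample rows reuse the values from the message above.

```python
import csv
import io

rows = [
    ["1", "2", "3", "4,000,000", "5"],
    ["a", "b", "c", "Dammit, Janet", "e"],
]


def render(**writer_options):
    """Write the sample rows with the given csv.writer options; return the text."""
    buf = io.StringIO()
    csv.writer(buf, **writer_options).writerows(rows)
    return buf.getvalue()


# Commas between fields; only fields that contain a comma get quoted.
excel = render(dialect="excel")

# Tabs between fields; commas inside a field need no quoting.
tabbed = render(dialect="excel-tab")

# Quotes around every field, commas between fields.
quoted = render(dialect="excel", quoting=csv.QUOTE_ALL)

print(excel)
print(tabbed)
print(quoted)
```

Both built-in dialects end each record with "\r\n". Which one a spreadsheet program splits into separate cells on opening depends on how the file is imported, so it can be worth trying both.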
Re: [Tutor] extracting a column from many files
From: susana...@hotmail.com
To: tutor@python.org
Date: Thu, 12 Jul 2012 17:45:31 +0200
Subject: [Tutor] extracting a column from many files

Hi!
I want to extract columns number 5 and 6 from a certain kind of file. I want to extract column number 5 to one file, and column number 6 to another file.

I was following this example http://mail.python.org/pipermail/tutor/2009-February/067400.html , and as far as I understood it I created my own code, which gives me the following errors:

This code gives me 2 files:
ro with 2 lines
bf empty

I attached my code, with two options (though I have tried many more).
It would be awesome if I could extract them to an excel file.
Any suggestion?
Many thanks!!!
Susana

    #! /usr/bin/env python
    #-*- coding: utf -8 -*-
    import os
    import fnmatch
    import csv

    path = '//../my_working_folder/'
    files=os.listdir(path)
    ro=[]; bf=[]
    A=[]; B=[]
    for infile in files:
        if fnmatch.fnmatch(infile, 'bb_*'):
            filename= os.path.join(path,infile)
            f=open(filename, 'r')

            ##OPTION ONE
            for line in f.readlines():
                b=line.split('\t')
                for b in line:
                    A.extend('\t'.join.b[5])
                    B.extend('\t'.join.b[6])
                ro.extend(A)
                bf.extend(B)
            f.close()

    lists=[(ro, 'ro'), (bf, 'bf')]
    for values, name in lists:
        output=open( name + '.txt', 'w')
        for value in values:
            output.write(str(values))
            output.write('\n')
        output.close

            ##OPTION TWO
            for line in f.readlines():
                b=line.split('\t')
                A.extend(b[5])
                B.extend(b[6])
                ro.extend(A)
                bf.extend(B)

    #the same: I get Syntaxerror:invalid syntax
    lists=[(ro, 'ro'), (bf, 'bf')]
    for values, name in lists:
        output=open( name + '.txt', 'w')
        for value in values:
            output.write(str(values))
            output.write('\n')
        output.close()
Re: [Tutor] get columns from txt file
Hi!
I am trying this, but still I get 6 numbers per cell. The only difference is that I get a comma between numbers instead of a space.
I am opening the document also with excel
Many thanks,
Susana

Subject: Re: [Tutor] get columns from txt file
From: dfjenni...@gmail.com
Date: Thu, 12 Jul 2012 12:14:38 -0400
CC: tutor@python.org
To: susana...@hotmail.com

On Jul 12, 2012, at 12:06 PM, susana moreno colomer wrote:

Hi!
This code is working fine!
The only little thing is that the 6 columns appear together in one column, which means that in each cell I get 6 numbers. How can I get it in 6 excel columns?

Programming is hard, so don't feel bad that I'm having to tell you again to specify the dialect as "excel" or "excel-tab":

    csv_out=csv.writer(open('out14.csv', 'wb'), dialect='excel')

Take care,
Don
Re: [Tutor] get columns from txt file
Hi!
This code is working fine!
The only little thing is that the 6 columns appear together in one column, which means that in each cell I get 6 numbers. How can I get it in 6 excel columns?
Many thanks,
Susana

Subject: Re: [Tutor] get columns from txt file
From: dfjenni...@gmail.com
Date: Thu, 12 Jul 2012 11:43:02 -0400
CC: tutor@python.org
To: susana...@hotmail.com

On Jul 12, 2012, at 11:10 AM, susana moreno colomer wrote:

Hi!
It is attached on the email, called myprogram.txt

and, here are the contents of that file:

    #! /usr/bin/env python
    import os
    import fnmatch
    import csv

    path = '//../my_working_folder/'
    csv_out=csv.writer(open('out14.csv', 'wb'), delimiter=' ')
    files=os.listdir(path)
    outfile=open('myfile1', 'w')

Here you've opened a file called "outfile" which you never use! So, of course it's blank. Just remove this line.

    output=[]
    for infile in files:
        columnData = []
        if fnmatch.fnmatch(infile, 'bb_*'):
            filename= os.path.join(path,infile)
            f=open(filename, 'r')
            for line in f:
                b=line.split('\t')
                # remove the next line
                output.append(b[5].strip())
                # instead, append to columnData list
                columnData.append(b[5].strip())
            f.close()
            # now append all of that data as a list to the output
            output.append(columnData)

    # now, as Kent said, create "a list of row lists from the list of column lists
    # if any of the column lists are too short they will be padded with None"
    rows = map(None, *output)
    # now, write those rows to your file
    csv_out.writerows(rows)

This gives me a single column (I want 6, since I have 6 bb_files):

    csv_out.writerows(output)

One of the things you missed in Kent's code is that the output is a list of lists. So, for each file you need a list which you then append to the output list. I've inserted the appropriate code above and you should have better luck if you copy and paste carefully.

with this I get a blank excel sheet:

    def csvwriter():
        excelfile=csv_out
        a=excelfile.append(output)
        c=excelfile.write(a)
        return c

Right!
As I explained in an earlier email, you never call the function csvwriter; you merely define it. The file is created when you opened it earlier for writing on line 10 of your program, but you never write anything to it.

If you have the time and inclination, I recommend you work through one of the fine tutorials for python. It really is a great language and this is a **great** community of folks for people like me who are learning the language still.

Take care,
Don
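The "list of column lists to list of row lists" step that Don borrows from Kent can be sketched in Python 3, where the Python 2 idiom `map(None, *output)` is no longer available and `itertools.zip_longest` plays the same role. The column values below are invented for the example.

```python
import csv
import io
from itertools import zip_longest

# Hypothetical data: one inner list per input file (i.e. per output column).
columns = [
    ["a1", "a2", "a3"],
    ["b1", "b2"],        # a shorter column
    ["c1", "c2", "c3"],
]

# Transpose: the n-th row holds the n-th value of every column.
# Missing values in short columns are padded with None.
rows = list(zip_longest(*columns))

# The csv module writes None as an empty field.
buf = io.StringIO()
csv.writer(buf, dialect="excel").writerows(rows)
text = buf.getvalue()
print(text)
```

Passing the untransposed list to writerows would instead write one record per input file, which is why each file's values ended up side by side in a single line.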
[Tutor] extracting a column from many files
Hi!

I have a group of files in a directory:
bb_1.txt
bb_2.txt
bb_3.txt
bb_4.txt
bb_5.txt
bb_6.txt
ss_1.txt

From the files whose names start with bb_ I want to extract columns number 5 and 6. I want to extract all the number 5 columns to one file, and all the number 6 columns to another file.

I was following this example http://mail.python.org/pipermail/tutor/2009-February/067400.html , and I created my own code, which gives me the following errors:

This code gives me 2 files:
ro with 2 lines
bf empty

I attached my code, with two options (though I have tried many more).
It would be awesome if I could extract them to an excel file.
Any suggestion?
Many thanks!!!
Susana

    #! /usr/bin/env python
    #-*- coding: utf -8 -*-
    import os
    import fnmatch
    import csv

    path = '//../my_working_folder/'
    files=os.listdir(path)
    ro=[]; bf=[]
    A=[]; B=[]
    for infile in files:
        if fnmatch.fnmatch(infile, 'bb_*'):
            filename= os.path.join(path,infile)
            f=open(filename, 'r')

            ##OPTION ONE
            for line in f.readlines():
                b=line.split('\t')
                for b in line:
                    A.extend('\t'.join.b[5])
                    B.extend('\t'.join.b[6])
                ro.extend(A)
                bf.extend(B)
            f.close()

    lists=[(ro, 'ro'), (bf, 'bf')]
    for values, name in lists:
        output=open( name + '.txt', 'w')
        for value in values:
            output.write(str(values))
            output.write('\n')
        output.close

            ##OPTION TWO
            for line in f.readlines():
                b=line.split('\t')
                A.extend(b[5])
                B.extend(b[6])
                ro.extend(A)
                bf.extend(B)

    #the same: I get Syntaxerror:invalid syntax
    lists=[(ro, 'ro'), (bf, 'bf')]
    for values, name in lists:
        output=open( name + '.txt', 'w')
        for value in values:
            output.write(str(values))
            output.write('\n')
        output.close()
Re: [Tutor] get columns from txt file
Hi!
It is attached on the email, called myprogram.txt
Thank you!

Subject: Re: [Tutor] get columns from txt file
From: dfjenni...@gmail.com
Date: Thu, 12 Jul 2012 11:07:53 -0400
CC: tutor@python.org
To: susana...@hotmail.com

On Jul 12, 2012, at 10:55 AM, susana moreno colomer wrote:

Hi!

I have a group of files in a directory:
bb_1.txt
bb_2.txt
bb_3.txt
bb_4.txt
bb_5.txt
bb_6.txt
ss_1.txt

I want to extract from files whose names start with bb_ column number 5 to an excel file. I have 6 bb_ files, therefore I want to get 6 columns (5th column from each file)
This code is working, with the following errors:
I get the data in only one column, instead of six

Great! It sounds like you're almost there. Where's the code which works except that it puts the data all in one column?

Take care,
Don

    #! /usr/bin/env python
    import os
    import fnmatch
    import csv

    path = '//../my_working_folder/'
    csv_out=csv.writer(open('out14.csv', 'wb'), delimiter=' ')
    files=os.listdir(path)
    outfile=open('myfile1', 'w')
    output=[]
    for infile in files:
        if fnmatch.fnmatch(infile, 'bb_*'):
            filename= os.path.join(path,infile)
            f=open(filename, 'r')
            for line in f:
                b=line.split('\t')
                output.append(b[5].strip())
            f.close()

This gives me a single column (I want 6, since I have 6 bb_files):

    csv_out.writerows(output)

with this I get a blank excel sheet:

    def csvwriter():
        excelfile=csv_out
        a=excelfile.append(output)
        c=excelfile.write(a)
        return c
Re: [Tutor] get columns from txt file
Hi!

I have a group of files in a directory:
bb_1.txt
bb_2.txt
bb_3.txt
bb_4.txt
bb_5.txt
bb_6.txt
ss_1.txt

I want to extract from files whose names start with bb_ column number 5 to an excel file. I have 6 bb_ files, therefore I want to get 6 columns (5th column from each file)
This code is working, with the following errors:
I get the data in only one column, instead of six
Many thanks!

Subject: Re: [Tutor] get columns from txt file
From: dfjenni...@gmail.com
Date: Thu, 12 Jul 2012 10:26:30 -0400
CC: tutor@python.org
To: susana...@hotmail.com

Oops! Still you forgot to cc: the tutor list. It's really important because if someone (like me, for instance) steers you in the wrong direction, others will jump in with corrections.

On Jul 12, 2012, at 9:48 AM, susana moreno colomer wrote:

Hi!
Many thanks!

You're welcome. I see that your code is improving :>)

Still I get an error: AttributeError: '_cvs.writer' object has no atribute 'append'.

If you would like a response to that error, please send the exact code you tried which didn't work and the error message by copying/pasting. (See, I'm pretty sure that you typed in the error above since you misspelled attribute as 'atribute'.)

I wanted to solve it with the Def function.

And that's why you should send the original code and error. Now, you have a different problem, but it's a good time to clear up your confusion. Def is not a function. In fact, I don't know what "Def" is. Instead, the def statement (notice that it's all lower case) defines a function which you can call elsewhere in code.

    >>> def call_me():
    ...     print "inside the call_me function"
    ...
    >>> call_me()
    inside the call_me function

Note, that there is no output from our defining the call_me function; it's only when we call it with the parentheses that the code inside of it is executed. Make sense? However, again, it's not necessary for this script.
Once you have the csv_out variable, you should be able to

Now I am trying this, though I get

>import os
>import fnmatch
>import csv
>path = '/This is my current directory/'
>csv_out=csv.writer(open('out13.csv', 'wb'), delimiter=' ')
>files=os.listdir(path)
>outfile=open('myfile1', 'w')
>output=[]
>for infile in files:
>    if fnmatch.fnmatch(infile, 'bb_*'):
>        filename= os.path.join(path,infile)
>        f=open(filename, 'r')
>        for line in f:
>            b=line.split('\t')
>            output.append(b[5].strip())
>        f.close()
>Def csvwriter():
>gives me SyntaxError: invalid syntax
>excelfile=csv_out
>a=excelfile.append(output)
>c=excelfile.write(a)
>return c

I am trying with this because the attribute 'writerows' didn't work.

A different error. There are lots of reasons it might not have worked. Wish you had sent us the code for that one :<( We can't solve it without all the info.

Is there another way to write in csv files?

Absolutely. At the interactive prompt, dir() shows all the attributes of an object to find out your options:

    >>> csv_out=csv.writer(open('out13.csv', 'wb'), dialect='excel')
    >>> dir(csv_out)
    ['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__',
     '__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
     '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'dialect',
     'writerow', 'writerows']

See the writerow and writerows? Those look promising. Alternatively, while it takes a while to learn to read the python documentation, it really is your friend:

http://docs.python.org/library/csv.html

If you'd like further examples, Doug Hellmann often provides a great resource with his Python Module of the Week (PyMOTW) series:

http://www.doughellmann.com/PyMOTW/csv/

I don't know what you mean with specify dialect

Something like this:

    csv_out = csv.writer(fileobject, dialect='excel')

Take care,
Don

    #! /usr/bin/env python
    import os
    import fnmatch
    import csv

    path = '//../my_working_folder/'
    csv_out=csv.writer(open('out14.csv', 'wb'), delimiter=' ')
    files=os.listdir(path)
    outfile=open('myfile1', 'w')
    output=[]
    for infile in files:
        if fnmatch.fnmatch(infile, 'bb_*'):
            filename= os.path.join(path,infile)
            f=open(filename, 'r')
            for line in f:
                b=line.split('\t')
                output.append(b[5].strip())
            f.close()

This gives me a single column (I want 6, since I have 6 bb_files):

    csv_out.writerows(output)

with this I get a blank excel sheet:

    def csvwriter():
        excelfile=csv_out
        a=excelfile.append(output)
        c=excelfile.write(a)
        return c
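Two points from Don's reply, that defining a function does not run it and that a csv writer offers writerow/writerows rather than append or write, can be checked with a short Python 3 sketch. The function name `write_values` and the sample numbers are hypothetical, and an in-memory buffer stands in for the output file.

```python
import csv
import io

buf = io.StringIO()
csv_out = csv.writer(buf, dialect="excel")


def write_values(writer, values):
    """Write one record; nothing happens until this function is called."""
    writer.writerow(values)


# The function is only defined so far, so nothing has been written yet.
before = buf.getvalue()

# Calling it (note the parentheses and arguments) performs the write.
write_values(csv_out, ["1.5", "2.5"])
after = buf.getvalue()

# The public attributes of the writer object, as dir() reports them.
public = [name for name in dir(csv_out) if not name.startswith("_")]
print(repr(before), repr(after), public)
```

A writer object has no append method, which matches the AttributeError quoted earlier in the thread.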
[Tutor] get columns from txt file
Hi!

I have a group of files in a directory.
I want to extract from files whose names start with bb_ column number 5 to an excel file. I have 6 bb_ files, therefore I want to get 6 columns (5th column from each file)
This code is working, with the following errors:

I get the data in only one column, instead of six
I get a blank cell after every extracted cell

I've got help from http://mail.python.org/pipermail/tutor/2004-November/033474.html, though it is not working for me

This is my code:

    import os
    import fnmatch
    import csv

    path = '//..'
    files=os.listdir(path)
    csv_out=csv.writer(open('out.csv', 'w'), delimiter=' ')

    for infile in files:
        if fnmatch.fnmatch(infile, 'bb_*'):
            print infile
            filename= path+infile
            print filename
            f=open(filename)
            for line in f.readlines():
                b=line.split('\t')
                csv_out.writerow(b[5])
            f.close
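One detail in the code above that is easy to miss: writerow expects a sequence of fields, so handing it a bare string makes every character a separate field. The Python 3 sketch below shows the difference with a made-up value and an in-memory buffer in place of out.csv.

```python
import csv
import io

buf = io.StringIO()
csv_out = csv.writer(buf, delimiter=" ")

value = "12.5"  # hypothetical contents of b[5]

# A bare string is iterated character by character.
csv_out.writerow(value)

# Wrapping the value in a list writes it as one field.
csv_out.writerow([value])

text = buf.getvalue()
print(repr(text))
```

Two further assumptions about the symptoms, not confirmed in the thread: the blank line after every record is typical of a CSV file opened in text mode on Windows (the Python 2 docs ask for 'wb', and Python 3 for newline=''), and `f.close` without parentheses only names the method without closing the file.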