[Tutor] Problem on filtering data

2015-06-08 Thread jarod_v6--- via Tutor
Dear All;
I have a very silly problem.




with open("Dati_differenzialigistvsminigist_solodiff.csv") as p:
    for i in p:
        lines = i.strip("\n").split("\t")
        if lines[8] != "NA":
            if lines[8]:
                print lines[8]

Why do I continue to obtain empty lines?

baseMean  log2FoldChange  lfcSE  stat  pvalue  padj  ensembl  hgnc_symbol  uniprot  entrez
ENSG0001460  49.2127074325806  -1.23024931383259  0.386060601796602  -3.18667408201565  0.00143918849913772  0.0214436050108864  ENSG0001460  STPG1  Q5TH74  90529
ENSG0001631  104.058286326346  -0.80555704451241  0.294010285837035  -2.73989408982418  0.00614589853281486  0.0574525590840568  ENSG0001631  KRIT1  O00522  889
ENSG0002933  1777.43938933705  1.58630255355138  0.608489070400574  2.60695323994414  0.00913518344605639  0.0740469624219782  ENSG0002933  TMEM176A  Q96HP8  55365
ENSG0003137  7.12073122713574  2.21265894560888  0.501823258624614  4.40923952324029  1.03734243663752e-05  0.000620867030402752  ENSG0003137  CYP26B1  Q9NR63  56603
ENSG0003989  8.98972643541858  -1.19156195325944  0.467992084035133  -2.54611561585728  0.010892910391296  0.0832138232974883  ENSG0003989  SLC7A2  P52569  6542
ENSG0004478  352.680009703697  1.07781013614545  0.371617391411002  2.90032210831978  0.00372779367087775  0.041335008661432  ENSG0004478  FKBP4  Q02790  2288
ENSG0004776  1808.49276145547  -2.22691975109948  0.560563734648272  -3.97264327578517  7.10794561495787e-05  0.00243688669444854  ENSG0004776  HSPB6  O14558  126393
ENSG0004779  110.066557414377  1.0371628629106  0.375665509210244  2.76086794630418  0.00576479803838396  0.0552052693506261  ENSG0

thanks so much
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Problem on filtering data

2015-06-08 Thread Steven D'Aprano
On Mon, Jun 08, 2015 at 04:50:13PM +0200, jarod_v6--- via Tutor wrote:
> Dear All;
> I have a very silly problem.

The right way to handle CSV files is to use the csv module:

https://docs.python.org/2/library/csv.html
http://pymotw.com/2/csv/

Some more comments further below.
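
As a rough illustration of the csv module applied to a tab-separated file, here is a minimal sketch. It is written for Python 3 (the code in this thread is Python 2), the sample rows in SAMPLE are invented stand-ins for the real file, and the name column_values is hypothetical.

```python
import csv
import io

# Invented three-row sample standing in for the real file; fields are tab-separated.
SAMPLE = (
    "g1\t1\t2\t3\t4\t5\t6\tid1\tSTPG1\tQ5TH74\t90529\n"
    "g2\t1\t2\t3\t4\t5\t6\tid2\tNA\tO00522\t889\n"
    "g3\t1\t2\t3\t4\t5\t6\tid3\t\tQ96HP8\t55365\n"
)


def column_values(handle, index=8):
    """Collect the values at position `index` that are neither empty nor "NA"."""
    values = []
    # delimiter="\t" tells csv.reader to split on tabs instead of commas.
    for fields in csv.reader(handle, delimiter="\t"):
        value = fields[index]
        if value and value != "NA":
            values.append(value)
    return values


# io.StringIO behaves like an open text file; with a real file you would
# pass the object returned by open(path, newline="") instead.
result = column_values(io.StringIO(SAMPLE))
print(result)
```

The reader hands back each row already split into a list of strings, so the manual strip/split step is no longer needed.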


> with open("Dati_differenzialigistvsminigist_solodiff.csv") as p:
>     for i in p:
>         lines = i.strip("\n").split("\t")
>         if lines[8] != "NA":
>             if lines[8]:
>                 print lines[8]
>
> Why do I continue to obtain empty lines?

I don't know. What do you mean by 'obtain empty line'? Do you mean that 
every line is empty? Some of the lines are empty? One empty line at the 
end? How do you know it is empty? You need to explain more.

Perhaps put this instead:

print repr(lines[8])

to see what it contains. Perhaps there are invisible s p a c e s so it 
only looks empty.
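
To make the repr() suggestion concrete, here is a small sketch in Python 3 syntax. The values in candidates are invented examples of strings that print as blank; whether any of them actually occurs in the file is an assumption that only inspecting the real data can confirm.

```python
# Invented examples: each of these looks blank when printed normally.
candidates = ["", " ", "\r", "\t"]

for value in candidates:
    # repr() shows quotes and escape sequences, so whitespace becomes visible.
    print(repr(value), "passes `if value:`" if value else "fails `if value:`")

# Only the truly empty string is falsy; the others pass an `if value:` test.
truthy = [bool(value) for value in candidates]

# str.strip() with no argument removes all leading and trailing whitespace.
stripped = [value.strip() for value in candidates]
```

If the field turns out to contain only whitespace, testing `value.strip()` instead of `value` would filter it out.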

Some more comments:

In English, we would describe open("Dati...csv") as a file, and write:

    with open("Dati...csv") as f:  # f for file

I suppose in Italian you might say "archivio", and write:

    with open("Dati...csv") as a:

But what is "p"?

The next line is also confusing:

for i in p:

"for i in ..." is used for loops over the integers 0, 1, 2, 3, ... and 
should not be used for anything else. Here, you loop over the lines in 
the file, so you should write:

    for line in p:
        fields = line.strip("\n").split("\t")

or perhaps "columns" instead of "fields". But "lines" is completely 
wrong, because you are looking at one line at a time.
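
The naming advice can be sketched as follows, in Python 3 syntax. The io.StringIO object is an invented stand-in for the open file, and enumerate() is added here only to show where an integer counter would naturally get a name of its own.

```python
import io

# Invented two-line, tab-separated stand-in for the open file.
handle = io.StringIO("a\tb\tc\nd\te\tf\n")

rows = []
# line_number is the integer counter; line is one line of text from the file.
for line_number, line in enumerate(handle, start=1):
    fields = line.rstrip("\n").split("\t")  # one list of fields per line
    rows.append((line_number, fields))
    print(line_number, fields)
```

Each name now says what it holds: a number, a single line, or the fields of that line.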



-- 
Steve


Re: [Tutor] Problem on filtering data

2015-06-08 Thread Alan Gauld

On 08/06/15 15:50, jarod_v6--- via Tutor wrote:


> with open("Dati_differenzialigistvsminigist_solodiff.csv") as p:


You are usually better off processing CSV files (or, in your case, 
tab-separated files) using the csv module.



>     for i in p:
>         lines = i.strip("\n").split("\t")
>         if lines[8] != "NA":
>             if lines[8]:
>                 print lines[8]
>
> Why do I continue to obtain empty lines?


What does "empty line" mean?
Do you get a line printed with no content?
Or no line printed?

I've no idea, but I do notice that your last line is not complete,
i.e. it only has 8 fields, so lines[8] != "NA" should fail with an 
IndexError!


Also your headings only have 10 entries but your lines have 11?
Also you are processing the headings line in the same way as the rest of 
the data, is that correct?
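
One way to act on these three observations is sketched below in Python 3 syntax: skip the heading line, and report rows that are too short instead of letting them raise IndexError. The sample data is invented, and the function name symbols and its return shape are hypothetical choices, not anything from the original script.

```python
import csv
import io

# Invented sample: a 10-entry heading line, one complete 11-field row,
# and one truncated row with only 8 fields.
SAMPLE = (
    "baseMean\tlog2FoldChange\tlfcSE\tstat\tpvalue\tpadj"
    "\tensembl\thgnc_symbol\tuniprot\tentrez\n"
    "g1\t1\t2\t3\t4\t5\t6\tid1\tSTPG1\tQ5TH74\t90529\n"
    "g2\t1\t2\t3\t4\t5\t6\tid\n"
)


def symbols(handle, index=8):
    """Return (kept values, line numbers of rows too short to have `index`)."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # consume the heading line so it is not treated as data
    kept, skipped = [], []
    # Data starts on line 2 because line 1 was the heading.
    for line_number, fields in enumerate(reader, start=2):
        if len(fields) <= index:
            skipped.append(line_number)  # fields[index] would raise IndexError
            continue
        if fields[index] and fields[index] != "NA":
            kept.append(fields[index])
    return kept, skipped


kept, skipped = symbols(io.StringIO(SAMPLE))
print(kept, skipped)
```

Because the heading has one entry fewer than the data rows, position 8 in a data row does not line up with position 8 in the heading, which is another reason to handle the heading separately.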




> baseMean  log2FoldChange  lfcSE  stat  pvalue  padj  ensembl  hgnc_symbol  uniprot  entrez
> ENSG0001460  49.2127074325806  -1.23024931383259  0.386060601796602  -3.18667408201565  0.00143918849913772  0.0214436050108864  ENSG0001460  STPG1  Q5TH74  90529
> ENSG0004779  110.066557414377  1.0371628629106  0.375665509210244  2.76086794630418  0.00576479803838396  0.0552052693506261  ENSG0


HTH

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

