Re: [Tutor] import data (txt/csv) into list/array and manipulation

2008-11-11 Thread trias

Hi all,

 Thanks so much for the help,

I will have a look at the suggestions, as well as the other thread and links,
this week, and will post here once I have tried them or need more help.

Thanks

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] import data (txt/csv) into list/array and manipulation

2008-11-11 Thread spir

Triantafyllos Gkikopoulos wrote:
 Hi,

  Thanks for the advice,

 I will do some more reading this week and look into your solution as
 well as others.

   So basically my data is enrichment signal at yeast genomic locations,
 and I want to map this signal with respect to genes. The averaging
 question arises because I want to average the signal once I have aligned
 all of it on the start of every gene.
 I guess another solution is to create an array of zeros that covers
 the entire genome and then replace the zeros with the actual signal (int)
 values. I would then be able to look up individual locations within this
 array, and the averaging based on the reference file might be easier
 as well.

There are numerous solutions for any problem. This one would be especially 
non-pythonic, I guess ;-)

If I understand your problem:
* Locations are gene ids.
* They are the key fields for your data; the strings and ints are values, not keys.
* There can be several data items for a single id.
* Among the possible data, the integers have to be processed (averaged).
* What about the strings?

If you want to be both pythonic and simple, as I see it, use a dict with 
locations as keys. Now, the data seems to be mainly a list of ints. Right? So 
use a list for those, and add to it fields for storing the additional data 
(strings?) and a method to average your integers. Example:


class GeneData(list):
    ''' holds int values in basic list
        calculates average value
        stores additional string data
    '''
    def store_strings(self, strings):
        self.strings = strings
    def store_string(self, string):
        # store_strings() must have been called first
        self.strings.append(string)
    def average(self):
        # record and return the average (float total avoids integer division)
        total = 0.0
        for i in self:
            total += i
        self.avrg = total / len(self)
        return self.avrg

gd = GeneData([1, 2, 3])
gd.append(4)
gd.append(5)
x = gd.pop()    # removes and returns the last item, 5
gd.store_strings(["string", "data"])
gd.store_string("i'm relevant info")

print("%s %s" % (gd, gd.strings))
print("average: %2.2f ; removed: %i" % (gd.average(), x))
==>
[1, 2, 3, 4] ['string', 'data', "i'm relevant info"]
average: 2.50 ; removed: 5

denis

 cheers

 Dr Triantafyllos Gkikopoulos
 spir [EMAIL PROTECTED] 11/10/08 7:55 PM 
 trias wrote:
   Hi,
  
   I have started learning python (any online help content suggestions are
   welcome) and want to write a couple of scripts to do simple numeric
   calculations on array data.
  
   filetype(1) I have reference files (ie file.csv) that contain three columns
   with variable rows: the first column is type str and contains a unique
   identifier name, and the other two columns are type int and contain two
   reference values (start, stop: genomic location reference values).
   **maybe I should import this as dictionary list**
  
   filetype(2) The other file contains signal data in three columns: column one
   is a unique identifier of type int, and the other two columns contain two
   type int values (genomic location reference values)
   ** import this as array/list

 For both files, field 1 contains an id, so using a dictionary seems
 appropriate. You may use a format like:
 {id:(start,stop)}
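
A minimal sketch of loading such a reference file into the {id:(start,stop)} format, using the standard csv module. The sample rows, the absence of a header row, and the name load_reference are assumptions made for illustration; they are not from the original files.

```python
import csv
import io

# Hypothetical sample standing in for a reference file such as file.csv:
# columns are identifier, start, stop (no header row assumed).
SAMPLE = "geneA,100,250\ngeneB,400,900\n"

def load_reference(handle):
    """Return {id: (start, stop)} from a three-column CSV file object."""
    reference = {}
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        gene_id, start, stop = row
        reference[gene_id] = (int(start), int(stop))
    return reference

reference = load_reference(io.StringIO(SAMPLE))
print(reference["geneA"])
```

With a real file, an opened file object would be passed in place of the io.StringIO wrapper.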
 Location could also be stored in a custom type, especially if you need to
 compare locations (which is probably the case). Example (not tested):
 class Location(object):
     def __init__(self, start, stop):
         self.start = start
         self.stop = stop
     def __eq__(self, other):
         return (self.start == other.start) and (self.stop == other.stop)
 The second method will be called when you test loc1==loc2 and will return
 True if and only if both positions are equal.
 This custom type allows you to define other methods that may be relevant
 for your problem.
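
As a sketch of what such extra methods might look like, the version below keeps the Location class from this message and adds two things that are assumptions of this example: a __hash__ method, and a hypothetical overlaps helper that treats start and stop as inclusive bounds.

```python
class Location(object):
    """A (start, stop) genomic interval."""

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __eq__(self, other):
        return (self.start == other.start) and (self.stop == other.stop)

    def __hash__(self):
        # Defining __eq__ alone makes instances unhashable in Python 3,
        # so hash the same fields that __eq__ compares.
        return hash((self.start, self.stop))

    def overlaps(self, other):
        # Hypothetical helper: True if the two inclusive intervals
        # share at least one position.
        return self.start <= other.stop and other.start <= self.stop

gene = Location(100, 250)
signal = Location(200, 300)
print(gene == Location(100, 250))
print(gene.overlaps(signal))
```

Whether overlap is the right comparison depends on how the signal locations relate to the genes, which the original question leaves open.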
   I want to map the location of filetype(2) with respect to filetype(1)...

 Here the problem is reversed: if the location is to be used as the link
 between tables, then it should be the key of both tables:
 {location:id}
 Fortunately, your location is a simple enough set of data to be stored as a
 (start,stop) tuple, so that you can actually use it as a dict key (a dict's
 key must be hashable, as immutable types such as tuples of ints are).
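
A small sketch of using (start,stop) tuples as the shared key of two tables. All rows and variable names here are made up for illustration, and the lookup only matches locations that are exactly identical in both tables.

```python
# Hypothetical rows; both tables are keyed by the same (start, stop) tuple.
genes_by_location = {(100, 250): "geneA", (400, 900): "geneB"}
signal_by_location = {(100, 250): 7, (400, 900): 3, (950, 990): 5}

# Pair each signal value with the gene at the same location, if any.
signal_by_gene = {}
for location, value in signal_by_location.items():
    gene_id = genes_by_location.get(location)
    if gene_id is not None:
        signal_by_gene[gene_id] = value

print(signal_by_gene)
```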
 Now, the question is: do you have multiple occurrences of the same location?
 If yes, you will have to collect the data in, e.g., a list:
 {location:[d1,d2,...]}
 But maybe I don't properly understand what you have to do (see Q below).
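
One possible way to build the {location:[d1,d2,...]} structure and then average each list, sketched with collections.defaultdict. The records are invented sample data, and the (start, stop, value) layout is an assumption about the signal file.

```python
from collections import defaultdict

# Hypothetical (start, stop, value) records with a repeated location.
records = [(100, 250, 4), (100, 250, 6), (400, 900, 3)]

# Collect every value seen for a location into one list.
values_by_location = defaultdict(list)
for start, stop, value in records:
    values_by_location[(start, stop)].append(value)

# Average each list; float() keeps the division exact on Python 2 as well.
averages = {}
for location, values in values_by_location.items():
    averages[location] = sum(values) / float(len(values))

print(averages[(100, 250)])
```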
   ...and be able to average the signal if I align all filetype(1) objects.

 Where/what are the data fields in your pattern?

 Denis

   Thanks



