Triantafyllos Gkikopoulos wrote:
Hi,
Thanks for the advice,
I will do some more reading this week and look into your solution as
well as others.
So basically my data is enrichment signal on yeast genomic locations,
and I want to map this signal with respect to genes. The averaging
question is so that I can average the signal if I align all the signal based
on the start of every gene.
I guess another solution is to create an array of zeros that covers
the entire genome and then replace the zeros with actual signal (int)
values. Then I would be able to call for individual locations within this
array, and maybe it would be easier to do the averaging as well based on the
reference file.
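A rough sketch of that genome-wide array idea, in plain Python. Everything here is invented for illustration: the toy genome of 20 positions, the signal values, the gene starts, and the hypothetical `window` parameter (how many positions from each start to average over).

```python
# Toy example: all sizes, positions and values below are made up.
genome_length = 20
signal = [0] * genome_length          # one slot per genomic position, initially zero

# (position, value) pairs standing in for the measured enrichment signal
measurements = [(2, 5), (3, 7), (4, 1), (10, 3), (11, 9), (12, 2)]
for position, value in measurements:
    signal[position] = value

gene_starts = [2, 10]                 # hypothetical start position of each gene
window = 3                            # positions to look at from each start


def average_profile(signal, starts, window):
    """Average the signal at each offset from the gene starts."""
    profile = []
    for offset in range(window):
        # Collect the value found at this offset downstream of every start
        values = [signal[start + offset] for start in starts
                  if start + offset < len(signal)]
        profile.append(sum(values) / len(values) if values else 0.0)
    return profile


profile = average_profile(signal, gene_starts, window)
print(profile)
```

For a real genome a plain list of ints gets large, so a numeric array type may be a better fit, but the alignment logic would stay the same.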
There are numerous solutions for any problem. This one would be especially
non-pythonic, I guess ;-)
If I understand your problem:
* Locations are gene ids.
* They are key fields for your data -- strings and ints are not keys.
* There can be several data items for a unique id.
* Among the possible data, integers have to be processed (averaged).
* What about the strings?
If you want to be both pythonic and simple, as I see it, use a dict with
locations as keys. Now, the data seems to be mainly a list of ints. Right? So,
use a list for this, and add to it the relevant storing fields for additional
data (strings?), and the relevant method to average your integers. Example:
class GeneData(list):
    ''' holds int values in basic list
        calculates average value
        stores additional string data
    '''
    def store_strings(self, strings):
        self.strings = strings
    def store_string(self, string):
        self.strings.append(string)
    def average(self):
        # record and/or return average, eg:
        total = 0.0
        for i in self:
            total += i
        self.avrg = total / len(self)
        return self.avrg

gd = GeneData([1, 2, 3])
gd.append(4)
gd.append(5)
x = gd.pop()
gd.store_strings(["string", "data"])
gd.store_string("i'm relevant info")
print(gd, gd.strings)
print("average: %2.2f ; removed: %i" % (gd.average(), x))
==
[1, 2, 3, 4] ['string', 'data', "i'm relevant info"]
average: 2.50 ; removed: 5
denis
cheers
Dr Triantafyllos Gkikopoulos
spir [EMAIL PROTECTED] 11/10/08 7:55 PM
trias wrote:
Hi,
I have started learning python (any online help content suggestions are
welcome) and want to write a couple of scripts to do simple numeric
calculations on array data.
filetype(1) I have reference files (i.e. file.csv) that contain three columns
with a variable number of rows. The first column is of type str and contains a
unique identifier name; the other two columns are of type int and contain two
reference values (start, stop: genomic location reference values).
**maybe I should import this as dictionary list**
filetype(2) The other file contains signal data in three columns. Column one
is a unique identifier of type int, and the other two columns contain two
int values (genomic location reference values).
** import this as array/list
For both files, field 1 contains an id, so using a dictionary seems
appropriate. You may use a format like:
{id:(start,stop)}
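As a sketch, a reference file of this shape could be loaded with the standard csv module. The sample rows and the function name `load_reference` are made up for illustration, and the file is assumed to be comma-separated with no header line.

```python
import csv
import io


def load_reference(lines):
    """Build {id: (start, stop)} from rows of the form id,start,stop."""
    reference = {}
    for row in csv.reader(lines):
        if not row:                      # skip blank lines
            continue
        gene_id, start, stop = row[0], int(row[1]), int(row[2])
        reference[gene_id] = (start, stop)
    return reference


# Invented sample data standing in for file.csv
sample = io.StringIO("geneA,100,250\ngeneB,400,900\n")
reference = load_reference(sample)
print(reference["geneB"])
```

With a real file, the same function can be given an open file object, e.g. one obtained from open("file.csv", newline="").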
Location could also be stored in a custom type, especially if you need to
compare locations (which is probably the case). Example (not tested):
class Location(object):
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
    def __eq__(self, other):
        return (self.start == other.start) and (self.stop == other.stop)
The second method will be called when you test loc1==loc2 and will return True
if and only if both positions are equal.
This custom type allows you to define other methods that are relevant for your
problem.
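For instance, an overlap test and a length could be added. This is only a sketch: the methods `overlaps` and `length`, and the convention that start and stop are both inclusive, are assumptions and not something given in the thread. A `__hash__` is included because a class that defines `__eq__` alone cannot be used as a dict key in Python 3.

```python
class Location(object):
    """A genomic interval; start and stop are assumed inclusive."""

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __eq__(self, other):
        return (self.start == other.start) and (self.stop == other.stop)

    def __hash__(self):
        # Equal locations hash alike, so they can serve as dict keys.
        return hash((self.start, self.stop))

    def overlaps(self, other):
        """True if the two intervals share at least one position."""
        return self.start <= other.stop and other.start <= self.stop

    def length(self):
        """Number of positions covered, counting both ends."""
        return self.stop - self.start + 1


gene = Location(100, 250)
probe = Location(240, 300)
print(gene.overlaps(probe), gene == Location(100, 250), gene.length())
```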
I want to map the location of filetype(2) with respect to filetype(1)...
Here is the problem reversed: if the location is to be used as a link between
tables, then it should be the key of both tables:
{location:id}
Fortunately, your location is a simple enough set of data to be stored as a
(start,stop) tuple, so that you can actually use it as a dict key (a dict's key
must be hashable, which in practice means an immutable type such as a tuple).
Now, the question is: do you have multiple occurrences of the same location? If
yes, you will have to collect the data in e.g. a list:
{location:[d1,d2,...]}
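One possible way to build such a dict is collections.defaultdict, sketched here with invented (start, stop, value) records in which two records share a location:

```python
from collections import defaultdict

# Invented (start, stop, value) records; two of them share a location.
records = [(100, 250, 4), (400, 900, 7), (100, 250, 6)]

by_location = defaultdict(list)
for start, stop, value in records:
    by_location[(start, stop)].append(value)   # the tuple is the dict key

# Average the values collected for each location
averages = {loc: sum(vals) / len(vals) for loc, vals in by_location.items()}
print(averages[(100, 250)])
```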
But maybe I don't properly understand what you have to do (see Q below).
...and be able to do averaging of signal if I align all filetype one objects.
Where/what are the data fields in your pattern?
Denis
Thanks
___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor