Re: [Tutor] import data (txt/csv) into list/array and manipulation

spir Tue, 11 Nov 2008 04:41:00 -0800

Triantafyllos Gkikopoulos wrote:
> Hi,
>
>  Thanks for the advice,
>
> I will do some more reading this week and look into your solution as
> well as others.
>
>   So basically my data is enrichment signal on yeast genomic locations,
> and I want to map this signal with respect to genes. The averaging
> question is so that I can average the signal if I align all the signal
> based on the start of every gene.
> I guess another solution is to create an array of zeros that covers
> the entire genome and then replace the zeros with actual signal (int)
> values. Then I would be able to call for individual locations within this
> array, and maybe it would be easier to do the averaging as well, based on
> the reference file.

There are numerous solutions for any problem. This one would be especially non-pythonic, I guess ;-)
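
For comparison, here is a minimal sketch of the zero-filled array idea quoted above, assuming one list slot per genomic position and averaging the signal at each offset from every gene start. All names and numbers (genome_length, gene_starts, window, average_profile) are hypothetical and only illustrate the approach:

```python
# All names and numbers below are hypothetical, for illustration only.
genome_length = 20
signal = [0] * genome_length            # one slot per genomic position

# Replace the zeros with (position, value) signal measurements.
for position, value in [(3, 5), (4, 7), (11, 9), (12, 3)]:
    signal[position] = value

gene_starts = {"geneA": 3, "geneB": 11}  # id -> start position
window = 3                               # number of positions kept after each start

def average_profile(signal, starts, window):
    """Average the signal at each offset from the start of every gene."""
    starts = list(starts)
    profile = []
    for offset in range(window):
        # Collect the value found at this offset for every gene that reaches it.
        values = [signal[s + offset] for s in starts if s + offset < len(signal)]
        profile.append(sum(values) / float(len(values)) if values else 0.0)
    return profile

profile = average_profile(signal, gene_starts.values(), window)
print(profile)
```

This stores one value for every position of the genome, most of them zeros, which is the main cost of the approach.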

If I understand your problem:
* Locations are gene ids.
* They are key fields for your data -- strings and ints are not keys.
* There can be several data items for a unique id.
* Among the possible data, integers have to be processed (averaged).
* What about string?

If you want to be both pythonic and simple, as I see it, use a dict with locations as keys. Now, the data seems to be mainly a list of ints. Right? So, use a list for this, and add to it the relevant storing fields for additional data (strings?), and the relevant method to average your integers. Example:


class GeneData(list):
    ''' holds int values in basic list
        calculates average value
        stores additional string data
    '''
    def __init__(self, values=()):
        list.__init__(self, values)
        # start empty so store_string() works before store_strings()
        self.strings = []
    def store_strings(self, strings):
        self.strings = list(strings)
    def store_string(self, string):
        self.strings.append(string)
    def average(self):
        # record and return the average (fails on an empty list)
        total = 0.0
        for i in self:
            total += i
        self.avrg = total / len(self)
        return self.avrg

gd = GeneData([1,2,3])
gd.append(4)
gd.append(5)
x = gd.pop()
gd.store_strings(["string","data"])
gd.store_string("i'm relevant info")

print gd, gd.strings
print "average: %2.2f ; removed: %i" %(gd.average(), x)
==>
[1, 2, 3, 4] ['string', 'data', "i'm relevant info"]
average: 2.50 ; removed: 5
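
The "dict with locations as keys" part can be sketched the same way. This is only a rough illustration, assuming each signal record is a (start, stop, value) triple; the records, signal_by_location and the average() helper are made-up names, not taken from your files:

```python
from collections import defaultdict

# Hypothetical signal records: (start, stop, value). Made up for illustration.
records = [
    (100, 200, 4),
    (100, 200, 6),
    (300, 450, 10),
]

# Map each (start, stop) location to the list of values seen there.
signal_by_location = defaultdict(list)
for start, stop, value in records:
    signal_by_location[(start, stop)].append(value)

def average(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / float(len(values))

# One average per location.
averages = dict((loc, average(vals)) for loc, vals in signal_by_location.items())
print(averages)
```

The (start, stop) tuple works as a key because tuples of ints are hashable, and several values for the same location simply accumulate in its list.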
        
denis

> cheers
>
> Dr Triantafyllos Gkikopoulos
>>>> spir <[EMAIL PROTECTED]> 11/10/08 7:55 PM >>>
> trias wrote:
>  > Hi,
>  >
>  >  I have started learning python (any online help content suggestions
>  > are welcome) and want to write a couple of scripts to do simple numeric
>  > calculations on array data.
>  >
>  > filetype(1) I have reference files (ie file.csv) that contain three
>  > columns with variable rows: the first column is type str and contains a
>  > unique identifier name, and the other two columns are type int and
>  > contain two reference values (start, stop: genomic location reference
>  > values).
>  >   **maybe I should import this as dictionary list**
>  >
>  > filetype(2) The other file contains signal data in three columns:
>  > column one is a unique identifier of type int, and the other two columns
>  > contain two type int values (genomic location reference values)
>  >   ** import this as array/list
>
> For both files, field 1 contains an id, so using a dictionary seems
> appropriate. You may use a format like:
> {id:(start,stop)}
> Location could also be stored in a custom type, especially if you need to
> compare locations (which is probably the case). Example (not tested):
> class Location(object):
>     def __init__(self, start, stop):
>         self.start = start
>         self.stop = stop
>     def __eq__(self, other):
>         return (self.start==other.start) and (self.stop==other.stop)
> The second method will be called when you test loc1==loc2 and will
> return True if and only if both positions are equal.
> This custom type allows you to define other methods that may be relevant
> for your problem.
>  > I want to map the location of filetype(2) with respect to filetype(1)...
>
> Here the problem is reversed: if the location is to be used as the link
> between tables, then it should be the key of both tables:
> {location:id}
> Fortunately, your location is a simple enough set of data to be stored as
> a (start,stop) tuple, so that you can actually use it as a dict key (a
> dict's key must be hashable, which a tuple of ints is).
> Now, the question is: do you have multiple occurrences of the same
> location? If yes, you will have to collect the data in e.g. a list:
> {location:[d1,d2,...]}
> But maybe I don't properly understand what you have to do (see Q below).
>  > ...and be able to do averaging of signal if I align all filetype one
>  > objects.
>
> Where/what are the data fields in your pattern?
>
> Denis
>
>  > Thanks
>



_______________________________________________
Tutor maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] import data (txt/csv) into list/array and manipulation
