Re: [Tutor] import data (txt/csv) into list/array and manipulation

2008-11-21 Thread trias

Cool,

 Does anyone else have any other thoughts on this problem?








-- 
View this message in context: 
http://www.nabble.com/import-data-%28txt-csv%29-into-list-array-and-manipulation-tp20424075p20623480.html
Sent from the Python - tutor mailing list archive at Nabble.com.

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] import data (txt/csv) into list/array and manipulation

2008-11-20 Thread trias

Hi,

 So for this part of the problem, it goes a bit like this:

 I have a CSV file (file1) that contains three columns: column one contains
a unique ID (type str), and columns two and three contain start and stop
coordinates (type int).
  The other file (file2) contains two columns: column one contains a single
coordinate (type int) and column two contains a value (type float).

 What I would like to do is, for example, grab the values from file2 that
lie within the range defined by the start and stop coordinates associated
with an ID from file1.

  But most importantly, I would like to be able to grab, say, the values from
file2 that fall in range((start-300),start) for every single ID in file1,
put them in an array, and then calculate the sum of these values and plot it.
That is, for ob1 in the file get the values from range((1025-300),1025), for
ob2 from range((1090-300),1090) and for ob3 from range((2200-300),2200), and
then calculate and plot the sum as if they all had the same start coordinate.
The x axis would be the step values from 0-300 and the y axis would be the
sum of the values from ob1, ob2 and ob3 for every single step value from 0-300.
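
A minimal sketch of one possible reading of this request, not a definitive implementation: each ID contributes the 300 values just before its start, and the values are summed step by step across all IDs. Only the start values 1025, 1090 and 2200 come from the message; the signal values, the names `starts`, `signal` and `aligned_sums`, and the choice to count missing coordinates as 0.0 are assumptions made for illustration.

```python
# Hypothetical sample data: file1 reduced to {ID: start} and
# file2 reduced to {coordinate: value}. The signal values are invented.
starts = {"ob1": 1025, "ob2": 1090, "ob3": 2200}
signal = {pos: 1.0 for pos in range(700, 2300)}
signal[1024] = 5.0  # the coordinate just before the start of ob1


def aligned_sums(starts, signal, window=300):
    """Sum the signal over all IDs for each step of the window.

    Step 0 is coordinate start - window and the last step is
    start - 1, matching range(start - window, start).
    Coordinates missing from the signal count as 0.0 (an assumption).
    """
    sums = [0.0] * window
    for start in starts.values():
        for step, pos in enumerate(range(start - window, start)):
            sums[step] += signal.get(pos, 0.0)
    return sums


sums = aligned_sums(starts, signal)
print(len(sums), sums[0], sums[-1])  # 300 3.0 7.0
```

The resulting list has one entry per step, so it can be handed directly to a plotting library as the y values, with range(300) as the x values.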

 Does this make sense?

cheers

file1.csv: http://www.nabble.com/file/p20599488/file1.csv
file2.csv: http://www.nabble.com/file/p20599488/file2.csv

Kent Johnson wrote:
 
 On Thu, Nov 13, 2008 at 9:50 AM, trias [EMAIL PROTECTED] wrote:
 PS I could maybe upload a couple of small example files or a schematic so
 you can see what I mean
 
 A small example would be very helpful. Also please subscribe to the list.
 
 Kent
 
 



Re: [Tutor] import data (txt/csv) into list/array and manipulation

2008-11-20 Thread A.T.Hofkamp

trias wrote:

Hi,

 So for this part of the problem, it goes a bit like this:

 I have a CSV file (file1) that contains three columns: column one contains
a unique ID (type str), and columns two and three contain start and stop
coordinates (type int).
  The other file (file2) contains two columns: column one contains a single
coordinate (type int) and column two contains a value (type float).

 What I would like to do is, for example, grab the values from file2 that
lie within the range defined by the start and stop coordinates associated
with an ID from file1.

  But most importantly, I would like to be able to grab, say, the values from
file2 that fall in range((start-300),start) for every single ID in file1,
put them in an array, and then calculate the sum of these values and plot it.
That is, for ob1 in the file get the values from range((1025-300),1025), for
ob2 from range((1090-300),1090) and for ob3 from range((2200-300),2200), and
then calculate and plot the sum as if they all had the same start coordinate.
The x axis would be the step values from 0-300 and the y axis would be the
sum of the values from ob1, ob2 and ob3 for every single step value from 0-300.

 Does this make sense?


Mostly, although you lost me when you started talking about ranges.

Computer programming is often about taking small steps, one at a time (trying
to do everything at once tends to leave you lost about what to do first).
In your case, I'd start by reading your first csv file (with the Python csv
module) into memory. Once you have done that, get, for example, a list of
start/stop coordinates from the loaded data.
Then load the second csv file, see how you can find a single value, and then
a range of values.
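
The small steps described above could look roughly like the following sketch. It assumes the two files have already been loaded and their numeric columns converted; the rows, and the helper names `value_at` and `values_between`, are made up for illustration. Whether the stop coordinate should be included in the range is also an assumption.

```python
# Hypothetical rows as they might look after loading both files
# with the csv module and converting the numeric columns.
file1_rows = [["ob1", 1025, 1500], ["ob2", 1090, 1300]]
file2_rows = [[1000, 0.5], [1100, 1.5], [1200, 2.0], [1400, 0.25]]

# Step 1: a list of (start, stop) pairs from the first file.
coords = [(start, stop) for _id, start, stop in file1_rows]


# Step 2: find the value stored at a single coordinate.
def value_at(rows, coordinate):
    for pos, value in rows:
        if pos == coordinate:
            return value
    return None  # coordinate not present in the file


# Step 3: find all values whose coordinate lies in [start, stop].
def values_between(rows, start, stop):
    return [value for pos, value in rows if start <= pos <= stop]


print(coords)                                   # [(1025, 1500), (1090, 1300)]
print(value_at(file2_rows, 1100))               # 1.5
print(values_between(file2_rows, *coords[1]))   # [1.5, 2.0]
```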


Once you have done that, you can implement your first objective.

After that start thinking about storing in arrays, plotting, etc.


Sincerely,
Albert


Re: [Tutor] import data (txt/csv) into list/array and manipulation

2008-11-13 Thread trias

Hi again,
 
I got a bit better at Python over the last few days, but looking at some code
it almost looks impossible to catch up. I definitely want to fight it, though;
it looks well worth the effort, plus progress probably comes exponentially :)

 I read a little bit about interval/segment trees, and it looks like their
efficiency lies in the efficiency of the algorithms associated with the
lookup/indexing modules.

 Now, although I am too much of a newbie to implement the code from the
bx-python guys (quicksect.py), I understand some of the basics: dissecting a
list of objects (str, int(start), int(end)) on a median basis, storing
information on the nodes, etc.

Assuming I get this to work some time and I get back a list of intervals of
interest, I would like to use these intervals (str, int, int) to search in a
file that contains a fixed-step range, where each int in that range is
associated with an int (value). It is probably best to format this file as a
dictionary (signaldict), so that I can call all keys within range(interval)
and plot the values.

 I think it would be better to store these values in another array, so that
I can then, say, sum the values from all the intervals for each step in the
range (assuming I have exported a fixed length of keys from the signaldict)
and plot the result in a graph.

 Well, I don't mean to have the problem solved for me, but if you would like
to contribute any kind of help, you are welcome.

cheers

PS I could maybe upload a couple of small example files or a schematic so
you can see what I mean
 

 


Re: [Tutor] import data (txt/csv) into list/array and manipulation

2008-11-13 Thread Kent Johnson
On Thu, Nov 13, 2008 at 9:50 AM, trias [EMAIL PROTECTED] wrote:
 PS I could maybe upload a couple of small example files or a schematic so
 you can see what I mean

A small example would be very helpful. Also please subscribe to the list.

Kent


Re: [Tutor] import data (txt/csv) into list/array and manipulation

2008-11-11 Thread trias

Hi all,

 Thanks so much for the help,

I will have a look at the suggestions, as well as the other thread and links,
this week and will post here when I have tried them or need more help.

Thanks


Re: [Tutor] import data (txt/csv) into list/array and manipulation

2008-11-11 Thread spir

Triantafyllos Gkikopoulos wrote:
 Hi,

  Thanks for the advice,

 I will do some more reading this week and look into your solution as
 well as others.

   So basically my data is enrichment signal on yeast genomic locations,
 and I want to map this signal with respect to genes. The averaging
 question is there so that I can average the signal once I align all the
 signal based on the start of every gene.
 I guess another solution is to create an array of zeros that covers
 the entire genome and then replace the zeros with the actual signal (int)
 values. Then I would be able to look up individual locations within this
 array, and it may be easier to do the averaging as well, based on the
 reference file.

There are numerous solutions for any problem. This one would be especially 
non-pythonic, I guess ;-)
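
For comparison, the zero-filled array idea from the quoted message can be sketched as follows. This is only an illustration under stated assumptions: the toy genome length, the measurements, the gene starts and the three-position window are all invented, a plain list stands in for the array (a NumPy array would work the same way), and every gene start is assumed to be at least `window` positions from the beginning so that the slices do not run off the front.

```python
# Hypothetical toy data: a "genome" of 50 positions, a few signal
# measurements, and two gene start coordinates.
genome_length = 50
measurements = [(10, 2.0), (11, 4.0), (30, 6.0), (31, 8.0)]
gene_starts = [12, 32]
window = 3  # number of positions before each start to include

# List of zeros covering the whole genome, then fill in the signal.
signal = [0.0] * genome_length
for pos, value in measurements:
    signal[pos] = value

# One row per gene, aligned so that the last column is start - 1.
rows = [signal[start - window:start] for start in gene_starts]

# Average column by column across the aligned rows.
average = [sum(column) / len(column) for column in zip(*rows)]
print(average)  # [0.0, 4.0, 6.0]
```

The trade-off is memory: the list holds one entry per genomic position whether or not a measurement exists there, whereas a dict only stores the positions that carry signal.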

If I understand your problem:
* Locations are gene ids.
* They are key fields for your data -- strings and ints are not keys.
* There can be several data items for a unique id.
* Among the possible data, integers have to be processed (averaged).
* What about the strings?

If you want to be both pythonic and simple, as I see it, use a dict with 
locations as keys. Now, the data seems to be mainly a list of ints. Right? So, 
use a list for this, and add to it the relevant storing fields for additional 
data (strings?), and the relevant method to average your integers. Example:


class GeneData(list):
    ''' holds int values in basic list
        calculates average value
        stores additional string data
    '''
    def store_strings(self, strings):
        # replace the stored list of strings
        self.strings = strings
    def store_string(self, string):
        # add one string; store_strings() must have been called first
        self.strings.append(string)
    def average(self):
        # record and return the average
        total = 0.0
        for i in self:
            total += i
        self.avrg = total / len(self)
        return self.avrg

gd = GeneData([1, 2, 3])
gd.append(4)
gd.append(5)
x = gd.pop()
gd.store_strings(["string", "data"])
gd.store_string("i'm relevant info")

print(gd, gd.strings)
print("average: %2.2f ; removed: %i" % (gd.average(), x))
==
[1, 2, 3, 4] ['string', 'data', "i'm relevant info"]
average: 2.50 ; removed: 5

denis

 cheers

 Dr Triantafyllos Gkikopoulos
 spir [EMAIL PROTECTED] 11/10/08 7:55 PM 
 trias wrote:
   Hi,

   I have started learning Python (any suggestions for online help content
   are welcome) and want to write a couple of scripts to do simple numeric
   calculations on array data.

   filetype(1) I have reference files (e.g. file.csv) that contain three
   columns and a variable number of rows. The first column is of type str and
   contains a unique identifier name; the other two columns are of type int
   and contain two reference values (start, stop: genomic location reference
   values).
     **maybe I should import this as dictionary list**

   filetype(2) The other file contains signal data in three columns. Column
   one is a unique identifier of type int, and the other two columns contain
   two int values (genomic location reference values).
     ** import this as array/list

 For both files, field 1 contains an id, so using a dictionary seems
 appropriate. You may use a format like:
     {id:(start,stop)}
 Location could also be stored in a custom type, especially if you need to
 compare locations (which is probably the case). Example (not tested):
     class Location(object):
         def __init__(self, start, stop):
             self.start = start
             self.stop = stop
         def __eq__(self, other):
             return (self.start == other.start) and (self.stop == other.stop)
 The second method will be called when you test loc1==loc2 and will return
 True if and only if both positions are equal.
 This custom type allows you to define other methods that may be relevant for
 your problem.
   I want to map the locations in filetype(2) with respect to filetype(1)...

 Here the problem is reversed: if the location is to be used as the link
 between tables, then it should be the key of both tables:
     {location:id}
 Fortunately, your location is a simple enough set of data to be stored as a
 (start,stop) tuple, so that you can actually use it as a dict key (a dict's
 key must be of a hashable, typically immutable, type).
 Now, the question is: do you have multiple occurrences of the same location?
 If yes, you will have to collect the data in, e.g., a list:
     {location:[d1,d2,...]}
 But maybe I don't properly understand what you have to do (see Q below).
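
A short sketch of the {location:[d1,d2,...]} layout described above, using `collections.defaultdict` so that each (start, stop) tuple collects its data items in a list. The records and the name `by_location` are hypothetical and only serve to show two items sharing one location.

```python
from collections import defaultdict

# Hypothetical records: (start, stop, data item); two share a location.
records = [(100, 200, "a"), (300, 400, "b"), (100, 200, "c")]

# Each new (start, stop) key starts out with an empty list.
by_location = defaultdict(list)
for start, stop, item in records:
    by_location[(start, stop)].append(item)

print(dict(by_location))
# {(100, 200): ['a', 'c'], (300, 400): ['b']}
```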
   ...and be able to average the signal if I align all filetype(1) objects.

 Where/what are the data fields in your pattern?

 Denis

   Thanks






[Tutor] import data (txt/csv) into list/array and manipulation

2008-11-10 Thread trias

Hi,

 I have started learning Python (any suggestions for online help content are
welcome) and want to write a couple of scripts to do simple numeric
calculations on array data.

filetype(1) I have reference files (e.g. file.csv) that contain three columns
and a variable number of rows. The first column is of type str and contains a
unique identifier name; the other two columns are of type int and contain two
reference values (start, stop: genomic location reference values).
  **maybe I should import this as dictionary list**

filetype(2) The other file contains signal data in three columns. Column one
is a unique identifier of type int, and the other two columns contain two
int values (genomic location reference values).
  ** import this as array/list

I want to map the locations in filetype(2) with respect to filetype(1) and be
able to average the signal if I align all filetype(1) objects.

Thanks


Re: [Tutor] import data (txt/csv) into list/array and manipulation

2008-11-10 Thread Tim Michelsen



filetype(2) The other file contains signal data in three columns. Column one
is a unique identifier of type int, and the other two columns contain two
int values (genomic location reference values).
  ** import this as array/list

I want to map the locations in filetype(2) with respect to filetype(1) and be
able to average the signal if I align all filetype(1) objects.

Thanks



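import numpy as np

# load every field as text; a csv file needs an explicit delimiter
data = np.loadtxt('file.csv', dtype=str, delimiter=',')
col1 = data[:, 0]              # identifier column, kept as text
col2 = data[:, 1].astype(int)  # convert a numeric column to int
...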



Re: [Tutor] import data (txt/csv) into list/array and manipulation

2008-11-10 Thread Kent Johnson
On Mon, Nov 10, 2008 at 12:12 PM, trias [EMAIL PROTECTED] wrote:

  I have started learning Python (any suggestions for online help content are
 welcome) and want to write a couple of scripts to do simple numeric
 calculations on array data.

Welcome! Have you seen
http://wiki.python.org/moin/BeginnersGuide/NonProgrammers

 filetype(1) I have reference files (e.g. file.csv) that contain three columns
 and a variable number of rows. The first column is of type str and contains a
 unique identifier name; the other two columns are of type int and contain two
 reference values (start, stop: genomic location reference values).
  **maybe I should import this as dictionary list**

I don't know what a dictionary list is, do you mean a list of
dictionaries? I think a list of lists is probably fine.

Python comes with a csv module that helps to read csv files. Then you
will have to convert the second two columns from string to int.
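
As a rough sketch of that suggestion: `csv.reader` yields every field as a string, so the two coordinate columns are converted with `int()` while building a list of lists. The sample contents are invented, and `io.StringIO` stands in for a real file opened with open('file.csv', newline='') so that the example runs on its own.

```python
import csv
import io

# Stand-in for open('file.csv', newline=''); the contents are invented.
sample = io.StringIO("ob1,1025,1500\nob2,1090,1300\n")

rows = []
for name, start, stop in csv.reader(sample):
    # csv yields strings, so convert the two coordinate columns.
    rows.append([name, int(start), int(stop)])

print(rows)  # [['ob1', 1025, 1500], ['ob2', 1090, 1300]]
```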

 filetype(2) The other file contains signal data in three columns. Column one
 is a unique identifier of type int, and the other two columns contain two
 int values (genomic location reference values).
  ** import this as array/list

 I want to map the locations in filetype(2) with respect to filetype(1) and be
 able to average the signal if I align all filetype(1) objects.

I don't know what you mean by this. I guess you want to search within
filetype(1) for intervals that contain the locations from filetype(2)?
This is pretty straightforward, but if you have long lists it may be
slow. This recent thread has some suggestions for speeding up
searching large data sets:
http://thread.gmane.org/gmane.comp.python.tutor/51162/focus=51181

It looks like you and Srinivas are trying to solve similar problems.
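
One common way to speed up this kind of search, not necessarily the one proposed in the linked thread, is to keep the signal positions sorted and use the standard `bisect` module to find the slice that falls inside an interval, instead of scanning the whole list each time. The data and the name `values_in_interval` below are hypothetical, and the interval is assumed to include both its start and stop.

```python
from bisect import bisect_left, bisect_right

# Hypothetical signal positions, kept sorted, with parallel values.
positions = [1000, 1100, 1200, 1400, 2000]
values = [0.5, 1.5, 2.0, 0.25, 3.0]


def values_in_interval(start, stop):
    """Return the values whose position lies in [start, stop]."""
    lo = bisect_left(positions, start)   # first index with position >= start
    hi = bisect_right(positions, stop)   # first index with position > stop
    return values[lo:hi]


print(values_in_interval(1090, 1300))  # [1.5, 2.0]
print(values_in_interval(1400, 1400))  # [0.25]
```

Each lookup costs two binary searches plus the slice, so it stays fast even when the list of positions is long.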

Kent