Hi,

I thought it might be useful for others to have a step-by-step account
of how to import a whole experiment with the new core batch importers,
so here it goes:

My aim was to import the hybridization-related information for a whole
experiment in a way that would provide MIAME-compliance for this aspect
of the data. I also wanted to see whether I could do this from a single
spreadsheet (rather than a separate one for each importer), and it turns
out one can, which is great.

The column headers of the spreadsheet I used looked like this in the
end:

RawBioAssay  FileName  ArrayName  ArrayBatch  ArraySlide  Platform
RawDataType  Scan  Hybridization  LabeledExtract  Dye  Extract
Sample  BioSource
Protocol[image_analysis]  Protocol[scanning]  Protocol[hybridization]
Protocol[labeling]  Protocol[extraction]  Protocol[treatment]
StrainOrLine  Time

In this case the last two columns contained annotation (experimental
factors) specific to my experiment. 
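
Since every importer reads the same file, it may be worth checking the header row before uploading. The following is only a rough sketch of such a check in Python; it is not part of BASE, and the function name missing_columns is made up for illustration. It assumes the file is tab-delimited with the column names in the first row.

```python
import csv

# Column names copied from the header shown above.
EXPECTED_COLUMNS = [
    "RawBioAssay", "FileName", "ArrayName", "ArrayBatch", "ArraySlide",
    "Platform", "RawDataType", "Scan", "Hybridization", "LabeledExtract",
    "Dye", "Extract", "Sample", "BioSource",
    "Protocol[image_analysis]", "Protocol[scanning]",
    "Protocol[hybridization]", "Protocol[labeling]",
    "Protocol[extraction]", "Protocol[treatment]",
    "StrainOrLine", "Time",
]


def missing_columns(handle, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the first row.

    `handle` is any open text file (or file-like object) containing
    tab-delimited data. An empty file reports every column as missing.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = [name.strip() for name in next(reader, [])]
    return [name for name in expected if name not in header]
```

For a file on disk one would call it as missing_columns(open("hybs.txt", newline="")), where hybs.txt is just a placeholder file name. An empty list means all columns were found.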

Using this spreadsheet (and suitable import configs), I ran each of the
batch importers for BioSource, Sample, Extract, Labelled Extract,
Hybridization, Scan and Raw Bioassay, in this order. I had to first
manually create a project, protocols and an array design, but that's
fine since it is infrequent stuff, compared to the other entities. 

The last thing I did was to run the annotation batch importer on my new
raw bioassays, which works but is not ideal because of the lack of
inheritance from the appropriate entities upstream (this will be fixed
in BASE 2.9 though, see this thread:
http://www.mail-archive.com/basedb-users@lists.sourceforge.net/msg01596.html).

All in all the import of the hybs using the batch importers only takes
about 10 minutes -- that's getting very acceptable. Nice work, guys!! 

There is still some manual repetition involved but one could get round
this by writing a fairly simple plugin that just calls all the other
batch importers in turn. I'll add that to my TODO list but it might take
me some time to get round to this as I am snowed under with lots of
other stuff. 
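
To illustrate the idea, here is a very rough sketch of the control flow such a wrapper might have. It is written in Python purely for illustration; a real BASE plugin would be written in Java against the BASE plugin API, which is not shown here. The run_importer argument is a hypothetical callable standing in for "run the batch importer for this item type", and nothing in the sketch talks to BASE itself.

```python
# Import order as used in this post; the strings are just labels.
IMPORT_ORDER = [
    "BioSource", "Sample", "Extract", "Labelled Extract",
    "Hybridization", "Scan", "Raw Bioassay",
]


def run_all(run_importer, spreadsheet, order=IMPORT_ORDER):
    """Run one importer per item type, in order, on the same spreadsheet.

    `run_importer(item_type, spreadsheet)` is a hypothetical callable that
    returns True on success. The loop stops at the first failure, on the
    assumption that later items refer to earlier ones by name.
    Returns the list of item types that were imported successfully.
    """
    completed = []
    for item_type in order:
        if not run_importer(item_type, spreadsheet):
            break
        completed.append(item_type)
    return completed
```

Stopping at the first failure is a design choice of the sketch, not something BASE enforces: it just avoids creating items whose parents are missing.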

Attached below is a more detailed point-by-point walk-through of what I
did. Hope this is of use. 

cheers 
Micha


1.      Create all required protocols manually, or check that suitable
protocols already exist.
2.      Create a new array design manually, or check that a suitable
design already exists.
3.      Create a new project with default settings for platform, raw
data type, array design and protocols. These will be associated with
all entities created from here on.
4.      Set the new project active - this will make it the current
project.
5.      Format your hybridization data as per the example above and save
it as tab-delimited text. Make sure that the names of existing entities
you refer to in the spreadsheet match those in the database, if you are
planning on matching by name. Upload this file to BASE.
6.      Upload the raw data files to BASE and unzip them in a suitable
directory (if you want to have the files associated). (N.B. This example
does not include storing raw data in the database.)
7.      Create a suitable import configuration for each of the batch
importers - this only needs to be done once if the same spreadsheet
format is used for later imports.
8.      Batch-import all required entities by selecting the list view of
each of them in turn and running their respective batch import plugin
with the spreadsheet as input - import configs should be detected
automatically. It's best to stick to the following order:

a.      BioSource
b.      Sample
c.      Extract
d.      Labelled Extract
e.      Hybridization
f.      Scan
g.      Raw Bioassay

9.      Select all newly created bioassays and click "New
Experiment...". This will associate all selected bioassays with the new
experiment.
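
Step 5 depends on the names in the spreadsheet matching those already in the database (protocols, array design and so on). One way to catch mismatches before running the importers is to compare a column against a list of known names. The sketch below assumes you have obtained that list yourself, e.g. by copying it from the relevant list view in BASE. The helper names load_rows and unmatched_names are made up, and the comparison is exact and case-sensitive, which may or may not be how BASE matches names.

```python
import csv


def load_rows(path):
    """Read a tab-delimited file into a list of dicts keyed by column name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def unmatched_names(rows, column, known_names):
    """Return the sorted distinct values in `column` not found in `known_names`.

    Leading/trailing whitespace is ignored on both sides; empty cells are
    skipped. The comparison is otherwise exact (an assumption, see above).
    """
    known = {name.strip() for name in known_names}
    found = {(row.get(column) or "").strip() for row in rows}
    found.discard("")
    return sorted(found - known)
```

For example, unmatched_names(load_rows("hybs.txt"), "Protocol[scanning]", my_protocol_names) would list any scanning protocol names in the sheet that are not in my_protocol_names (both the file name and the variable are placeholders).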

ANNOTATION
This is a (fairly dirty) temporary workaround which does not use
inheritance. From BASE 2.9 on it should be possible to use inheritance
with the mass annotation importer. 

10.     Check that suitable annotation types (= experimental factors)
exist (Administrate -> Types -> Annotation Types) or create new ones
with names that match the entries in the spreadsheet. 
11.     Batch-annotate all raw bioassays in the experiment. In the list
view of the Raw Bioassays, select "Import..." and then select the
Annotation Importer from the list of plugins available. A suitable
import config should be detected automatically. This will annotate each
raw bioassay with the appropriate factor value combination.
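
To see in advance which factor value combination each raw bioassay will end up with, something like the following sketch could be run on the spreadsheet rows. It uses the column names from my spreadsheet (RawBioAssay, StrainOrLine, Time); the function name factor_values is made up, and the rows are assumed to be dicts keyed by column name, e.g. as produced by Python's csv.DictReader with delimiter="\t".

```python
def factor_values(rows, factors=("StrainOrLine", "Time"), key="RawBioAssay"):
    """Map each raw bioassay name to its {factor: value} combination.

    Rows without a raw bioassay name are skipped. If the same raw bioassay
    appears with two different combinations, a ValueError is raised, since
    that most likely points to a mistake in the spreadsheet.
    """
    result = {}
    for row in rows:
        name = (row.get(key) or "").strip()
        if not name:
            continue
        values = {f: (row.get(f) or "").strip() for f in factors}
        if name in result and result[name] != values:
            raise ValueError(f"conflicting factor values for {name!r}")
        result[name] = values
    return result
```

The distinct values per factor in the result can then be compared by eye against the annotation types set up in step 10.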


==================================
Dr Micha M Bayer
Bioinformatics Specialist
Genetics Programme
The Scottish Crop Research Institute
Invergowrie
Dundee
DD2 5DA
Scotland, UK
Telephone +44(0)1382 562731 ext. 2309
Fax +44(0)1382 562426
http://www.scri.ac.uk/staff/michabayer
==================================
 

