Hi,

I thought it might be useful for others to have a step-by-step account of how to import a whole experiment with the new core batch importers, so here goes:

My aim was to import the hybridization-related information for a whole experiment in a way that would provide MIAME compliance for this aspect of the data. I also wanted to see whether I could do this from a single spreadsheet (rather than a separate one for each importer), and it turns out one can, which is great.

The column headers of the spreadsheet I used looked like this in the end:

   RawBioAssay  FileName  ArrayName  ArrayBatch  ArraySlide  Platform
   RawDataType  Scan  Hybridization  LabeledExtract  Dye  Extract
   Sample  BioSource  Protocol[image_analysis]  Protocol[scanning]
   Protocol[hybridization]  Protocol[labeling]  Protocol[extraction]
   Protocol[treatment]  StrainOrLine  Time

In this case the last two columns contained annotation (experimental factors) specific to my experiment.

Using this spreadsheet (and suitable import configs), I ran each of the batch importers for BioSource, Sample, Extract, Labelled Extract, Hybridization, Scan and Raw Bioassay, in this order. I first had to create a project, protocols and an array design manually, but that's fine since these are created infrequently compared to the other entities.

The last thing I did was to run the annotation batch importer on my new raw bioassays. This works but is not ideal, because the annotations are not inherited from the appropriate entities upstream (this will be fixed in BASE 2.9 though, see this thread: http://www.mail-archive.com/basedb-users@lists.sourceforge.net/msg01596.html).

All in all, importing the hybs with the batch importers only takes about 10 minutes -- that's getting very acceptable. Nice work, guys!!

There is still some manual repetition involved, but one could get round this by writing a fairly simple plugin that just calls all the other batch importers in turn. I'll add that to my TODO list, but it might take me some time to get round to it as I am snowed under with lots of other stuff.

Below is a more detailed point-by-point walk-through of what I did. Hope this is of use.

cheers
Micha

1. Create all required protocols manually, or check that suitable protocols already exist.
2. Create a new array design manually, or check that a suitable design already exists.
3. Create a new project with default settings for platform, raw data type, array design and protocols. These will be associated with all entities created from here on.
4. Set the new project active. This makes it the current project.
5. Format your hybridization data as per the example above and save it as tab-delimited text. If you are planning on matching by name, make sure that the names of existing entities you refer to in the spreadsheet match those in the database. Upload this file to BASE.
6. Upload the raw data files to BASE and unzip them in a suitable directory (if you want to have the files associated). N.B. This example does not include storing raw data in the database.
7. Create a suitable import configuration for each of the batch importers. This only needs to be done once if the same spreadsheet format is used for later imports.
8. Batch-import all required entities by selecting the list view of each of them in turn and running its batch import plugin with the spreadsheet as input. The import configs should be detected automatically. It's best to stick to the following order:
   a. BioSource
   b. Sample
   c. Extract
   d. Labelled Extract
   e. Hybridization
   f. Scan
   g. Raw Bioassay
9. Select all newly created bioassays and click "New Experiment...". This will associate all selected bioassays with the new experiment.

ANNOTATION

This is a (fairly dirty) temporary workaround which does not use inheritance. From BASE 2.9 on it should be possible to use inheritance with the mass annotation importer.

10. Check that suitable annotation types (= experimental factors) exist (Administrate -> Types -> Annotation Types), or create new ones with names that match the entries in the spreadsheet.
11. Batch-annotate all raw bioassays in the experiment. In the list view of the Raw Bioassays, select "Import..." and then select the Annotation Importer from the list of available plugins. A suitable import config should be detected automatically. This will annotate each RawBioassay with the appropriate factor value combination.

==================================
Dr Micha M Bayer
Bioinformatics Specialist
Genetics Programme
The Scottish Crop Research Institute
Invergowrie
Dundee DD2 5DA
Scotland, UK
Telephone +44(0)1382 562731 ext. 2309
Fax +44(0)1382 562426
http://www.scri.ac.uk/staff/michabayer
==================================
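
A note on step 5: before uploading the spreadsheet it can help to check the header row and to count the distinct names in each entity column, since those counts give a rough idea of how many items each importer in step 8 should create or match. The following is only a minimal Python sketch of such a check. It is not part of BASE and does not talk to BASE at all. The function name check_sheet and the demo data are made up for illustration, and the column list is simply copied from the header row described in the mail, leaving out the two experiment-specific annotation columns.

```python
import csv
import io

# Column headers copied from the mail. The experiment-specific annotation
# columns (StrainOrLine, Time) are left out on purpose.
EXPECTED_COLUMNS = [
    "RawBioAssay", "FileName", "ArrayName", "ArrayBatch", "ArraySlide",
    "Platform", "RawDataType", "Scan", "Hybridization", "LabeledExtract",
    "Dye", "Extract", "Sample", "BioSource",
    "Protocol[image_analysis]", "Protocol[scanning]",
    "Protocol[hybridization]", "Protocol[labeling]",
    "Protocol[extraction]", "Protocol[treatment]",
]

# Entity columns in the import order recommended in step 8.
IMPORT_ORDER = [
    "BioSource", "Sample", "Extract", "LabeledExtract",
    "Hybridization", "Scan", "RawBioAssay",
]


def check_sheet(handle):
    """Return (missing_columns, distinct_name_counts) for a tab-delimited sheet.

    This is a hypothetical helper, not a BASE API.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]

    # Collect the distinct, non-empty names for each entity column present.
    names = {col: set() for col in IMPORT_ORDER if col in header}
    for row in reader:
        for col, seen in names.items():
            value = (row.get(col) or "").strip()
            if value:
                seen.add(value)

    counts = {col: len(seen) for col, seen in names.items()}
    return missing, counts


if __name__ == "__main__":
    # Tiny made-up sheet with only three columns, just to show the idea.
    demo = "BioSource\tSample\tExtract\nplant1\ts1\te1\nplant1\ts2\te2\n"
    missing, counts = check_sheet(io.StringIO(demo))
    print("missing columns:", missing)
    for col, n in counts.items():
        print(col, n)
```

To run this on a real file, open the tab-delimited file with open(path, newline="") and pass the file object to check_sheet. Any column reported as missing would need fixing in the spreadsheet (or in the import configs) before running the importers.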