Dear Peter,

>The file that I'm reading contains the upstream regions of the yeast
>genome, with each upstream region labeled using a FASTA header, i.e.:
>
> FASTA header for gene 1
> upstream region.....
> .....
> ....
> FASTA header for gene 2
> upstream....
> ....

You may want to have a look at the read.fasta() function in the seqinr package. There is an example on page 16 of this document:

http://pbil.univ-lyon1.fr/software/SeqinR/seqinr_1_0-6.pdf

It imports the content of a FASTA file with 21,161 sequences from Arabidopsis thaliana into an object that takes about 15 Mb of RAM.

HTH,

-- 
Jean R. Lobry ([EMAIL PROTECTED])
Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - LYON I,
43 Bd 11/11/1918, F-69622 VILLEURBANNE CEDEX, FRANCE
tel : +33 472 43 27 56 fax : +33 472 43 13 88
http://pbil.univ-lyon1.fr/members/lobry/
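
P.S. In case it helps, here is a minimal sketch of how read.fasta() might be used on a file laid out like yours. The file content, the gene names (geneA, geneB) and the variable names are made up for illustration; I am assuming seqinr is installed and that your headers start with ">" followed by an identifier.

```r
library(seqinr)

# Hypothetical two-record FASTA file standing in for the real
# upstream-region file (names and sequences are invented).
fasta_file <- tempfile(fileext = ".fasta")
writeLines(c(
  ">geneA upstream region of gene A",
  "ACGTACGT",
  "TTGA",
  ">geneB upstream region of gene B",
  "GGCCTTAA"
), fasta_file)

# Each FASTA record becomes one element of the returned list.
# as.string = TRUE keeps every sequence as a single string
# rather than a vector of single characters.
seqs <- read.fasta(file = fasta_file, seqtype = "DNA", as.string = TRUE)

length(seqs)          # number of records read
names(seqs)           # identifiers taken from the first word of each header
getAnnot(seqs[[1]])   # full header line of the first record
seqs[["geneA"]]       # upstream sequence for one gene, looked up by name
```

With the real file you would only need the read.fasta() call, pointing file= at your own path.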