Based on the stack trace you posted earlier, this looks like a data problem.
Seeing as you've already got the file downloaded, could you pull out all the
db_xref lines using grep or something similar, then construct a regex to see
whether any of them fail to match the db_xref="something:accession" pattern? I
suspect you'll find at least one, in which case it is indeed a data problem and
should be reported to the Ensembl helpdesk so they can correct their dump files.

cheers,
Richard

2008/11/15 pprun <[EMAIL PROTECTED]>:
> It is too large to send by email. The exact file name is:
> Homo_sapiens.0.dat
> The exception occurred when the first chromosome X sequence was encountered.
>
> Hope this helps a bit.
> - Pprun
>
>
> Richard Holland wrote:
>>
>> There are many files on that site. I need to know which specific one
>> you are working with so that I can also attempt to parse it with some
>> debugging options turned on.
>>
>> Could you attach the file you are using to an email if possible?
>>
>> cheers,
>> Richard
>>
>>
>> 2008/11/15 pprun <[EMAIL PROTECTED]>:
>>
>>>
>>> Hi Richard,
>>> By "the original file", do you mean the Ensembl GenBank file?
>>> If so, you can get it from the Ensembl website:
>>> ftp://ftp.ensembl.org/pub/current_genbank/homo_sapiens/
>>>
>>>
>>>
>>> Richard Holland wrote:
>>>
>>> This exception occurs when the Genbank file contains a db_xref entry
>>> that does not follow the format "Type:Accession".
>>>
>>> It's hard to tell if this is the problem here without seeing the original
>>> file.
>>>
>>> cheers,
>>> Richard
>>>
>>> 2008/11/15 pprun <[EMAIL PROTECTED]>:
>>>
>>>
>>> Environments:
>>> -------------
>>> Biojava: 1.6
>>> Java: 1.6.0_10; Java HotSpot(TM) Client VM 11.0-b15
>>> System: Linux version 2.6.24-21-generic running on i386; UTF-8; zh_CN
>>>
>>>
>>> The detail:
>>> --------------
>>> Format_object=org.biojavax.bio.seq.io.GenbankFormat
>>> Accession=chromosome:NCBI36:X:101815144:102815143:1
>>> Id=null
>>> Comments=Bad dbxref
>>> Parse_block=FEATURES Location/Qualifierssource 1..1000000/organism
>>> "Homo sapiens"/db_xref "taxon:9606"gene complement(5148..5254)/gene
>>> ENSG00000193147/locus_tag "AL035427.17"misc_RNA
>>> complement(5148..5254)/gene "ENSG00000193147"/db_xref
>>> "Clone_based_ensembl_transcri:AL035427.17-201"/db_xref
> ...

-- 
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: [EMAIL PROTECTED]
http://www.eaglegenomics.com/
_______________________________________________
Biojava-l mailing list - Biojava-l@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-l
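
For reference, the db_xref check suggested at the top of this message could be
sketched in Python roughly as follows. This is only an illustration under some
assumptions: the function name find_bad_dbxrefs and both regular expressions
are made up for this example and are not the patterns BioJava uses internally,
and the script assumes each /db_xref qualifier sits on a single line (a value
wrapped across lines would be reported as a false positive).

```python
import re
import sys

# Assumed shape of a well-formed qualifier: /db_xref="Type:Accession".
# These patterns are illustrative only, not the ones BioJava applies.
QUALIFIER = re.compile(r'/db_xref=(.*)$')
WELL_FORMED = re.compile(r'^"[^":\s]+:[^"\s]+"$')


def find_bad_dbxrefs(lines):
    """Return (line_number, text) pairs for db_xref qualifiers whose value
    does not look like "Type:Accession"."""
    bad = []
    for number, line in enumerate(lines, start=1):
        match = QUALIFIER.search(line)
        if match is None:
            continue  # not a db_xref line
        value = match.group(1).strip()
        if not WELL_FORMED.match(value):
            bad.append((number, line.strip()))
    return bad


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python check_dbxref.py Homo_sapiens.0.dat
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for number, text in find_bad_dbxrefs(handle):
            print(f"{number}: {text}")
```

Any line it prints would be a candidate to include in a report to the Ensembl
helpdesk; an empty result would suggest the problem lies elsewhere.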