Hi Fico,

- You are right, this was a bug (an index was off). I committed a patch for it to SVN.

- I also added new behaviour for downloading chem comp files: the default chem comp provider now fetches the components.cif.gz file and extracts all definitions into small files, which are used from then on.

- I am not sure about your last question. I believe that is already possible: you can use the getChemComp method to get the exact definition for a group.

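For readers following the thread, the extraction step mentioned above (one large components file split into small per-component definitions) can be illustrated roughly as follows. This is only a sketch and not BioJava's actual implementation: the class name ChemCompSplitter and its split method are hypothetical, and the only format fact it relies on is that each definition in a CIF dictionary starts with a "data_<ID>" line.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of BioJava. */
final class ChemCompSplitter {

    /**
     * Groups the lines of a concatenated CIF dictionary into one text block
     * per component. A new block starts at every line beginning with "data_";
     * the text after that prefix is used as the component ID.
     */
    static Map<String, String> split(Iterable<String> lines) {
        Map<String, String> blocks = new LinkedHashMap<>();
        String currentId = null;
        StringBuilder current = new StringBuilder();
        for (String line : lines) {
            if (line.startsWith("data_")) {
                // Close the previous block before starting the next one.
                if (currentId != null) {
                    blocks.put(currentId, current.toString());
                }
                currentId = line.substring("data_".length()).trim();
                current = new StringBuilder();
            }
            // Lines before the first "data_" header are ignored.
            if (currentId != null) {
                current.append(line).append('\n');
            }
        }
        if (currentId != null) {
            blocks.put(currentId, current.toString());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Tiny made-up input; real definitions contain many more lines.
        List<String> sample = Arrays.asList(
                "data_35G", "_chem_comp.id 35G", "#",
                "data_GDP", "_chem_comp.id GDP");
        for (String id : split(sample).keySet()) {
            System.out.println(id);
        }
    }
}
```

For the real file, the lines could be read through java.util.zip.GZIPInputStream wrapped in an InputStreamReader and a BufferedReader, and each block written to its own small file keyed by the component ID. How the default provider names and stores those files is not stated in this thread.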
Andreas

On Sun, Dec 19, 2010 at 9:31 PM, Fico <[email protected]> wrote:
> The ChemComp download question is now resolved, but I found a new problem
> when testing BioJava3-Beta4. My program fragment:
>
> FileParsingParameters params = new FileParsingParameters();
> params.setLoadChemCompInfo(false);
> params.setHeaderOnly(false);
> // params.setParseCAOnly(true);
> params.setAlignSeqRes(true);
> params.setParseSecStruc(false);
>
> // loop file
> for (String file : getPdbFiles()) {
>
>     PDBFileReader pdbreader = new PDBFileReader();
>     pdbreader.setAutoFetch(false);
>     pdbreader.setPath(getPdbDir());
>
>     pdbreader.setFileParsingParameters(params);
>
>     // pdbreader.setLoadChemCompInfo(true);
>     Structure struc = null;
>     try {
>         struc = pdbreader.getStructure(getPdbDir() + "\\" + file);
>     } catch (IOException e) {
>         e.printStackTrace();
>     }
>
>     String pdbid = struc.getPDBCode();
>
>     for (int i = 0; i < struc.nrModels(); i++) {
>
>         // loop chain
>         for (Chain ch : struc.getModel(i)) {
>             System.out.println(pdbid + ">>>" + ch.getChainID() + ">>>"
>                     + ch.getAtomSequence());
>             System.out.println(pdbid + ">>>" + ch.getChainID() + ">>>"
>                     + ch.getSeqResSequence());
>             // Test the getAtomGroups() and getSeqResGroups() method
>             // List<Group> group = ch.getAtomGroups();
>             List<Group> group = ch.getSeqResGroups();
>             for (Group gp : group) {
>                 System.out.println(gp.getResidueNumber() + ":"
>                         + gp.getPDBName());
>             }
>         }
>     }
> }
>
> My test PDB file is 1O1G.pdb; there are 45 modified residues in chain A.
> When I use .getAtomGroups() I get the atom information for all residues,
> such as ResidueNumber and PDBName:
> 797:PHE
> 798:LEU
> 799:MET
> 800:ARG
> 801:VAL
> 802:GLU
> ......
> 840:PRO
> 841:LEU
> 842:LEU
> 843:LYS
>
> But with .getSeqResGroups(), the last 45 residues are missing some
> information, such as the ResidueNumber and atom coordinates. The output of
> the program is:
> 797:PHE
> 798:LEU
> null:MET
> null:ARG
> null:VAL
> null:GLU
> ......
> null:PRO
> null:LEU
> null:LEU
> null:LYS
>
> In biojava3-Beta1 the two methods produce the same result, just as
> .getAtomGroups() does in Beta4. So is it a bug?
>
> P.S.
> Could we add a new method to get the full amino acid sequence, including
> modified residues, directly? At the moment neither getAtomSequence() nor
> getSeqResSequence() can do this. If I want the amino acid sequence with
> modified residues, I have to call .getAtomGroups() or .getSeqResGroups()
> first and then loop over each residue to build the one-letter amino acid
> sequence.
>
> 2010/12/17 Andreas Prlic <[email protected]>
>>
>> OK, that behavior is fixed in SVN now. You can now have setAlignSeqRes
>> set to true and it will not download chemical components if
>> loadChemComp is false. The drawback is that the data representation
>> will not be as precise.
>>
>> Andreas
>>
>> On Thu, Dec 16, 2010 at 8:26 AM, Steve Darnell <[email protected]>
>> wrote:
>> > The SeqRes to Atom record alignment forces the use of chemical
>> > components to translate non-standard residues to their closest standard
>> > counterpart for the sequence alignment. I have to disable
>> > setLoadChemCompInfo and setAlignSeqRes when I don't want to download
>> > chemical component files from RCSB when parsing a PDB file.
>> >
>> > Regards,
>> > Steve
>> >
>> > -----Original Message-----
>> > From: [email protected]
>> > [mailto:[email protected]] On Behalf Of Fico
>> > Sent: Wednesday, December 15, 2010 8:46 PM
>> > To: [email protected]
>> > Subject: [Biojava-l] how to cancel download chemcomp when parser a PDB
>> > file
>> >
>> > Hi, dear all:
>> >
>> > I have recently been using biojava3 beta1 to parse PDB files. My program is:
>> >
>> > PDBFileReader pdbreader = new PDBFileReader();
>> > pdbreader.setAutoFetch(false);
>> > pdbreader.setPath(pdbDirPath);
>> >
>> > FileParsingParameters params = new FileParsingParameters();
>> > params.setLoadChemCompInfo(false);
>> > params.setHeaderOnly(false);
>> > params.setAlignSeqRes(true);
>> > params.setParseSecStruc(false);
>> > pdbreader.setFileParsingParameters(params);
>> >
>> > Structure structure = null;
>> > try {
>> >     structure = pdbreader.getStructure(pdbDirPath + "\\" + file);
>> > } catch (IOException e) {
>> >     e.printStackTrace();
>> > }
>> >
>> > When I execute this program, it downloads files, for example:
>> >
>> > creating directory D:\MyWorkspace\TestFiles\pdbFiles\chemcomp
>> > downloading http://www.rcsb.org/pdb/files/ligand/35G.cif
>> > downloading http://www.rcsb.org/pdb/files/ligand/GDP.cif
>> >
>> > But I do not want to download those files. How can I cancel it?
>> > Thanks.
>> > _______________________________________________
>> > Biojava-l mailing list - [email protected]
>> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>> >
>>
>> --
>> -----------------------------------------------------------------------
>> Dr. Andreas Prlic
>> Senior Scientist, RCSB PDB Protein Data Bank
>> University of California, San Diego
>> (+1) 858.246.0526
>> -----------------------------------------------------------------------
>
>

--
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------
_______________________________________________
Biojava-l mailing list - [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
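
The workaround described in the P.S. of the quoted message (loop over the groups and translate each residue name to a one-letter code) might look roughly like the sketch below. It is deliberately independent of BioJava so that it runs on its own: the class ModifiedResidueSequence, its lookup table, and the choice of 'X' for unknown names are all assumptions made for illustration. In real code the residue names would come from the groups returned by getAtomGroups() or getSeqResGroups(), and the parent residue of a modified group would be better taken from its chemical component definition than from a hand-written table.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of BioJava. */
final class ModifiedResidueSequence {

    private static final Map<String, Character> ONE_LETTER = new HashMap<>();

    static {
        // The 20 standard amino acids.
        String[] standard = {
            "ALA:A", "ARG:R", "ASN:N", "ASP:D", "CYS:C", "GLN:Q", "GLU:E",
            "GLY:G", "HIS:H", "ILE:I", "LEU:L", "LYS:K", "MET:M", "PHE:F",
            "PRO:P", "SER:S", "THR:T", "TRP:W", "TYR:Y", "VAL:V"
        };
        for (String entry : standard) {
            ONE_LETTER.put(entry.substring(0, 3), entry.charAt(4));
        }
        // A few common modified residues, mapped to their parent residue.
        // This short list is an assumption for the example, not a full table.
        ONE_LETTER.put("MSE", 'M'); // selenomethionine
        ONE_LETTER.put("SEP", 'S'); // phosphoserine
        ONE_LETTER.put("TPO", 'T'); // phosphothreonine
        ONE_LETTER.put("PTR", 'Y'); // phosphotyrosine
    }

    /** Translates three-letter residue names to a one-letter sequence. */
    static String toSequence(List<String> residueNames) {
        StringBuilder seq = new StringBuilder();
        for (String name : residueNames) {
            Character code = ONE_LETTER.get(name.trim().toUpperCase());
            seq.append(code != null ? code : 'X'); // 'X' marks unknown names
        }
        return seq.toString();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("PHE", "LEU", "MSE", "ARG", "VAL", "GLU");
        System.out.println(toSequence(names));
    }
}
```

Keeping the modified residues in the table, rather than dropping them, is what lets the resulting sequence stay the same length as the list of groups.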
