Thank you for your answers. I've looked through the forester sources and found some parsers (org/forester/io/parsers/nexus). At first sight they could be what you talked about, but I'm not sure.
Cheers,
Pola

2014-11-06 13:49 GMT+01:00 Ben Stöver <[email protected]>:

> Spencer Bliven wrote on 2014-11-06:
> > Ben,
> >
> > This sounds like a great idea and a really useful addition to biojava! I
> > would lean towards only parsing the consensus tree, as the other formats
> > are pretty specific use cases. We're sure forester doesn't provide Nexus
> > parsing, right? The documentation isn't particularly complete, but it's
> > already a phylo dependency so we should avoid duplicating any features.
>
> No, I'm personally not 100% sure whether any Nexus features are implemented
> in forester, but I thought they were not, because otherwise there would have
> been no Nexus parsing system in BioJava 1.x?
>
> > As to your second suggestion, it sounds very similar to how FastaReader
> > currently works, with the user providing a SequenceCreator which
> > instantiates whatever Sequence implementation you want to use. Mutable
> > sequences can lead to a host of additional problems, which is why the
> > sequences are currently generated atomically. Or am I misunderstanding
> > your suggestion?
>
> I just looked at the code
> (https://github.com/biojava/biojava/blob/master/biojava3-core/src/main/java/org/biojava3/core/sequence/io/FastaReader.java)
> and SequenceCreator does not do exactly what I meant, since in the process()
> method of FastaReader, the whole sequence is first loaded into a
> StringBuilder and afterwards passed to sequenceCreator, which means there is
> no compression during loading. So SequenceCreator does a part of what I was
> thinking of, but it would not work for very large sequences. (Although I
> can't find it now, I think I read a similar statement somewhere in the
> Javadocs of the compressed Sequence implementation.)
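To make the point about compression during loading concrete, here is a minimal sketch in plain Java. It is not BioJava code: ResidueSink, TwoBitSink and StreamingFasta are hypothetical names invented for this illustration, and it only handles the letters A, C, G and T. The idea is that each line is handed to the sink as soon as it is read, so the full sequence never sits in a StringBuilder.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Arrays;

/** Hypothetical callback that receives residues chunk by chunk while the file is read. */
interface ResidueSink {
    void append(CharSequence chunk);
}

/** Hypothetical sink that packs A, C, G, T into 2 bits each as chunks arrive. */
final class TwoBitSink implements ResidueSink {
    private static final String ALPHABET = "ACGT";
    private byte[] packed = new byte[16]; // 4 residues per byte
    private int length = 0;

    @Override
    public void append(CharSequence chunk) {
        for (int i = 0; i < chunk.length(); i++) {
            int code = ALPHABET.indexOf(Character.toUpperCase(chunk.charAt(i)));
            if (code < 0) {
                throw new IllegalArgumentException("Unsupported residue: " + chunk.charAt(i));
            }
            if (length / 4 == packed.length) {
                packed = Arrays.copyOf(packed, packed.length * 2); // grow, zero-filled
            }
            packed[length / 4] |= (byte) (code << ((length % 4) * 2));
            length++;
        }
    }

    int length() {
        return length;
    }

    char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return ALPHABET.charAt((packed[index / 4] >> ((index % 4) * 2)) & 3);
    }
}

final class StreamingFasta {
    /**
     * Streams the residues of the first FASTA record into the sink, one line
     * at a time, and returns its header (or null if no record was found).
     */
    static String readFirstRecord(Reader input, ResidueSink sink) throws IOException {
        BufferedReader reader = new BufferedReader(input);
        String header = null;
        for (String line = reader.readLine(); line != null; line = reader.readLine()) {
            line = line.trim();
            if (line.startsWith(">")) {
                if (header != null) {
                    break; // a second record starts here; stop reading
                }
                header = line.substring(1).trim();
            } else if (header != null && !line.isEmpty()) {
                sink.append(line); // only this one line is held as text
            }
        }
        return header;
    }
}
```

A caller would create a TwoBitSink, pass it to StreamingFasta.readFirstRecord(...) together with a Reader, and then query the sink via length() and charAt(). Whether something like this would fit next to the existing SequenceCreator is of course a separate question.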
>
> The main benefits I still see in the idea would be, first, the abstract
> strategy pattern for alignment parsers, which would allow writing code
> independent of the format used (which is not possible e.g. with the current
> FASTA reader), and second, editable sequences, which would of course be
> usable in use cases you cannot really solve with the current sequence model
> (e.g. using it as the data backend for an alignment editor or for the GUI
> components I have in LibrAlign).
>
> I'm not sure which problems you mean that would arise from having mutable
> sequences (remember: the idea was not to replace current implementations of
> the Sequence interface, but to add additional mutable versions). Maybe you
> could give some examples? (Are you thinking about the need for change
> listeners or similar things?)
>
> Anyway, it was only an idea for discussion; I'm really not saying that we
> definitely need to go in that direction. (For my own projects I already have
> a mutable sequence model with bridges to the current BioJava model, so I
> would be fine there.) Maybe there really are problems coming with this idea
> that I currently do not see? In that case we could of course also think
> about just adding an interface for sequence parsers that allows them to be
> used in an abstract strategy pattern. (That would then really be a slight
> API change if the existing readers and writers implemented such an
> interface, but it might be possible, since there is a version 4 coming
> anyway?)
>
> Best
> Ben
>
> > It would be fantastic to have some additional development of multiple
> > alignments and the phylo package! Thanks for the offer to contribute!
> >
> > -Spencer
> >
> > On Thu, Nov 6, 2014 at 12:19 PM, Jose Manuel Duarte <[email protected]>
> > wrote:
> >
> > > Hi Ben
> > >
> > > Thanks a lot for all the insights.
> > > I am really not the most appropriate person to comment on all the
> > > biojava phylogeny and sequence related things, but anyway, below are
> > > some of my opinions.
> > >
> > > On 05/11/14 17:22, Ben Stöver wrote:
> > >
> > >> The more interesting/urgent thing though might be parsing the consensus
> > >> tree, which is in Nexus format (or writing the input files for MrBayes).
> > >> Although the Nexus format is not really state of the art anymore and
> > >> replacements like e.g. NeXML (http://nexml.org/), which overcome its
> > >> limitations, should be preferred if you implement new software, the
> > >> Nexus format is still widely used and supporting it in BioJava 3 (or 4)
> > >> would surely be a good idea. There was an extensible Nexus parser in
> > >> BioJava 1.x
> > >> (http://www.biojava.org/docs/api1.9.1/org/biojavax/bio/phylo/io/nexus/package-summary.html)
> > >> which could be ported to BioJava 3 (4). (This has never been done until
> > >> now, has it?)
> > >
> > > If I understand it properly, they were not ported to 3 yet because of
> > > lack of time, so I think porting the Nexus stuff would be a great thing.
> > > +1 to that.
> > >
> > >> Therefore I would offer to implement such functionality for BioJava, but
> > >> before making a pull request or anything, I wanted to ask for the
> > >> opinion of the community on that idea, and also whether I might have
> > >> missed concepts in BioJava that would already allow doing something
> > >> similar.
> > >
> > > To me the whole idea sounds great, especially if it can be made
> > > compatible with the existing Biojava interfaces. If I understand what
> > > you propose, you would only introduce a new way of parsing things which
> > > could even live alongside the current parsers. It could even go to its
> > > own package (sequence.nio ?).
> > > For me this is a +1 too.
> > >
> > > Cheers
> > >
> > > Jose
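
The parser interface discussed in the thread above (an abstract strategy pattern for alignment parsers, so that calling code does not depend on the file format) could look roughly like the sketch below. This is only an illustration under my own assumptions: AlignmentParser, FastaAlignmentParser, PhylipAlignmentParser and AlignmentStats are hypothetical names, none of them exists in BioJava or forester, and the two format implementations are deliberately simplified (plain FASTA, and a one-line-per-sequence PHYLIP-like layout).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical strategy interface: one implementation per alignment format. */
interface AlignmentParser {
    /** Reads all sequences from the input, keyed by name in file order. */
    Map<String, String> parse(Reader input) throws IOException;
}

/** Simplified FASTA strategy: a ">" line starts a record, other lines are residues. */
final class FastaAlignmentParser implements AlignmentParser {
    @Override
    public Map<String, String> parse(Reader input) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        BufferedReader reader = new BufferedReader(input);
        String name = null;
        StringBuilder residues = new StringBuilder();
        for (String line = reader.readLine(); line != null; line = reader.readLine()) {
            line = line.trim();
            if (line.startsWith(">")) {
                if (name != null) {
                    result.put(name, residues.toString()); // finish previous record
                }
                name = line.substring(1).trim();
                residues.setLength(0);
            } else if (name != null) {
                residues.append(line);
            }
        }
        if (name != null) {
            result.put(name, residues.toString()); // finish last record
        }
        return result;
    }
}

/** Simplified PHYLIP-like strategy: a header line, then "name residues" per line. */
final class PhylipAlignmentParser implements AlignmentParser {
    @Override
    public Map<String, String> parse(Reader input) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        BufferedReader reader = new BufferedReader(input);
        reader.readLine(); // skip the "<count> <length>" header line
        for (String line = reader.readLine(); line != null; line = reader.readLine()) {
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length == 2) {
                result.put(parts[0], parts[1].replaceAll("\\s", ""));
            }
        }
        return result;
    }
}

/** Client code that is written once against the interface, not against a format. */
final class AlignmentStats {
    static int totalResidues(AlignmentParser parser, Reader input) throws IOException {
        int total = 0;
        for (String sequence : parser.parse(input).values()) {
            total += sequence.length();
        }
        return total;
    }
}
```

AlignmentStats.totalResidues(...) works unchanged whether it is given a FastaAlignmentParser or a PhylipAlignmentParser, which is the format independence the thread asks for. A Nexus implementation could be added the same way without touching the calling code.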
_______________________________________________
Biojava-l mailing list - [email protected]
http://mailman.open-bio.org/mailman/listinfo/biojava-l
