There's also a good deal of alignment quality checking, thresholding, and 
scoring of the overlapping region that is necessary but maybe not all that 
straightforward.  I suggest that you check out the PANDAseq paper, which 
describes their algorithm.

http://dx.doi.org/10.1186/1471-2105-13-31

Andreas is correct - the basic building blocks are already there.
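
For illustration, here is a minimal plain-Java sketch of those building 
blocks: reverse-complement one read, score candidate overlaps, and merge 
when a threshold is met. It makes several assumptions. It deliberately 
does not call the BioJava API, it swaps the local alignment for a simple 
ungapped suffix/prefix comparison (so it will not cope with indels in the 
overlap), and it is not the PANDAseq algorithm. The class name, method 
names, and the minOverlap/minIdentity parameters are all hypothetical.

```java
import java.util.Optional;

/** Plain-Java sketch: merge a forward read with a reverse-strand read. */
final class OverlapMerge {

    /** Reverse complement of a DNA string containing only A, C, G, T, N. */
    static String reverseComplement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            char c = Character.toUpperCase(dna.charAt(i));
            switch (c) {
                case 'A': sb.append('T'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                case 'T': sb.append('A'); break;
                case 'N': sb.append('N'); break;
                default: throw new IllegalArgumentException("Unexpected base: " + c);
            }
        }
        return sb.toString();
    }

    /** Fraction of identical bases between the last len bases of left and the first len bases of right. */
    static double overlapIdentity(String left, String right, int len) {
        int matches = 0;
        int offset = left.length() - len;
        for (int i = 0; i < len; i++) {
            if (left.charAt(offset + i) == right.charAt(i)) {
                matches++;
            }
        }
        return (double) matches / len;
    }

    /**
     * Reverse-complements the reverse read, then tries ungapped overlaps from
     * longest to shortest and accepts the first one that is at least
     * minOverlap long with identity >= minIdentity. Overlap bases are taken
     * from the forward read; a real tool would use quality scores instead.
     */
    static Optional<String> merge(String forward, String reverse,
                                  int minOverlap, double minIdentity) {
        if (minOverlap < 1) {
            throw new IllegalArgumentException("minOverlap must be >= 1");
        }
        String left = forward.toUpperCase();
        String right = reverseComplement(reverse);
        int maxLen = Math.min(left.length(), right.length());
        for (int len = maxLen; len >= minOverlap; len--) {
            if (overlapIdentity(left, right, len) >= minIdentity) {
                return Optional.of(left + right.substring(len));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up example reads; the reverse read is given on the opposite strand.
        String forward = "ATGGCCATTGTAATGGGCCGC";
        String reverse = reverseComplement("GGGCCGCTGAAAGGGTGCCCGATAG");
        System.out.println(merge(forward, reverse, 5, 1.0).orElse("no acceptable overlap"));
    }
}
```

Trying the longest overlap first matches the case described in this 
thread, where a large overlap is expected. For real reads you would 
probably lower minIdentity below 1.0 and, if indels are likely, replace 
the ungapped comparison with a proper local alignment.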

Chris

On Apr 24, 2013, at 10:48 AM, Andreas Prlic <andr...@sdsc.edu> wrote:

> It sounds like all you need is to get the reverse complement of one of 
> your sequences and then do a local alignment. Both should be possible 
> with BioJava...
> 
> Andreas
> 
> 
> On Wed, Apr 24, 2013 at 7:29 AM, Khalil El Mazouari 
> <khalil.elmazou...@gmail.com> wrote:
> Hi Chris,
> 
> my application is deployed as a WAR file. I am trying to avoid, as much as 
> possible, shelling out to other non-Java programs... for maintainability 
> reasons.
> 
> I don't think I need a 'full' genome assembly tool (eg Velvet ...); it's 
> overkill for my case: the cloned gene is sequenced in both directions. Normally 
> one strand is sufficient. If the sequence quality is not good enough, the 2 
> strands are used to get the full-length gene. There is always a large overlap 
> between the 2 strand sequences.
> I can QC the full-length gene.
> 
> Best
> 
> khalil
> 
> 
> 
> 
> 
> 
> 
> -----
> 
> Confidentiality Notice: This e-mail and any files transmitted with it are 
> private and confidential and are solely for the use of the addressee. It may 
> contain material which is legally privileged. If you are not the addressee or 
> the person responsible for delivering to the addressee, please notify that 
> you have received this e-mail in error and that any use of it is strictly 
> prohibited. It would be helpful if you could notify the author by replying to 
> it.
> 
> 
> 
> On 24 Apr 2013, at 16:04, Chris Friedline wrote:
> 
> > Khalil,
> >
> > Why not just shell out to programs designed for this purpose and pull in 
> > the results?  We are in the process of publishing a paper which uses 
> > PANDAseq to assemble overlapping PE reads.  The latest version of mothur 
> > also does this.
> >
> > www.mothur.org
> > https://github.com/neufeld/pandaseq/wiki/PANDAseq-Assembler
> >
> > PANDAseq is particularly nice in this case, because you could read right 
> > from stderr and stdout streams.  It's also wicked fast.
> >
> > Chris
> >
> > On Apr 24, 2013, at 4:08 AM, Khalil El Mazouari 
> > <khalil.elmazou...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> It's neither a global sequence alignment nor a genome assembly. It's just a DNA 
> >> fragment sequenced from both ends with an overlapping region. I want to 
> >> assemble the 2 reads in order to get the full-length sequence. This 
> >> assembly is part of a complex analysis process that uses BioJava.
> >> I agree, there are a lot of simple options for achieving this. But I need 
> >> something in Java/BioJava.
> >>
> >> Best
> >>
> >> khalil
> >>
> >>
> >>
> >>
> >> On 23 Apr 2013, at 23:38, Spencer Bliven wrote:
> >>
> >>> If you just have two contiguous sequences to align, you should just use a 
> >>> global sequence alignment. See 
> >>> http://biojava.org/wiki/BioJava:CookBook3:PSA for how to do this in 
> >>> BioJava, or it might be easier to just use one of the online services for 
> >>> this such as http://www.ebi.ac.uk/Tools/psa/.
> >>>
> >>> On the other hand, if you actually want to do genome assembly (ie from 
> >>> many overlapping reads), then there are much more computationally 
> >>> efficient methods. BioJava isn't really intended for large-scale genome 
> >>> assembly, so you'd want to use a sequence assembly tool (eg Velvet).
> >>>
> >>> -Spencer
> >>>
> >>>
> >>> On Tue, Apr 23, 2013 at 12:38 PM, Khalil El Mazouari 
> >>> <khalil.elmazou...@gmail.com> wrote:
> >>> Hi,
> >>>
> >>> I would like to assemble 2 overlapping DNA sequences. Is there something 
> >>> in biojava that may help in this task?
> >>>
> >>> Thanks
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Biojava-l mailing list  -  Biojava-l@lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/biojava-l
> >>>
> >>
> >>
> >
> >
> >
> >
> 
> 
> 
> 
> 



