Hi Alec,

Thank you for your helpful suggestions. I will try following your suggestions 
and see what will happen.

Cheers,

Yibo

On Jul 17, 2014, at 11:34, Alec Wysoker <[email protected]> wrote:

> Hi Yibo,
> 
> First of all, FYI this is a stack trace:
> 
> Exception in thread "main" net.sf.samtools.SAMException: Exception when 
> processing alignment for BAM index HS2:410:C21E6ACXX:4:2316:13634:18930 1/2 
> 97b aligned read.
>     at net.sf.samtools.BAMFileWriter.writeAlignment(BAMFileWriter.java:120)
>     at 
> net.sf.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:168)
>     at 
> net.sf.picard.sam.AddOrReplaceReadGroups.doWork(AddOrReplaceReadGroups.java:100)
>     at 
> net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
>     at 
> net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:119)
>     at 
> net.sf.picard.sam.AddOrReplaceReadGroups.main(AddOrReplaceReadGroups.java:66)
> Caused by: net.sf.samtools.SAMException: Exception creating BAM index for 
> record HS2:410:C21E6ACXX:4:2316:13634:18930 1/2 97b aligned read.
>     at net.sf.samtools.BAMIndexer.processAlignment(BAMIndexer.java:94)
>     at net.sf.samtools.BAMFileWriter.writeAlignment(BAMFileWriter.java:117)
>     ... 5 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 32770
>     at 
> net.sf.samtools.BAMIndexer$BAMIndexBuilder.processAlignment(BAMIndexer.java:276)
>     at net.sf.samtools.BAMIndexer.processAlignment(BAMIndexer.java:92)
>     ... 6 more
> 
> Regarding your original question, the problem is that apparently your have a 
> contig that is longer than 512MB.  The BAM index format cannot handle a 
> contig this size.  If you don't need a BAM index for your process, then you 
> can pass CREATE_INDEX=False to all the Picard programs that write a BAM file, 
> and this problem should go away.  If you need a BAM index, because you're 
> going to query by genomic locus, then you've got a problem.
> 
> Regarding using MergeBamAlignment, I believe it should be able to handle the 
> multiple alignments produced by Tophat.  I've never tried it with STAR, but I 
> would suggest you try it without removing multiple mappings, then run 
> ValidateSamFile on the BAM produced, and if that succeeds then you can assume 
> it handles the multiple mappings produced by the aligner correctly.
> 
> -Alec
> 
> On 7/17/14, 2:23 PM, Yibo wrote:
>> Hi Alec,
>> 
>> Thank you very much for your reply and suggestion. I am not sure what the 
>> stack trace is, and I only had the running log reporting the error message 
>> (see attached). Yes, for Tophat2 output, there is an unmapped BAM file, but 
>> for STAR, there is no unmapped BAM file. Before I try to merge them, I have 
>> a question: I am using unique mapping results from the mapping output. So, 
>> should I remove multiple mapping results and then merge with unmapped file, 
>> or not remove and merge? Thanks again.
>> 
>> Yibo
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> On Jul 17, 2014, at 6:11, Alec Wysoker <[email protected]> wrote:
>> 
>>> Hi Yibo,
>>> 
>>> It would help if you send the stack trace that should be printed along with 
>>> the error.  However, the fact that you need to reduce the validation 
>>> stringency on the various PIcard tools means that you are likely to run 
>>> into various problems.  I suggest you change your workflow as follows: 
>>> rather than using the output of your aligner directly, use Picard 
>>> MergeBamAlignment to combine the aligner output with a valid unmapped BAM.  
>>> This should eliminate all the Picard validation issues.  If you don't have 
>>> an unmapped BAM, you can create one with Picard FastqToSam.
>>> 
>>> -Alec
>>> 
>>> On 7/16/14, 3:18 PM, Yibo wrote:
>>>> Dear all,
>>>> 
>>>> I am using Picard tools but encountered some problems. I want to use GATK 
>>>> to call snp for RNA-seq data. I have used two methods, Tophat2 and STAR, 
>>>> to perform RNA-seq mapping. Because I do not have my own reference genome, 
>>>> I used the genome of a related species as the reference (different genus). 
>>>> However, for the Tophat2 and STAR mapping output (I used only unique 
>>>> mapping output), when using Picardtools (SortSam, AddOrReplaceReadGroups 
>>>> or MarkDuplicates), some errors occurred.
>>>> 
>>>> I am using Java 1.7.0_45 and Picard-tools-1.117 (or 1.77). The error 
>>>> messages are as follows:
>>>> 
>>>> (1) For STAR output, I successfully used AddOrReplaceReadGroups to add 
>>>> read groups and sort by coordinate. However, when using MarkDuplicates, an 
>>>> error occurred as follows (running log attached):
>>>> 
>>>> "Exception in thread "main" htsjdk.samtools.SAMException: Exception when 
>>>> processing alignment for BAM index HS2:410:C21E6ACXX:4:1109:16160:63599 
>>>> 2/2 99b aligned read."
>>>> 
>>>> I checked this read and it is a read pair and seems ok. I attached the 
>>>> paired-end reads information as follows:
>>>> 
>>>> HS2:410:C21E6ACXX:4:1109:16160:63599       163     1       536940431       
>>>> 255     9S90M   =       536940464       99      
>>>> CAGAAAATGAACTGTTTGGAGAAAGATGCCTGCAGAAACCTACATAGCAGCAAGAAGATCCAGAATGAATTTTGGGGTACAATTGATTGAACTGAAGGG
>>>>      
>>>> BBBFFFFFFFFFFIFFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFF<BBFFFFFFFFFBFFFFFFFFF
>>>>      RG:Z:LANE11A    NH:i:1  HI:i:1  jI:B:i,-1       jM:B:c,-1       
>>>> nM:i:10 AS:i:134
>>>> HS2:410:C21E6ACXX:4:1109:16160:63599       83      1       536940464       
>>>> 255     66M33S  =       536940431       -99     
>>>> CATAGCAGCAAGAAGATCCAGAATGAATTTTGGGGTACAATTGATTGAACTGAAGGGGGCTGAACACAATTATTTTGAATGTAAACTCTTATGCCAAAG
>>>>      
>>>> FFFFFFFBFFFFFFFFFFFFFFFFFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFIIIIIIIIIIIIIIIIIIFIFFFFFFFFFFBBB
>>>>      RG:Z:LANE11A    NH:i:1  HI:i:1  jI:B:i,-1       jM:B:c,-1       
>>>> nM:i:10 AS:i:134
>>>> 
>>>> Additionally, I used ValidateSamFile tool to check my sam file, and found 
>>>> as follows:
>>>> 
>>>> Error Type      Count
>>>> ERROR:MATE_NOT_FOUND    37
>>>> WARNING:MISSING_TAG_NM  48684343
>>>> 
>>>> Because only 37 reads is mate_not_found, I deleted these reads from my sam 
>>>> file. Then rerunning MarkDuplicates, the same error still occurred.
>>>> 
>>>> My scripts are as follows:
>>>> 
>>>> java -Xmx4G -jar 
>>>> /u/home/y/ybhu/picard-tools-1.117/AddOrReplaceReadGroups.jar 
>>>> INPUT=${path_input}/${samp}_uniquelymapped.sam 
>>>> OUTPUT=${path_input}/${samp}_rg.sam SORT_ORDER=coordinate RGID=LANE11A 
>>>> RGPL=ILLUMINA RGLB=LIB1A RGPU=LANE1 RGSM=1A
>>>> 
>>>> java -Xmx4G -jar /u/home/y/ybhu/picard-tools-1.117/MarkDuplicates.jar 
>>>> I=${path_input}/${samp}_rg.bam O=${path_input}/${samp}_dedup.bam 
>>>> METRICS_FILE=${path_input}/${samp}_dedupmetrics.txt REMOVE_DUPLICATES=TRUE 
>>>> CREATE_INDEX=TRUE ASSUME_SORTED=TRUE VALIDATION_STRINGENCY=SILENT
>>>> 
>>>> 
>>>> 
>>>> (2) For Tophat2 output, I used different steps (first use SortSam to sort 
>>>> by coordinate, then AddOrReplaceReadGroups to add read groups). However, 
>>>> for these two steps, error messages also occurred (running log as 
>>>> attached).
>>>> 
>>>> For SortSam tool, 11449 reads were reported "Insert size out of range". 
>>>> Then I changed the argument "VALIDATION_STRINGENCY=STRICT" to "LENIENT", 
>>>> the SortSam can ignore these errors and finish sorting.
>>>> 
>>>> But when running AddOrReplaceReadGroups, a same error occurred (similar to 
>>>> STAR running error):
>>>> 
>>>> "Exception in thread "main" net.sf.samtools.SAMException: Exception when 
>>>> processing alignment for BAM index HS2:410:C21E6ACXX:4:2316:13634:18930 
>>>> 1/2 97b aligned read."
>>>> 
>>>> I pasted the read information here:
>>>> 
>>>> HS2:410:C21E6ACXX:4:2316:13634:18930       73      1       536905899       
>>>> 50      32M2D57M4I4M    *       0       0       
>>>> GTAAATTTTTGTAAATGTTCAATGAGGTGCTGAAACATGTATGTTCTTTAGCTTTCCCATTTAGAAGACAACATAAATCTTAATTTTTCCACTAATT
>>>>        
>>>> B<BFFFFFFFFFFFF0<BFFIIIF0BFBFFFIIIIIBFIBFFFBFFFIFFBFFFFFFFFFFIIBFFIIIFIIIIIFIFFFFFFFFBFFFBBBFFFBF
>>>>        AS:i:-126       XN:i:0  XM:i:20 XO:i:2  XG:i:6  NM:i:26 
>>>> MD:Z:2C1G10A3T0G2A0T0A3A0A1^AA3T9G4A1A0G2T24T1A0A5G2    YT:Z:UU NH:i:1.
>>>> 
>>>> My scripts are as follows:
>>>> 
>>>> java -Xmx3G -jar /u/local/apps/picard-tools/1.77/SortSam.jar 
>>>> INPUT=${path_input}/${samp}_uniquelymapped.sam 
>>>> OUTPUT=${path_input}/${samp}_unique_sorted.sam SORT_ORDER=coordinate 
>>>> CREATE_INDEX=TRUE VALIDATION_STRINGENCY=LENIENT
>>>> 
>>>> java -Xmx3G -jar 
>>>> /u/local/apps/picard-tools/1.77/AddOrReplaceReadGroups.jar 
>>>> INPUT=${path_input}/${samp}_unique_sorted.bam 
>>>> OUTPUT=${path_input}/${samp}_rg.bam SORT_ORDER=coordinate 
>>>> CREATE_INDEX=TRUE RGID=LANE11A RGPL=ILLUMINA RGLB=LIB1A RGPU=LANE1 RGSM=1A 
>>>> VALIDATION_STRINGENCY=LENIENT
>>>> 
>>>> In summary, the errors for STAR and Tophat2 seem to be caused by creating 
>>>> BAM index for these reported reads. But I do no know why and how to 
>>>> resolve it.
>>>> 
>>>> Your comments are very appreciated, and thanks in advance.
>>>> 
>>>> Yibo
>>>> 
>>>> Department of Ecology and Evolutionary Biology,
>>>> University of California, Los Angeles
>>>> 
>>>> 
>>>> 
>>>> ------------------------------------------------------------------------------
>>>> Want fast and easy access to all the code in your enterprise? Index and
>>>> search up to 200,000 lines of code with a free copy of Black Duck
>>>> Code Sight - the same software that powers the world's largest code
>>>> search on Ohloh, the Black Duck Open Hub! Try it now.
>>>> http://p.sf.net/sfu/bds
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Samtools-help mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/samtools-help
>>> 
>> 
> 

------------------------------------------------------------------------------
Want fast and easy access to all the code in your enterprise? Index and
search up to 200,000 lines of code with a free copy of Black Duck
Code Sight - the same software that powers the world's largest code
search on Ohloh, the Black Duck Open Hub! Try it now.
http://p.sf.net/sfu/bds
_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to