Hi, Asa.

The "Too many open files" issue is covered in the FAQ:

http://sourceforge.net/p/picard/wiki/Main_Page/#q-im-getting-javaiofilenotfoundexception-too-many-open-files-what-should-i-do

It provides several different options for addressing this.
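For quick reference, the two fixes most often suggested there can be sketched like this (the 4096 and 1000 values below are illustrative placeholders, not recommendations; pick a MAX_FILE_HANDLES_FOR_READ_ENDS_MAP value somewhat below your actual limit):

```shell
# Check the per-process open-file limit that MarkDuplicates is hitting
limit=$(ulimit -n)
echo "Current open-file limit: $limit"

# Fix 1: raise the limit for this shell session before running Picard,
# e.g. `ulimit -n 4096` (subject to the hard limit set by your admin).

# Fix 2: tell MarkDuplicates to keep fewer spill-file streams open at once
# by lowering MAX_FILE_HANDLES_FOR_READ_ENDS_MAP below the ulimit, e.g.:
#   java -jar MarkDuplicates.jar I=in.bam O=out.bam METRICS_FILE=metrics.txt \
#       MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000
```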

Kathleen

On Mon, Sep 8, 2014 at 3:26 AM, Asa Perez-Bercoff <
[email protected]> wrote:

>  Hi there,
>
>  I’ve been trying to use Picard MarkDuplicates to mark and remove
> duplicate entries in some BAM files of mine. (The BAM files are sorted and
> indexed.)
>
>  I have BAM files from 4 fungal strains, and for one of them
> MarkDuplicates works just fine. I have run this multiple times, always
> with the same results: it always works for one fungal strain (always the
> same one), but always fails for the other 3 fungal strains.
>
>  For the 3 fungal strains where it fails, I seem to run out of memory.
> I’ve tried fixing the problem by increasing the requested RAM (once I even
> asked for a ridiculous amount of RAM) and specifying a temporary directory
> for Java IO with -Djava.io.tmpdir, as recommended on your mailing list, but
> alas the problem remains.
>
>  I always run with the latest version of Picard, so I recently updated
> to your latest version, 1.119. Although the error message has changed
> slightly, the problem of running out of memory remains.
>
>  My Java version:
>  java version "1.7.0_51"
> Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
>
>
>  Below is the command I ran, and the error message output.
>
>   java -Xmx22g -Djava.io.tmpdir=$HOME/tempdir -jar
> $HOME/bio/picard-tools-1.119/MarkDuplicates.jar \
>         I=/my/path/my_fungal_strain.srt.bam \
>         O=/my/path/my_fungal_strain.srt.dedup.bam \
>         METRICS_FILE=/my/path/my_fungal_strain.srt.dedup.duplicationMetrics
> \
>         CREATE_INDEX=true \
>         REMOVE_DUPLICATES=true \
>         ASSUME_SORTED=true
>
>  picard.sam.MarkDuplicates INPUT=[/my/path/my_fungal_strain.srt.bam]
> OUTPUT=/my/path/my_fungal_strain.srt.dedup.bam METRICS_FILE=/my/path/my_
> fungal_strain.srt.dedup.duplicationMetrics REMOVE_DUPLICATES=true
> ASSUME_SORTED=true CREATE_INDEX=true    PROGRAM_RECORD_ID=MarkDuplicates
> PROGRAM_GROUP_NAME=MarkDuplicates
> MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000
> MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25
> READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).*
> OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false
> VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000
> CREATE_MD5_FILE=false
> Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library
> /path/to/my/bin/picard-tools-1.119/libIntelDeflater.so which might have
> disabled stack guard. The VM will try to fix the stack guard now.
> It's highly recommended that you fix the library with 'execstack -c
> <libfile>', or link it with '-z noexecstack'.
> [Fri Sep 05 19:52:55 EST 2014] Executing as [email protected] on
> Linux 2.6.32-431.11.2.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM
> 1.7.0_51-b13; Picard version:
> 1.119(d44cdb51745f5e8075c826430a39d8a61f1dd832_1408991805) IntelDeflater
> INFO    2014-09-05 19:52:55     MarkDuplicates  Start of doWork
> freeMemory: 754742648; totalMemory: 759693312; maxMemory: 20997734400
> INFO    2014-09-05 19:52:55     MarkDuplicates  Reading input file and
> constructing read end information.
> INFO    2014-09-05 19:52:55     MarkDuplicates  Will retain up to 83324342
> data points before spilling to disk.
> INFO    2014-09-05 19:53:20     MarkDuplicates  Read     1,000,000
> records.  Elapsed time: 00:00:24s.  Time for last 1,000,000:   24s.  Last
> read position: NODE_7_length_125489_cov_34.5327_ID_35439289:42,886
> INFO    2014-09-05 19:53:20     MarkDuplicates  Tracking 135538 as yet
> unmatched pairs. 292 records in RAM.
> [Fri Sep 05 19:53:32 EST 2014] picard.sam.MarkDuplicates done. Elapsed
> time: 0.63 minutes.
> Runtime.totalMemory()=2518155264
> To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
> Exception in thread "main" htsjdk.samtools.SAMException:
> /my/home/tempdir/user/CSPI.2431821386522683006.tmp/5438.tmp not found
>         at
> htsjdk.samtools.util.FileAppendStreamLRUCache$Functor.makeValue(FileAppendStreamLRUCache.java:63)
>         at
> htsjdk.samtools.util.FileAppendStreamLRUCache$Functor.makeValue(FileAppendStreamLRUCache.java:49)
>         at
> htsjdk.samtools.util.ResourceLimitedMap.get(ResourceLimitedMap.java:76)
>         at
> htsjdk.samtools.CoordinateSortedPairInfoMap.getOutputStreamForSequence(CoordinateSortedPairInfoMap.java:180)
>         at
> htsjdk.samtools.CoordinateSortedPairInfoMap.put(CoordinateSortedPairInfoMap.java:164)
>         at picard.sam.DiskReadEndsMap.put(DiskReadEndsMap.java:67)
>         at
> picard.sam.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:449)
>         at picard.sam.MarkDuplicates.doWork(MarkDuplicates.java:177)
>         at
> picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:183)
>         at picard.sam.MarkDuplicates.main(MarkDuplicates.java:161)
> Caused by: java.io.FileNotFoundException:
> /my/home/tempdir/user/CSPI.2431821386522683006.tmp/5438.tmp (Too many open
> files)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
>         at
> htsjdk.samtools.util.FileAppendStreamLRUCache$Functor.makeValue(FileAppendStreamLRUCache.java:60)
>         ... 9 more
>
>  One of the problems seems to be that Picard MarkDuplicates can’t write
> to the temporary directory at /my/home/tempdir/user/.
>  I’ve set that directory so that anyone can write to it, but this doesn’t
> seem to solve the problem, so the temporary file
> /my/home/tempdir/user/CSPI.2431821386522683006.tmp/5438.tmp isn’t written
> out.
>
>  I hope you can help me shed light on this, so that I can get your tool
> to run to completion on my data.
> (For all 4 strains it works fine running samtools rmdup, but for
> consistency I’d very much like to make Picard MarkDuplicates work for all 4
> strains.)
>
>  Many thanks in advance,
>  Åsa
>
> ------------------------------------------------------------------------------
> Want excitement?
> Manually upgrade your production database.
> When you want reliability, choose Perforce
> Perforce version control. Predictably reliable.
>
> http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.clktrk
> _______________________________________________
> Samtools-help mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/samtools-help
>
>