Davide Cittaro wrote:
> Actually I was having issues of reading stdin, but I found that was not
> related to wigToBigWig.
> I'm interested in memory issues, of course, so please keep us updated :-)

Good Afternoon Davide:

I tried a couple of different encodings. The memory consumed depends upon the type of input. I worked with the phyloP data for the 46-way vertebrate track on hg19, a data set that covers 2,845,303,719 bases of hg19.

A worst case is a variableStep wiggle file in which the specified coordinates happen to be consecutive. Normally the best encoding for such data would be fixedStep.

With this phyloP data set in its original fixedStep ASCII encoding, wigToBigWig consumes 32 Gb of memory in 35 minutes of run time.

With the same data in variableStep format, wigToBigWig consumes 60 Gb of memory in 2 hours 20 minutes of run time.

With the same data in bedGraph format, the bedGraphToBigWig converter consumes 3 Gb of memory in 1 hour 40 minutes of run time.

As an aside, treating that bedGraph file as an ordinary bed file, the bedToBigBed converter consumes 19 Gb of memory in 1 hour 15 minutes of run time to produce a bigBed file.

--Hiram

File sizes, input files:
    hg19.phyloP.wig.fixedStep.txt    - 17 Gb
    hg19.phyloP.wig.variableStep.txt - 42 Gb
    hg19.phyloP.bedGraph             - 71 Gb

Resulting converted files:
    hg19.phyloP.from.fixedStep.bw    - 8.2 Gb
    hg19.phyloP.from.variableStep.bw - 14 Gb
    hg19.phyloP.from.bedGraph.bw     - 15 Gb
    hg19.phyloP.from.bedGraph.bb     - 14 Gb
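
The point about consecutive variableStep coordinates being better expressed as fixedStep can be illustrated with a small pre-processing sketch. This is not part of the UCSC tools; the function name variable_to_fixed_step is hypothetical, and the sketch assumes span=1 data with positions in ascending order within each variableStep block. It starts a new fixedStep block whenever the positions stop being consecutive, and it has not been run against the phyloP data discussed here.

```python
def variable_to_fixed_step(lines):
    """Yield fixedStep wiggle lines from variableStep input.

    Hypothetical helper for illustration only. Assumes span=1 and
    ascending positions; each run of consecutive positions becomes
    one fixedStep block with step=1.
    """
    chrom = None
    prev_pos = None
    for raw in lines:
        line = raw.strip()
        # Skip blank lines, comments, and track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        if line.startswith("variableStep"):
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            if fields.get("span", "1") != "1":
                raise ValueError("this sketch only handles span=1")
            chrom = fields["chrom"]
            prev_pos = None  # force a new block for the new declaration
            continue
        if chrom is None:
            raise ValueError("data line before any variableStep declaration")
        pos_text, value = line.split()
        pos = int(pos_text)
        # A gap (or the first data line) starts a new fixedStep block.
        if prev_pos is None or pos != prev_pos + 1:
            yield f"fixedStep chrom={chrom} start={pos} step=1"
        yield value
        prev_pos = pos


if __name__ == "__main__":
    import sys

    # Reads variableStep wiggle on stdin, writes fixedStep on stdout.
    for out_line in variable_to_fixed_step(sys.stdin):
        print(out_line)
```

Streaming line by line keeps the memory use of this step small regardless of input size. Whether re-encoding before running wigToBigWig is worthwhile depends on how long the consecutive runs are; data with many short runs would gain little from fixedStep.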
