Hi Henrik,

Lowering memory helped - it's drinks on me when we meet.

It has been running for approximately 7 days now ("ETA for unit type 'expression': 20140320 23:20:26"). With the updated affxparser, can I speed it up by increasing the memory setting? The current memory allocation is approximately 3 GB. I don't want to cancel the run if it means losing the progress, though.

Best,
Damian

On Tuesday, March 4, 2014 8:32:17 PM UTC-5, Henrik Bengtsson wrote:
>
> Did lowering "memory/ram" solve your problem?
>
> Also, an updated version of affxparser that should no longer overflow
> in the integer multiplication is available (on Bioconductor).
>
> Cheers,
>
> Henrik
>
> On Thu, Feb 27, 2014 at 12:36 PM, Henrik Bengtsson
> <henrik.b...@aroma-project.org> wrote:
> > Congratulations Damian,
> >
> > I think you're the first one to hit a limit of the Aroma Framework
> > (remind me to buy you a drink whenever you see me in person).
> >
> > I narrowed it down to the affxparser(*) package and I'll investigate
> > further on how to fix this. It should not occur and I'm confident
> > that it can be avoided internally. In the meantime, try to lower
> > your 'memory/ram' setting, e.g. setOption(aromaSettings, "memory/ram",
> > 10.0) or less. I'm not 100% sure it'll help, but if it does, that's a
> > good clue (for me) on what's causing it.
> >
> > /Henrik
> >
> > DETAILS: The below illustrates the issue in affxparser::readCelUnits():
> >
> > > .Machine$integer.max
> > [1] 2147483647
> > > nbrOfArrays <- 5622L
> > > .Machine$integer.max / nbrOfArrays
> > [1] 381978.6
> > > nbrOfCells <- 381978L
> > > nbrOfCells * nbrOfArrays
> > [1] 2147480316
> > > nbrOfCells <- 381979L
> > > nbrOfCells * nbrOfArrays
> > [1] NA
> > Warning message:
> > In nbrOfCells * nbrOfArrays : NAs produced by integer overflow
> >
> > By decreasing 'memory/ram' I *hope* that 'nbrOfCells' effectively
> > becomes smaller.
> >
> >
> > On Wed, Feb 26, 2014 at 9:15 PM, Damian Plichta
> > <damian....@gmail.com> wrote:
> >> Hi Henrik,
> >>
> >> Thank you, that was helpful.
> >>
> >> I ran into another problem, though. I am trying to perform
> >> ExonRmaPlm(csQN, mergeGroups=TRUE), but this produces the following error:
> >>
> >> 20140226 23:25:33| Identifying CDF cell indices...done
> >> Error in vector("double", nbrOfCells * nbrOfArrays) :
> >>   vector size cannot be NA
> >> In addition: Warning message:
> >> In nbrOfCells * nbrOfArrays : NAs produced by integer overflow
> >> 20140226 23:28:35| Reading probe intensities from 5622 arrays...done
> >> 20140226 23:28:35| Fitting chunk #1 of 1 of 'expression' units (code=1) with various dimensions...done
> >> 20140226 23:28:35| Unit dimension #3 (various dimensions) of 3...done
> >> 20140226 23:28:35| Fitting the model by unit dimensions (at least for the large classes)...done
> >> 20140226 23:28:35| Unit type #1 ('expression') of 1...done
> >> 20140226 23:28:35| Fitting ExonRmaPlm for each unit type separately...done
> >> 20140226 23:28:35|Fitting model of class ExonRmaPlm...done
> >>
> >> I tested whether it worked anyway, but the expression is zero across all
> >> arrays when I access it.
> >>
> >> Do you know what could be causing the problem?
> >>
> >> Best,
> >> Damian
> >>
> >>
> >> The code I run is below:
> >>
> >> library(aroma.affymetrix)
> >> library(aroma.core)
> >> setOption(aromaSettings, "memory/ram", 500.0);
> >> verbose <- Arguments$getVerbose(-8, timestamp=TRUE)
> >> chipType <- "HuEx-1_0-st-v2-core"
> >> cdf <- AffymetrixCdfFile$byChipType(chipType)
> >> #print(cdf)
> >> cs <- AffymetrixCelSet$byName("experiment1", cdf=cdf)
> >> bc <- RmaBackgroundCorrection(cs)
> >> csBC <- process(bc,verbose=verbose)
> >> qn <- QuantileNormalization(csBC, typesToUpdate="pm")
> >> target <- getTargetDistribution(qn, verbose=verbose)
> >> qn <- QuantileNormalization(csBC, typesToUpdate="pm",
> >>                             targetDistribution=target)
> >> csQN <- process(qn, verbose=verbose)
> >> csPLM <- ExonRmaPlm(csQN, mergeGroups=TRUE)
> >> fit(csPLM, verbose=verbose)
> >> date()
> >> ces <- getChipEffectSet(csPLM)
> >> gExprs <- extractDataFrame(ces, units=1:3, addNames=TRUE)
> >>
> >>
> >> > sessionInfo()
> >> R version 3.0.2 (2013-09-25)
> >> Platform: x86_64-unknown-linux-gnu (64-bit)
> >>
> >> locale:
> >>  [1] LC_CTYPE=C                 LC_NUMERIC=C
> >>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> >>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
> >>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
> >>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >>
> >> attached base packages:
> >> [1] stats graphics grDevices utils datasets methods base
> >>
> >> other attached packages:
> >>  [1] preprocessCore_1.23.0   aroma.light_1.31.8      matrixStats_0.8.14
> >>  [4] aroma.affymetrix_2.11.1 aroma.core_2.11.0       R.devices_2.8.2
> >>  [7] R.filesets_2.3.0        R.utils_1.29.8          R.oo_1.17.0
> >> [10] affxparser_1.34.0       R.methodsS3_1.6.1
> >>
> >> loaded via a namespace (and not attached):
> >> [1] aroma.apd_0.4.0 base64enc_0.1-1 digest_0.6.4    DNAcopy_1.35.1
> >> [5] PSCBS_0.40.4    R.cache_0.9.2   R.huge_0.6.0    R.rsp_0.9.28
> >> [9] tools_3.0.2
> >>
> >> On Thursday, February 20, 2014 1:21:25 PM UTC-5, Henrik Bengtsson wrote:
> >>>
> >>> On Tue, Feb 18, 2014 at 7:30 PM, Damian Plichta
> >>> <damian....@gmail.com> wrote:
> >>> > Thanks, that helped a lot. It took me less than 3 hours to perform the
> >>> > background correction.
> >>> >
> >>> > Now I'm wondering if for the next step, quantile normalization, I could
> >>> > do a similar trick. Is there a way to precompute the target empirical
> >>> > distribution based on all arrays and then do the normalization on chunks
> >>> > of data (thus in an independent manner)? I can see the option
> >>> > targetDistribution under QuantileNormalization.
> >>>
> >>> # Calculate the target distribution based on *all* arrays [not parallelized]
> >>> qn <- QuantileNormalization(dsC, typesToUpdate="pm")
> >>> target <- getTargetDistribution(qn, verbose=verbose)
> >>>
> >>> # Normalize array by array toward the same target distribution [in chunks]
> >>> dsCs <- extract(dsC, 1:100)
> >>> qn <- QuantileNormalization(dsCs, typesToUpdate="pm",
> >>>                             targetDistribution=target)
> >>> csNs <- process(qn, verbose=verbose)
> >>>
> >>> Hope this helps
> >>>
> >>> /Henrik
> >>>
> >>> >
> >>> > Kind regards,
> >>> >
> >>> > Damian Plichta
> >>> >
> >>> > On Monday, February 17, 2014 4:03:54 PM UTC-5, Henrik Bengtsson wrote:
> >>> >>
> >>> >> Hi.
> >>> >>
> >>> >> On Sun, Feb 16, 2014 at 6:53 PM, Damian Plichta
> >>> >> <damian....@gmail.com> wrote:
> >>> >> > Hi,
> >>> >> >
> >>> >> > I'm processing around 5500 Affymetrix exon arrays. The
> >>> >> > RmaBackgroundCorrection() is pretty slow, 1-2 minutes/array. I played
> >>> >> > with setOption(aromaSettings, "memory/ram", X) and increased X up to
> >>> >> > 100, but it didn't have any effect on this stage of the analysis.
> >>> >>
> >>> >> If you don't notice any difference in processing time by changing
> >>> >> "memory/ram" from the default (1.0) to 100, then memory is not
> >>> >> your bottleneck.
> >>> >> >
> >>> >> > Any way to speed the process up?
> >>> >>
> >>> >> If you haven't already, make sure to read "How to: Improve processing
> >>> >> time":
> >>> >>
> >>> >> http://aroma-project.org/howtos/ImproveProcessingTime
> >>> >>
> >>> >> If you have access to multiple machines on the same file system, you
> >>> >> can do poor man's parallel processing for the *background correction*,
> >>> >> because each array is corrected independently of the others. You can
> >>> >> do this by processing a subset of arrays per computer, e.g.
> >>> >>
> >>> >> dsR <- AffymetrixCelSet$byName("MyDataSet", chipType="HuEx-1_0-st-v2")
> >>> >> dsR <- extract(dsR, 1:100)
> >>> >> bg <- RmaBackgroundCorrection(dsR)
> >>> >> dsC <- process(bg, verbose=verbose)
> >>> >>
> >>> >> Repeat on another machine with 101:200, and so on.
> >>> >>
> >>> >> When all arrays have been background corrected, you can move back to
> >>> >> your original script - all background-corrected arrays are already
> >>> >> saved to file and will therefore not be redone.
> >>> >>
> >>> >> /Henrik
> >>> >>
> >>> >> >
> >>> >> > Kind regards,
> >>> >> >
> >>> >> > Damian Plichta
> >>> >> >
> >>> >> > --
> >>> >> > --
> >>> >> > When reporting problems on aroma.affymetrix, make sure 1) to run the
> >>> >> > latest version of the package, 2) to report the output of sessionInfo()
> >>> >> > and traceback(), and 3) to post a complete code example.
> >>> >> >
> >>> >> >
> >>> >> > You received this message because you are subscribed to the Google
> >>> >> > Groups "aroma.affymetrix" group with website http://www.aroma-project.org/.
> >>> >> > To post to this group, send email to aroma-af...@googlegroups.com
> >>> >> > To unsubscribe and other options, go to
> >>> >> > http://www.aroma-project.org/forum/
> >>> >> >
> >>> >> > ---
> >>> >> > You received this message because you are subscribed to the Google
> >>> >> > Groups "aroma.affymetrix" group.
> >>> >> > To unsubscribe from this group and stop receiving emails from it,
> >>> >> > send an email to aroma-affymetr...@googlegroups.com.
> >>> >> > For more options, visit https://groups.google.com/groups/opt_out.
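
The overflow shown in the DETAILS part of this thread is plain R integer arithmetic, so it can be checked without any of the aroma packages. Below is a minimal base-R sketch of that arithmetic. Note that maxCellsPerChunk() is a hypothetical helper made up for illustration; it is not part of affxparser or aroma.affymetrix, and the sketch does not claim to show how the updated affxparser avoids the problem internally. It only computes the largest number of cells whose product with the number of arrays still fits in an R integer, and shows that the same product is representable in double precision.

```r
# Number of arrays in the data set discussed in this thread.
nbrOfArrays <- 5622L

# Hypothetical helper (NOT an affxparser/aroma function): the largest
# number of cells for which nbrOfCells * nbrOfArrays still fits in a
# 32-bit signed R integer.
maxCellsPerChunk <- function(nbrOfArrays, limit = .Machine$integer.max) {
  as.integer(limit %/% nbrOfArrays)
}

maxCells <- maxCellsPerChunk(nbrOfArrays)

# At the bound, the integer product is still a valid integer ...
atBound <- maxCells * nbrOfArrays

# ... but one cell more overflows to NA (R also emits a warning,
# suppressed here so the script runs quietly).
overflowed <- suppressWarnings((maxCells + 1L) * nbrOfArrays)

# Doing the multiplication in double precision avoids the overflow.
asDouble <- as.numeric(maxCells + 1L) * nbrOfArrays

print(maxCells)
print(is.na(overflowed))
print(asDouble > .Machine$integer.max)
```

With 5622 arrays the bound works out to 381978 cells, which matches the numbers in Henrik's illustration. This is consistent with why a smaller 'memory/ram' setting could help, if it does in fact reduce the number of cells read per chunk.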