I am a bit confused. Do you have data frames that are created and bound to the same Python variable names? Or do you have a set of vectors, with subsets of them put together in different data frames? (If you are iterating and building a lot of linear models, you might be trying to do some sort of variable/model selection.) In the latter case, this is where a bug could lurk and dangling pointers could appear (through the recursive release of the elements in the data frame), although I cannot explain why it would randomly have to wait for 10, 100, 1000, or 5000 iterations, which is what I am experiencing with rpy2.
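In case it helps to set up such an experiment: below is a minimal sketch, in Python, of the name-mangling idea from the quoted message combined with the gc pooling strategy (call gc every N iterations). The helper names (mangle_names, mangle_formula, run_regressions) and the fit / collect callables are hypothetical and not part of rpy; in practice fit would wrap your call to R's lm and collect would wrap r.gc(). They are passed in as plain Python callables so that the sketch runs without R.

```python
def mangle_names(columns, tag):
    """Return a copy of `columns` whose keys carry a per-data-frame suffix."""
    return dict(("%s_%s" % (name, tag), values)
                for name, values in columns.items())


def mangle_formula(response, predictors, tag):
    """Build an R-style formula string that uses the mangled names."""
    rhs = " + ".join("%s_%s" % (p, tag) for p in predictors)
    return "%s_%s ~ %s" % (response, tag, rhs)


def run_regressions(jobs, fit, collect, gc_every=100):
    """Fit every job, calling collect() before each gc_every-th fit.

    jobs     -- iterable of (response, predictors, columns) tuples
    fit      -- hypothetical callable(formula, columns); stands in for lm
    collect  -- hypothetical callable(); stands in for r.gc()
    gc_every -- pooling interval N (1 means gc before every fit)
    """
    results = []
    for i, (response, predictors, columns) in enumerate(jobs):
        if i % gc_every == 0:
            collect()
        # Use the iteration index as the tag, so no two data frames
        # share a column name.
        formula = mangle_formula(response, predictors, i)
        results.append(fit(formula, mangle_names(columns, i)))
    return results
```

Running the same jobs once with mangled names and once without, at the same gc_every, would at least tell us whether shared names matter; I would treat the outcome as a hint rather than a proof, given how erratic the crash is.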
Any experiment is good. I would be very happy if we could come up with a way to trigger the bug in a reproducible way.

2008/7/16 laurent oget <[EMAIL PROTECTED]>:
> I was thinking of another workaround and would love your opinion on that. My
> intuition is that the problem occurs because I have variables that are in
> several dataframes with the same name. Do you think mangling the names of
> the variables before passing them to R might help? Would it be an
> interesting experiment?
>
> Laurent
>
> 2008/7/16 Laurent Gautier <[EMAIL PROTECTED]>:
>>
>> I have trouble narrowing down where exactly the problem is occurring
>> (it seems to keep moving each time I think that I am getting closer).
>>
>> In the short term, and for your needs, I guess that it is good if calls
>> to gc prevent rpy from crashing.
>> To save on runtime, you can try a gc pooling strategy (call gc every N
>> iterations).
>>
>> 2008/7/15 laurent oget <[EMAIL PROTECTED]>:
>> > Calling r.gc() before each creation seems to have solved the problem.
>> > We ran a whole lot of things overnight without any segmentation fault.
>> > This is, however, pretty expensive timewise.
>> >
>> > Laurent
>> >
>> > 2008/7/14 laurent oget <[EMAIL PROTECTED]>:
>> >>
>> >> I am running things with a call to gc() before each regression, in case
>> >> what happens is a race condition where the gc is called in the middle
>> >> of the construction of a new dataframe...
>> >>
>> >> Thanks for the prompt help!
>> >>
>> >> Laurent
>> >>
>> >> 2008/7/14 Laurent Gautier <[EMAIL PROTECTED]>:
>> >>>
>> >>> 2008/7/15 laurent oget <[EMAIL PROTECTED]>:
>> >>> > Can I get a quick hint on how I would go about calling --verbose
>> >>> > through RPY?
>> >>>
>> >>> I suspect that the only way is to hack
>> >>> line 93 of rpymodule.c
>> >>>
>> >>> char *defaultargv[] = {"rpy", "-q", "--vanilla"};
>> >>>
>> >>> and recompile/install rpy.
>> >>>
>> >>> > I am pretty clueless about the way R does the garbage collection.
>> >>> > One thing I know is there are columns that are shared between
>> >>> > different linear regressions, so it might be that the garbage
>> >>> > collector cleans up a column from iteration n-1 and breaks column n
>> >>> > in the process.
>> >>>
>> >>> > Is there a way to call the garbage collection explicitly before I
>> >>> > create a new dataframe to test this hypothesis?
>> >>>
>> >>> gc()
>> >>> #possibly gc(verbose = TRUE)
>> >>>
>> >>> I have made a toy example to see if I could reproduce it here (with
>> >>> rpy2, but there are similarities in the way R objects pointed at from
>> >>> Python objects are protected from R's garbage collection)... and it
>> >>> seems that there is something going on with garbage collection
>> >>> (depending on the Python implementation, it either crashes after
>> >>> 10-50 iterations... or can go through 1000 iterations).
>> >>>
>> >>> > Thanks,
>> >>> >
>> >>> > Laurent
>> >>> >
>> >>> > 2008/7/14 Laurent Gautier <[EMAIL PROTECTED]>:
>> >>> >>
>> >>> >> Someone else reported a similar-sounding problem on this list not so
>> >>> >> long ago.
>> >>> >>
>> >>> >> The problem might be caused by manipulating a stale pointer to an R
>> >>> >> object (that is, an object that was discarded during R's garbage
>> >>> >> collection), and troubleshooting this will likely mean running things
>> >>> >> through a C debugger.
>> >>> >> You could try starting up your embedded R process with '--verbose'
>> >>> >> and see if the problem happens right after garbage collection.
>> >>> >> Without further details on the exact code that was run, it is
>> >>> >> difficult to say more.
>> >>> >>
>> >>> >> 2008/7/14 laurent oget <[EMAIL PROTECTED]>:
>> >>> >> > I am using rpy/R to perform linear regressions on a large number of
>> >>> >> > datasets in one Python run, and am encountering segmentation faults
>> >>> >> > after a large number of iterations, while handling cases which, taken
>> >>> >> > on their own, run without a problem. My intuition is that the previous
>> >>> >> > iterations somehow corrupted the heap (or the stack). The code breaks
>> >>> >> > in different places. One sample stack trace:
>> >>> >> >
>> >>> >> > *** caught segfault ***
>> >>> >> > address 0x8, cause 'memory not mapped'
>> >>> >> >
>> >>> >> > Traceback:
>> >>> >> > 1: unique.default(unlist(lapply(answer, length)))
>> >>> >> > 2: unique(unlist(lapply(answer, length)))
>> >>> >> > 3: sapply(xlev, is.null)
>> >>> >> > 4: .getXlevels(mt, mf)
>> >>> >> > 5: function (formula, data, subset, weights, na.action, method = "qr",
>> >>> >> > model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
>> >>> >> > contrasts = NULL, offset, ...) { ret.x <- x ret.y <- y cl <-
>> >>> >> > match.call() mf <- match.call(expand.dots = FALSE) m <-
>> >>> >> > match(c("formula", "data", "subset", "weights", "na.action",
>> >>> >> > "offset"), names(mf), 0L) mf <- mf[c(1L, m)] mf$drop.unused.levels <-
>> >>> >> > TRUE mf[[1L]] <- as.name("model.frame") mf <- eval(mf,
>> >>> >> > parent.frame()) if (method == "model.frame") return(mf) else
>> >>> >> > if (method != "qr") warning(gettextf("method = '%s' is not supported.
Using 'qr'", method), domain = NA) mt >> >>> >> > <- >> >>> >> > attr(mf, >> >>> >> > "terms") y <- model.response(mf, "numeric") w <- >> >>> >> > as.vector(model.weights(mf)) if (!is.null(w) && >> >>> >> > !is.numeric(w)) >> >>> >> > stop("'weights' must be a numeric vector") offset <- >> >>> >> > as.vector(model.offset(mf)) if (!is.null(offset)) { if >> >>> >> > (length(offset) == 1) offset <- rep(offset, NROW(y)) >> >>> >> > else >> >>> >> > if (length(offset) != NROW(y)) stop(gettextf("number >> >>> >> > of >> >>> >> > offsets >> >>> >> > is %d, should equal %d (number of observations)", >> >>> >> > length(offset), NROW(y)), domain = NA) } if >> >>> >> > (is.empty.model(mt)) >> >>> >> > { x <- NULL z <- list(coefficients = if >> >>> >> > (is.matrix(y)) >> >>> >> > matrix(, 0, 3) else numeric(0), residuals = y, >> >>> >> > fitted.values >> >>> >> > = 0 >> >>> >> > * y, weights = w, rank = 0L, df.residual = if >> >>> >> > (is.matrix(y)) >> >>> >> > nrow(y) else length(y)) if (!is.null(offset)) { >> >>> >> > z$fitted.values <- offset z$residuals <- y - offset >> >>> >> > } >> >>> >> > } else { x <- model.matrix(mt, mf, contrasts) z >> >>> >> > <- >> >>> >> > if >> >>> >> > (is.null(w)) lm.fit(x, y, offset = offset, >> >>> >> > singular.ok = >> >>> >> > singular.ok, ...) else lm.wfit(x, y, w, >> >>> >> > offset = >> >>> >> > offset, singular.ok = singular.ok, ...) 
} >> >>> >> > class(z) >> >>> >> > <- >> >>> >> > c(if >> >>> >> > (is.matrix(y)) "mlm", "lm") z$na.action <- attr(mf, >> >>> >> > "na.action") >> >>> >> > z$offset <- offset z$contrasts <- attr(x, "contrasts") >> >>> >> > z$xlevels >> >>> >> > <- >> >>> >> > .getXlevels(mt, mf) z$call <- cl z$terms <- mt if >> >>> >> > (model) >> >>> >> > z$model <- mf if (ret.x) z$x <- x if (ret.y) >> >>> >> > z$y >> >>> >> > <- >> >>> >> > y if (!qr) z$qr <- NULL >> >>> >> > z}("sls_1133745~dp_1133745+dp_3124901", list(dp_3124901 = c(0, 0, >> >>> >> > 0, >> >>> >> > 0, >> >>> >> > 0, >> >>> >> > -0.328504066972036, 0, 0, 0, 0, 0, 0, 0, -0.328504066972036, 0, >> >>> >> > 0, >> >>> >> > 0, 0, >> >>> >> > 0, >> >>> >> > -0.328504066972036, -0.328504066972036, 0, 0, 0, 0, 0, 0, 0, 0, >> >>> >> > 0, >> >>> >> > 0, >> >>> >> > -0.693147180559945, 0, -0.693147180559945, 0, 0, 0, >> >>> >> > -0.328504066972036, >> >>> >> > 0, >> >>> >> > 0, 0, 0, 0, 0, 0, 0, 0, -0.328504066972036, -0.328504066972036, >> >>> >> > 0, >> >>> >> > 0, 0, >> >>> >> > 0, >> >>> >> > 0, -0.693147180559945, 0, 0, 0, 0, -0.693147180559945, 0, >> >>> >> > -0.693147180559945, -0.693147180559945, 0, 0, 0, 0, 0, 0, 0, >> >>> >> > -0.248461359298500, 0, -0.248461359298500, 0, 0, 0, 0, 0, 0, 0, >> >>> >> > 0, >> >>> >> > 0, 0, >> >>> >> > 0, >> >>> >> > -0.693147180559945, 0, 0, 0, -0.248461359298500, 0, 0, >> >>> >> > -0.248461359298500, >> >>> >> > 0, 0, -0.693147180559945, 0, 0, 0, 0, 0, 0), sls_1133745 = >> >>> >> > c(7.3487612610094, 7.30613534736707, 8.00046685932928, >> >>> >> > 7.39499697533915, >> >>> >> > 7.34321275228777, 7.88963522981753, 7.22973732372956, >> >>> >> > 8.18020882367827, >> >>> >> > 7.91558910409137, 7.26812538738378, 7.25818677308703, >> >>> >> > 7.3498801314529, >> >>> >> > 7.31767047076391, 7.65018297965786, 7.90898624450853, >> >>> >> > 8.19434808229588, >> >>> >> > 8.21457885218454, 7.3641669547084, 7.35953128995355, >> >>> >> > 7.7040815977251, >> >>> >> > 7.2031227430099, 
7.31216649861154, 7.19473710631952, >> >>> >> > 7.29547045158008, >> >>> >> > 7.35154326140946, 7.17917838793797, 8.36984796851258, >> >>> >> > 7.1754285028756, >> >>> >> > 8.12093952739025, 7.687240693355, 7.2225733180656, >> >>> >> > 7.36405924490386, >> >>> >> > 7.3452615426351, 7.25804588634565, 7.23995411189234, >> >>> >> > 7.21269261931628, >> >>> >> > 7.3210695020337, 7.8386981421886, 7.373405696737, >> >>> >> > 7.17968137538913, >> >>> >> > 7.82070844738252, 7.43464714668485, 7.25244367037469, >> >>> >> > 7.23292820385588, >> >>> >> > 7.33945977029367, 7.2810277192681, 7.34753163260778, >> >>> >> > 7.94492823905035, >> >>> >> > 7.87310661464411, 7.14571650431343, 7.1568316491227, >> >>> >> > 8.04067559226017, >> >>> >> > 7.16502195742237, 7.27351582356697, 7.28802564845575, >> >>> >> > 8.77370188389706, >> >>> >> > 7.3224047480958, 6.4802899305943, 7.2810277192681, >> >>> >> > 7.26784638120198, >> >>> >> > 7.35638204246829, 7.20306320059336, 7.15358357480953, >> >>> >> > 8.01940553608164, >> >>> >> > 8.78195611526034, 7.24723014605046, 7.319056656224, >> >>> >> > 7.24884540813453, >> >>> >> > 7.25487774291456, 7.211054762329, 8.21131456124779, >> >>> >> > 7.19920773459773, >> >>> >> > 7.9786879985435, 7.28814186671376, 7.27840117311202, >> >>> >> > 7.2972806890663, >> >>> >> > 8.14356390282749, 7.34424090367824, 7.32067255061534, >> >>> >> > 7.34136070803461, >> >>> >> > 7.76719625525884, 7.20539015753155, 7.19294925869672, >> >>> >> > 8.12521670148863, >> >>> >> > 7.3678032648606, 7.32193563210652, 7.16022459567817, >> >>> >> > 7.34345210203388, >> >>> >> > 8.11895788519276, 7.81886461055585, 7.76421532379024, >> >>> >> > 7.93634917049156, >> >>> >> > 7.20526391225165, 7.2842720651344, 7.28607526956675, >> >>> >> > 7.29555188093676, >> >>> >> > 7.19784694602772, 7.18371895847972, 7.21964936074816, >> >>> >> > 7.87895077602558, >> >>> >> > 7.26955407354936), dp_1133745 = c(-0.223143551314210, 0, >> >>> >> > -0.287682072451781, >> >>> >> > 0, 0, 
-0.116533816255952, 0, -0.223143551314210, >> >>> >> > -0.116533816255952, >> >>> >> > 0, >> >>> >> > 0, >> >>> >> > 0, 0, -0.116533816255952, -0.116533816255952, -0.223143551314210, >> >>> >> > -0.287682072451781, 0, 0, -0.116533816255952, 0, 0, 0, 0, 0, 0, >> >>> >> > -0.287682072451781, 0, -0.287682072451781, -0.116533816255952, 0, >> >>> >> > 0, >> >>> >> > 0, >> >>> >> > 0, >> >>> >> > 0, 0, 0, -0.116533816255952, 0, 0, -0.248461359298500, 0, 0, 0, >> >>> >> > 0, >> >>> >> > 0, 0, >> >>> >> > -0.116533816255952, -0.116533816255952, 0, 0, -0.287682072451781, >> >>> >> > 0, >> >>> >> > 0, >> >>> >> > 0, >> >>> >> > -0.400477566597125, 0, 0, 0, 0, 0, 0, 0, -0.248461359298500, >> >>> >> > -0.400477566597125, 0, 0, 0, 0, 0, -0.287682072451781, 0, >> >>> >> > -0.287682072451781, 0, 0, 0, -0.223143551314210, 0, 0, 0, >> >>> >> > -0.116533816255952, 0, 0, -0.223143551314210, 0, 0, 0, 0, >> >>> >> > -0.287682072451781, -0.248461359298500, -0.116533816255952, >> >>> >> > -0.287682072451781, 0, 0, 0, 0, 0, 0, 0, -0.116533816255952, 0))) >> >>> >> > >> >>> >> > >> >>> >> > would anybody have a hint on how to troubleshoot this? >> >>> >> > >> >>> >> > Thanks, >> >>> >> > >> >>> >> > Laurent >> >>> >> > >> >>> >> > >> >>> >> > >> >>> >> > >> >>> >> > >> >>> >> > ------------------------------------------------------------------------- >> >>> >> > Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW! >> >>> >> > Studies have shown that voting for your favorite open source >> >>> >> > project, >> >>> >> > along with a healthy diet, reduces your potential for chronic >> >>> >> > lameness >> >>> >> > and boredom. 
_______________________________________________
rpy-list mailing list
rpy-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rpy-list