[R] [R-pkgs] QCA version 0.4-5
QCA implements Qualitative Comparative Analysis using a Boolean minimization algorithm for data coded with the presence/absence of the causal conditions that affect a phenomenon of interest.

This new release has an experimental function that obtains exactly the same solutions as the main minimization function, using a shortcut instead of the classical complete and exhaustive algorithm. The new function is faster and uses significantly less memory (50 MB compared to 1.5 GB for large datasets).

It should appear on CRAN soon; feedback is welcome.

--
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd
050025 Bucharest sector 5
Romania
Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bars' values on barplot
On Sunday 05 August 2007, Mark Wardle wrote:
> [...]
> So, try this for starters:
>
> my.values=1:5
> x <- barplot(my.values, ylim=c(0,7))
> text(x, 0.4+my.values, "wibble")

Mark, you could use the pos argument of text() (see ?text):

    my.values=10:15
    x <- barplot(my.values, ylim=c(0,11))
    text(x, my.values, "wibble", pos=3)

always does what you want, whereas:

    text(x, 0.4+my.values, "wibble")

doesn't look very nice.

HTH,
Adrian
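As a runnable sketch of the pos approach described above: the ylim value of 17 and the use of the bar heights as labels are illustrative assumptions, not taken from the thread, and pdf(NULL) is there only so that the code runs without opening a graphics window.

```r
# Null PDF device: lets the sketch run without a display (assumption for portability)
pdf(NULL)

my.values <- 10:15

# barplot() returns the x coordinates of the bar midpoints
x <- barplot(my.values, ylim = c(0, 17))   # ylim chosen to leave room for labels

# pos = 3 places each label just above the point (x, my.values),
# independently of the scale of the y axis
text(x, my.values, labels = my.values, pos = 3)

invisible(dev.off())
```

Because pos works in terms of character size rather than data units, the same call should give a similar gap whether the bars are 5 or 5000 units tall.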
Re: [R] Looping through all possible combinations of cases
Here is another (simpler?) solution:

    # your 1 column data is actually a vector
    myvalues <- 1:10
    names(myvalues) <- LETTERS[1:10]

    # use the QCA package
    library(QCA)
    aa <- createMatrix(rep(2, length(myvalues)))

    # set the number of combinations: 2, 3, 4 or whatever
    combinations <- 2
    sub.aa <- aa[rowSums(aa) == combinations, ]

    result <- apply(sub.aa, 1, function(x) sum(myvalues[x == 1]))
    names(result) <- apply(sub.aa, 1, function(x) paste(names(myvalues)[x == 1], collapse=""))

HTH,
Adrian

On Friday 27 July 2007, Dimitri Liakhovitski wrote:
> Hello!
>
> I have a regular data frame (DATA) with 10 people and 1 column
> ('variable'). Its cases are people with names ('a', 'b', 'c', 'd',
> 'e', 'f', etc.). I would like to write a function that would sum up the
> values on 'variable' of all possible combinations of people, i.e.
>
> 1. I would like to write a loop - in such a way that it loops through
> each possible pair of cases (i.e., ab, ac, ad, etc.) and sums up their
> respective values on 'variable'.
> 2. I would like to write a loop - in such a way that it loops through
> each possible trio of cases (i.e., abc, abd, abe, etc.) and sums up
> their respective values on 'variable'.
> 3. I would like to write a loop - in such a way that it loops through
> each possible quartet of cases (i.e., abcd, abce, abcf, etc.) and sums
> up their respective values on 'variable'.
> etc.
>
> Then, at the end I want to capture all possible combinations that were
> considered (i.e., what elements were combined in it) and get the value
> of the sum for each combination.
>
> How should I do it?
> Thanks a lot!
> Dimitri
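For comparison, the same sums can be sketched in base R with combn(), which enumerates the k-element subsets directly. This is an alternative to the createMatrix() route above, not the method from the post; the variable name k is introduced here for illustration.

```r
# Same toy data as in the post
myvalues <- 1:10
names(myvalues) <- LETTERS[1:10]

k <- 2  # size of each combination: 2, 3, 4, ...

# Sum of the values in every k-element subset
result <- combn(myvalues, k, FUN = sum)

# Label each sum with the names of the cases it combines
names(result) <- combn(names(myvalues), k, FUN = paste, collapse = "")

head(result)
```

Changing k to 3 or 4 gives the trios and quartets asked for in the original question.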
Re: [R] Levene Test with R
On Thursday 05 July 2007, along zeng wrote:
> Hi All,
> is there Levene's test in R? If not, could you give me some advice
> about Levene's test with R?
> Thanks a lot! I am waiting for yours.

From what I found in the archives, Levene's test is not considered well suited for testing the homogeneity of variances, and the recommended alternatives are:

- Ansari-Bradley for two groups (i.e. for t.test)
- Fligner-Killeen for three or more groups (i.e. for ANOVA)

hth,
Adrian
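Both tests mentioned above ship with the stats package as ansari.test() and fligner.test(). A minimal sketch on simulated data follows; the group names, sample sizes and standard deviations are made up for illustration.

```r
set.seed(1)

# Three simulated groups with the same mean but different spread
g1 <- rnorm(30, sd = 1)
g2 <- rnorm(30, sd = 2)
g3 <- rnorm(30, sd = 3)

# Two groups: Ansari-Bradley test for a difference in scale
two_groups <- ansari.test(g1, g2)

# Three or more groups: Fligner-Killeen test of homogeneity of variances
many_groups <- fligner.test(list(g1, g2, g3))

two_groups$p.value
many_groups$p.value
```

A small p-value in either test suggests that the equal-variance assumption is doubtful for the data at hand.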
[R] pie initial angle
Dear all,

I'd like to produce a simple pie chart for a customer (I know it's bad but they insist), and I have some difficulties setting the initial angle. For example:

    pie(c(60, 40), init.angle=14)

and

    pie(c(80, 20), init.angle=338)

both present the slices in the same direction, whereas:

    pie(c(60, 40))
    pie(c(80, 20))

present the slices in different directions.

I read everything I could about the init.angle argument, I even played with various formulas to compute it, but I just can't figure it out. How can I preserve the desired *direction* of the slices?

Many thanks in advance,
Adrian
Re: [R] Odp: Odp: pie initial angle
Thanks Petr and Gabor,

On Tuesday 29 May 2007, Petr PIKAL wrote:
> From simple geometry
>
> pie(c(x, y), init.angle=(300+y/2*360/100)-360)
>
> shall do what you request. Although I am not sure if it is wise.

Yes, this is what I want to do. I agree with all your points re initial angle, I just needed the position of the slices the way they are.
My geometry seems to be poor towards nonexistent :)

All the best,
Adrian
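The geometry behind the formula above can be spelled out. By default pie() draws slices counter-clockwise starting at init.angle, so for two slices x and y the second one is centred at init.angle + 360*x/(x+y) + 180*y/(x+y) degrees. Fixing that centre at 300 degrees and solving for init.angle gives, for x + y = 100, the expression in the quoted reply. A small sketch, where the function name and the centre argument are introduced here for illustration only:

```r
# Initial angle that keeps the centre of the SECOND slice at `centre` degrees.
# centre = 300 reproduces the formula quoted in the thread (an inference from
# that formula, not something stated in the original messages).
init_angle_for <- function(x, y, centre = 300) {
    total      <- x + y
    first_arc  <- 360 * x / total   # angle swept by the first slice
    second_arc <- 360 * y / total   # angle swept by the second slice
    centre - first_arc - second_arc / 2
}

a1 <- init_angle_for(60, 40)
a2 <- init_angle_for(80, 20)

# Usage: pie(c(60, 40), init.angle = a1); pie(c(80, 20), init.angle = a2)
```

Angles are taken modulo 360, so a negative result such as the one for c(80, 20) is equivalent to the same angle plus 360.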
[R] data in packages... a list?
Dear all,

Is it possible to add a list in the data folder when creating a new package? In other words, is data in packages restricted to data.frame only?

Thank you,
Adrian
Re: [R] read.spss (package foreign) and SPSS 15.0 files
Charilaos Skiadas <skiadas at hanover.edu> writes:
> [...]
> I save as csv format all the time, and it offers me a choice to use
> the labels instead of the corresponding numbers. So you shouldn't
> have to lose that labelling.

This is interesting and I tried to do this as well; I don't have access to SPSS 15 (only to version 14 for the moment) but I cannot find the option to save as CSV. Is it a version 15 feature?

Thanks,
Adrian
Re: [R] read.spss (package foreign) and SPSS 15.0 files
On Saturday 14 April 2007 14:06, John Kane wrote:
> [...]
> I cannot remember if I have been using 14 or 14, I think it was 14 and
> I'm not near the machine to check. There does not seem to be a csv
> export in 14 but it looks like you can achieve the same thing by
> using one of the Excel outputs and then dumping the file from there.

Oh, I thought about that too... the only trouble with Excel is its limit of 256 columns and some 65000 rows. Any larger database would be truncated; of course, one could select the variables to export and create multiple Excel files, but that seems rather like overkill.
In the meantime, I use portable files or export to different formats using commercial software like StatTransfer (this one is really good).

Thanks,
Adrian
Re: [R] Sorting rows of a binary matrix
Hello Serguei,

Is this what you need?

    myfunc <- function(x) {
        create <- function(idx) {
            rep.int(c(rep.int(0, 2^(idx-1)), rep.int(1, 2^(idx-1))),
                    2^x/2^idx)
        }
        sapply(rev(seq(x)), create)
    }

    myfunc(3)
         [,1] [,2] [,3]
    [1,]    0    0    0
    [2,]    0    0    1
    [3,]    0    1    0
    [4,]    0    1    1
    [5,]    1    0    0
    [6,]    1    0    1
    [7,]    1    1    0
    [8,]    1    1    1

For numerical values only, this is faster than expand.grid().
Alternatively (for multiple values in separate variables), you could use the function createMatrix() in package QCA.

HTH,
Adrian

On Thursday 22 February 2007 12:50, Serguei Kaniovski wrote:
> Hallo,
>
> The command:
>
> x <- 3
> mat <- as.matrix(expand.grid(rep(list(0:1), x)))
>
> generates a matrix with 2^x rows containing the binary representations
> of the decimals from 0 to (2^x-1), here from 0 to 7. But the rows are
> not sorted in this order.
>
> How can I sort the rows in the ascending order of the decimals they
> represent, preferably without a function which converts binaries to
> decimals (which I have)? Alternatively, generate a matrix that has the
> rows sorted that way?
>
> Thanks,
> Serguei
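To see that the function above really produces the rows in ascending decimal order, it can be checked against expand.grid() with the columns reversed, since expand.grid() varies its first column fastest. The sketch below renames the function to binary_rows for clarity; that name and the check itself are additions, not part of the original post.

```r
# Same construction as myfunc() in the post, under a descriptive name
binary_rows <- function(x) {
    create <- function(idx) {
        rep.int(c(rep.int(0, 2^(idx - 1)), rep.int(1, 2^(idx - 1))),
                2^x / 2^idx)
    }
    sapply(rev(seq(x)), create)
}

m <- binary_rows(3)

# expand.grid() with reversed columns gives the same ordering
alt <- unname(as.matrix(expand.grid(rep(list(0:1), 3)))[, 3:1])

# Decimal value of each row: should run from 0 to 7 in order
decimals <- drop(m %*% 2^(2:0))
decimals
```

So for the original question, reversing the columns of the expand.grid() result is enough; no sorting or binary-to-decimal conversion is needed.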
Re: [R] Sorting rows of a binary matrix
And, for multiple bases:

    myfunc <- function(cols, bases) {
        create <- function(idx) {
            rep.int(c(sapply(seq_len(bases) - 1,
                             function(x) rep.int(x, bases^(idx-1)))),
                    bases^cols/bases^idx)
        }
        sapply(rev(seq_len(cols)), create)
    }

    # For 3 columns in base 2
    myfunc(3, 2)

    # For 3 columns in base 3
    myfunc(3, 3)

hth,
Adrian

On Thursday 22 February 2007 15:00, Adrian Dusa wrote:
> Hello Serguei,
>
> Is this what you need?
> [...]
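The same idea can be written a little more compactly with rep(each = ...), and checked by converting every row back to a decimal number. This is an equivalent rewrite offered as a sketch, not the code from the post; the names all_rows and base are introduced here.

```r
# All rows of length `cols` in a single base, in ascending order
all_rows <- function(cols, base) {
    create <- function(idx) {
        # digits 0..(base-1), each repeated base^(idx-1) times ...
        block <- rep(seq_len(base) - 1, each = base^(idx - 1))
        # ... and the whole block repeated to fill base^cols rows
        rep.int(block, base^cols / base^idx)
    }
    sapply(rev(seq_len(cols)), create)
}

m3 <- all_rows(3, 3)

# Reading each row as a base-3 number should give 0, 1, ..., 26
values <- drop(m3 %*% 3^(2:0))
```

The check also confirms that every row is unique, since 27 rows map to 27 distinct values.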
Re: [R] jump in sequence
On Tuesday 30 January 2007 16:38, Peter Dalgaard wrote:
> [...snip...]
> Is this it?
>
> as.vector(outer(0:2,seq(4,22,9),"+"))
> [1]  4  5  6 13 14 15 22 23 24

Indeed it is :))
Thanks,
Adrian
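Unpacked, the one-liner above adds every offset 0:2 to every starting value and then flattens the resulting matrix column by column. A short sketch with intermediate names (starts, offsets and jumps are labels added here):

```r
starts  <- seq(4, 22, by = 9)   # 4, 13, 22: where each run begins
offsets <- 0:2                  # each run has three consecutive values

# outer() builds a 3 x 3 matrix with one column per starting value;
# as.vector() reads it column by column, which yields the runs in order
grid  <- outer(offsets, starts, "+")
jumps <- as.vector(grid)
jumps
```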
[R] [R-pkgs] version 0.3 of QCA
Dear list members,

A new version of the QCA package is now on CRAN. The QCA package implements the Quine-McCluskey algorithm for Boolean minimizations, according to the Qualitative Comparative Analysis.

Along with the additional improvements in version 0.3-1 (soon to be released on CRAN), this code is about 100 times faster than the previous major release (0.2-6). It can now reasonably work with 11 binary variables, finding a complete (and exact) solution in less than 2 minutes.

This dramatic increase in speed is due to using a mathematical reduction instead of an algorithmic one. This approach opens the way for _exact_ multi-value minimizations, and an even better (and faster) approach is being sought for future versions.

Best,
Adrian
Re: [R] comparing two matrices
Hi Christos,

It's... more or less the same thing. I was looking for a matrix.to.matrix.run.me() function :)

Cheers,
Adrian

On Sunday 21 January 2007 00:56, Christos Hatzis wrote:
> Here is a slightly more compact version of your function which might
> run faster (I did not test timings) since it does not use the sum:
>
> apply(mat2, 1, function(x) which(apply(mat1, 1, function(y) all(x == y)) == TRUE))
>
> -Christos
Re: [R] comparing two matrices
Hello Marc and Dimitris,

There was an error in my first example (therefore not reproducible), so:

    mat1 <- expand.grid(0:2, 0:2, 0:2)
    mat2 <- mat1[c(19, 16, 13, 24, 8), ]

Your solution works if and only if the elements in both matrices are unique. Unfortunately, it does not apply to my matrices, where elements do repeat (only the rows are unique).

    which(apply(matrix(mat1 %in% mat2, dim(mat1)), 1, all))
    integer(0)

    which((mat1 %in% mat2)[1:nrow(mat1)])
    integer(0)

Another solution would be using base 3 operations:

    mat1 <- expand.grid(0:2, 0:2, 0:2)[, 3:1]
    mat2 <- mat1[c(19, 16, 13, 24, 8), ]
    mylines <- mat2[, 1]
    for (i in 2:ncol(mat2)) {mylines <- 3*mylines + mat2[, i]}
    mylines + 1
    [1] 19 16 13 24  8

I was still hoping for a direct matrix function to avoid the for() loop.

Thanks,
Adrian

On Sunday 21 January 2007 01:06, Marc Schwartz wrote:
> On Sun, 2007-01-21 at 00:14 +0200, Adrian Dusa wrote:
>> Dear helpeRs,
>>
>> I have two matrices:
>> mat1 <- expand.grid(0:2, 0:2, 0:2)
>> mat2 <- aa[c(19, 16, 13, 24, 8), ]
>>
>> where mat2 is always a subset of mat1
>>
>> I need to find the corresponding row numbers in mat1 for each row in
>> mat2. For this I have the following code:
>>
>> apply(mat2, 1, function(x) {
>>     which(apply(mat1, 1, function(y) {
>>         sum(x == y)
>>     }) == ncol(mat1))
>> })
>>
>> The code is vectorized, but I wonder if there is a simpler (hence
>> faster) matrix computation that I miss.
>>
>> Thank you,
>> Adrian
>
> I have not fully tested this, but how about:
>
> mat1 <- matrix(1:20, ncol = 4, byrow = TRUE)
> mat2 <- matrix(1:60, ncol = 4, byrow = TRUE)
> mat2 <- mat2[sample(15), ]
>
> mat1
>      [,1] [,2] [,3] [,4]
> [1,]    1    2    3    4
> [2,]    5    6    7    8
> [3,]    9   10   11   12
> [4,]   13   14   15   16
> [5,]   17   18   19   20
>
> mat2
>       [,1] [,2] [,3] [,4]
>  [1,]   13   14   15   16
>  [2,]    5    6    7    8
>  [3,]   41   42   43   44
>  [4,]   17   18   19   20
>  [5,]   21   22   23   24
>  [6,]   25   26   27   28
>  [7,]   53   54   55   56
>  [8,]    9   10   11   12
>  [9,]   57   58   59   60
> [10,]   33   34   35   36
> [11,]   49   50   51   52
> [12,]   45   46   47   48
> [13,]    1    2    3    4
> [14,]   29   30   31   32
> [15,]   37   38   39   40
>
> which(apply(matrix(mat2 %in% mat1, dim(mat2)), 1, all))
> [1]  1  2  4  8 13
>
> HTH,
> Marc Schwartz
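The base 3 loop in the reply above computes, for each row, the number whose base-3 digits are the row's entries. The same quantity can be obtained in one step as a matrix product with the vector of place values, which is one way to avoid the for() loop. A sketch, assuming (as in the post) that mat1 is the full grid with the slowest-varying column first; the names place_values and rows are added here:

```r
# Full grid, columns reversed so that the first column varies slowest
mat1 <- as.matrix(expand.grid(0:2, 0:2, 0:2)[, 3:1])
mat2 <- mat1[c(19, 16, 13, 24, 8), ]

# Place values of a 3-digit base-3 number: 9, 3, 1
place_values <- 3^((ncol(mat1) - 1):0)

# Row index in mat1 = base-3 value of the row + 1
rows <- as.vector(mat2 %*% place_values) + 1
rows
```

This only works because mat1 contains every combination in ascending order; for an arbitrary mat1 the computed values would have to be matched against those of mat1 with match().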
Re: [R] comparing two matrices
On Sunday 21 January 2007 12:04, Dimitris Rizopoulos wrote:
> I think the following should work in your case:
>
> mat1 <- data.matrix(expand.grid(0:2, 0:2, 0:2))
> mat2 <- mat1[c(19, 16, 13, 24, 8), ]
>
> ind1 <- apply(mat1, 1, paste, collapse = "/")
> ind2 <- apply(mat2, 1, paste, collapse = "/")
> match(ind2, ind1)

Oh yes, I thought about that too.
It works fast enough for small matrices, but I deal with very large ones. Using paste() on such matrices decreases the speed dramatically.

Thanks again,
Adrian
[R] multiple bases to decimal (was: comparing two matrices)
Hi again,

I was contemplating the solution using base 3:

    set.seed(3)
    mat2 <- matrix(sample(0:2, 15, replace=T), 5, 3)

Extracting the line numbers is simple:

    bases <- c(3, 3, 3)^(2:0) # or just 3^(2:0)
    colSums(apply(mat2, 1, function(x) x*bases)) + 1
    [1]  7 23 25  8  1

The problem is that sometimes the columns have different numbers of levels, as in:

    mat1 <- expand.grid(0:2, 0:2, 0:1)[,3:1]

Is there any chance to combine different bases in order to obtain the corresponding line numbers? I thought of something like:

    bases <- c(3, 3, 2)^(2:0)

but it doesn't work (sigh).

Thanks for any hint,
Adrian
Re: [R] multiple bases to decimal (was: comparing two matrices)
On Sunday 21 January 2007 16:30, jim holtman wrote:
> I think you are computing your bases in the wrong way. If the data
> represents 3 columns with base 3,3,2, then the multiplier has to be
> c(6,2,1) not c(9,3,1). I think this should compute it correctly:
>
> # create a matrix of all combination of bases 3,3,2
> mat1 <- expand.grid(0:1, 0:2, 0:2)[,3:1]
> base <- c(3,3,2)  # define the bases
> # now create the multiplier
> mbase <- c(rev(cumprod(rev(base))),1)[-1]
> # show the data
> mat1
> base
> mbase
> # combine with original
> cbind(mat1, conv=colSums(apply(mat1, 1, function(x) x*mbase)))

YES!
Thank you so much Jim, this made my day :))

Adrian
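The multiplier in the quoted reply is the usual mixed-radix place value: each column is weighted by the product of the numbers of levels of all columns to its right. A sketch that computes it and checks that every row of the full grid maps back to its own row number; the names levels_per_col and row_id are introduced here, and the matrix product replaces the apply()/colSums() step as a stylistic choice.

```r
# Number of levels in each column, slowest-varying column first
levels_per_col <- c(3, 3, 2)

# Full grid in ascending order (first column varies slowest)
mat1 <- as.matrix(expand.grid(0:1, 0:2, 0:2)[, 3:1])

# Place values: product of the level counts to the right of each column
mbase <- c(rev(cumprod(rev(levels_per_col))), 1)[-1]
mbase

# Row number = mixed-radix value of the row + 1
row_id <- drop(mat1 %*% mbase) + 1
```

With equal bases, for example c(3, 3, 3), the same expression gives c(9, 3, 1), so the single-base case is just a special case of this one.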
[R] comparing two matrices
Dear helpeRs,

I have two matrices:

    mat1 <- expand.grid(0:2, 0:2, 0:2)
    mat2 <- aa[c(19, 16, 13, 24, 8), ]

where mat2 is always a subset of mat1.

I need to find the corresponding row numbers in mat1 for each row in mat2. For this I have the following code:

    apply(mat2, 1, function(x) {
        which(apply(mat1, 1, function(y) {
            sum(x == y)
        }) == ncol(mat1))
    })

The code is vectorized, but I wonder if there is a simpler (hence faster) matrix computation that I miss.

Thank you,
Adrian
Re: [R] a question of substitute
Dear Prof. Ripley,

Thank you for this extensive explanation.
It looks like my first solution is similar to (b): creating new variables inside the wrapper (and new data if not missing).

This course is only introductory, with simple models, and I do point students to each test separately if they want more complicated things.

I'm looking forward to the release of the 2.5.0 version.
Best regards,
Adrian

On Thursday 11 January 2007 03:08, Prof Brian Ripley wrote:
> The 'Right Thing' is for oneway.test() to allow a variable for the
> first argument, and I have altered it in R-patched and R-devel to do
> so. So if your students can make use of R-patched that would be the
> best solution. If not, perhaps you could make a copy of oneway.test
> from R-patched available to them. Normally I would worry about
> namespace issues, but it seems unlikely they would matter here: if
> they did assignInNamespace is likely to work to insert the fix.
>
> Grothendieck's suggestions are steps towards a morass: they may work
> in simple cases but can make more complicated ones worse (such as
> looking for 'data' in the wrong place). These model fitting functions
> have rather precise requirements for where they look for their
> components:
>
>   'data'
>   the environment of 'formula'
>   the environment of the caller
>
> and that includes where they look for 'data'. It is easy to use
> substitute or such to make a literal formula out of 'formula', but
> doing so changes its environment. So one needs to either
>
> (a) fix up an environment within which to evaluate the modified call
> that emulates the scoping rules or
>
> (b) create a new 'data' that has references to all the variables
> needed, and just call the function with the new 'formula' and new
> 'data'.
>
> At first sight model.frame() looks the way to do (b), but it is not,
> since if there are function calls in the formula (e.g. log()) the
> model frame includes the derived variables and not the original ones.
> There are workarounds (e.g. in glmmPQL), like using all.vars, creating
> a formula from that, setting its environment to that of the original
> function and then calling model.frame. This comes up often enough
> that I have contemplated adding a solution to (b) to the stats
> package.
>
> Doing either of these right is really pretty complicated, and not
> something to dash off code in a fairly quick reply (or even to check
> that the code in glmmPQL was general enough to be applicable).
Re: [R] a question of substitute
On Wednesday 10 January 2007 19:03, Gabor Grothendieck wrote:
> Looks like oneway.test has been changed for R 2.5.0.
> Paste the code in this file:
> https://svn.r-project.org/R/trunk/src/library/stats/R/oneway.test.R
> into your session. Then fun.2 from your post will work without
> the workarounds I posted:
>
> fun.2(values ~ group)

Brilliant :)
Super fast change, this is why I love R.

Cheers,
Adrian
[R] a question of substitute
Hi all,

I want to write a wrapper for an analysis of variance and I face a curious problem. Here are two different wrappers:

    fun.1 <- function(formula) {
        summary(aov(formula))
    }

    fun.2 <- function(formula) {
        oneway.test(formula)
    }

    values <- c(15, 8, 17, 7, 26, 12, 8, 11, 16, 9, 16, 24, 20, 19, 9, 17, 11, 8, 15, 6, 14)
    group <- rep(1:3, each=7)

    # While the first wrapper works just fine:
    fun.1(values ~ group)

    # the second throws an error:
    fun.2(values ~ group)
    Error in substitute(formula)[[2]] : object is not subsettable

I also tried binding the two vectors in a data.frame, to no avail.
I did find a hack, creating two new vectors inside the function and creating a fresh formula, so I presume this has something to do with environments.

Could anybody give me a hint on this?
Thank you,
Adrian
Re: [R] a question of substitute
On Tuesday 09 January 2007 15:14, Gabor Grothendieck wrote:
> oneway.test is using substitute on its arguments so its literally
> getting formula rather than the value of formula.

Ah-haa... I understand now. Thanks for the tips, they both work as expected.
Best,
Adrian

> Try these:
>
> fun.3 <- function(formula) {
>     mc <- match.call()
>     mc[[1]] <- as.name("oneway.test")
>     eval.parent(mc)
> }
> fun.3(values ~ group)
>
> fun.4 <- function(formula) {
>     do.call(oneway.test, list(formula))
> }
> fun.4(values ~ group)
>
> On 1/9/07, Adrian Dusa [EMAIL PROTECTED] wrote:
>> Hi all,
>>
>> I want to write a wrapper for an analysis of variance and I face a
>> curious problem.
>> [...]
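The behaviour described above can be reproduced without oneway.test() at all. The sketch below uses a toy stand-in, literal_lhs(), which inspects its unevaluated argument the way the quoted explanation says oneway.test() did at the time; the function names are hypothetical and current versions of oneway.test() no longer behave this way.

```r
# Toy stand-in (hypothetical): looks at the argument exactly as typed by the caller
literal_lhs <- function(formula) substitute(formula)[[2]]

# Direct call from a wrapper: substitute() sees the symbol `formula`,
# which cannot be subset, hence the error reported in the thread
wrapper_bad <- function(formula) literal_lhs(formula)

# do.call() evaluates `formula` first and builds the call from its VALUE,
# so substitute() sees `values ~ group` itself
wrapper_ok <- function(formula) do.call(literal_lhs, list(formula))

res_bad <- tryCatch(wrapper_bad(values ~ group),
                    error = function(e) "error")
res_ok  <- wrapper_ok(values ~ group)

res_bad
res_ok
```

This is the mechanism behind fun.4 above: the wrapper passes the value of its argument rather than the argument's name.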
Re: [R] a question of substitute
On Tuesday 09 January 2007 15:41, Prof Brian Ripley wrote:
> oneway.test expects a literal formula, not a variable containing a
> formula. The help page says
>
>  formula: a formula of the form 'lhs ~ rhs' where 'lhs' gives the
>           sample values and 'rhs' the corresponding groups.
>
> Furthermore, if you had
>
> foo.2 <- function() oneway.test(value ~ group)
>
> it would still not work, as
>
>     data: an optional matrix or data frame (or similar: see
>           'model.frame') containing the variables in the formula
>           'formula'. By default the variables are taken from
>           'environment(formula)'.
>
> I could show you several complicated workarounds, but why do you want
> to do this?

Thank you for your reply. The data argument was exactly the next problem I faced. My workaround involves checking if(missing(data)) and then using different calls to oneway.test(). I am certainly interested in other solutions, as this one is indeed limited.

I do this for the students in the anova class: first checking the homogeneity of variances with fligner.test(), printing the p-value and, based on that, changing the var.equal argument in oneway.test().
It's just for convenience, but they do like having it all-in-one.

Best regards,
Adrian
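The all-in-one helper described above might look roughly like the following in a current version of R, where oneway.test() accepts a formula held in a variable. This is a sketch under stated assumptions, not the wrapper actually used in the course: the function name anova_all_in_one, the alpha threshold of 0.05 and the requirement to pass data explicitly are all choices made here for illustration.

```r
# Hypothetical all-in-one helper: test variance homogeneity first, then
# choose var.equal for the one-way test based on that result
anova_all_in_one <- function(formula, data, alpha = 0.05) {
    fk <- fligner.test(formula, data = data)
    equal <- fk$p.value > alpha   # keep the equal-variance assumption if not rejected
    cat(sprintf("Fligner-Killeen p-value: %.4f (var.equal = %s)\n",
                fk$p.value, equal))
    oneway.test(formula, data = data, var.equal = equal)
}

# Data from the thread, bound into a data frame
d <- data.frame(
    values = c(15, 8, 17, 7, 26, 12, 8, 11, 16, 9, 16, 24, 20, 19, 9,
               17, 11, 8, 15, 6, 14),
    group  = factor(rep(1:3, each = 7))
)

res <- anova_all_in_one(values ~ group, d)
res
```

Choosing a test from the outcome of a preliminary test is a teaching convenience rather than a recommendation; the default var.equal = FALSE (Welch) is a reasonable choice on its own.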
[R] dashed lines and SVG files devSVG(/folderul/unde/salvez/myplot.svg, width=10, height=10) plot(1:10, 1:10) dev.off()
Dear helpers,

I have a question about the SVG device. It works fine, the SVG file is indeed produced, only the graphic differs from the R window. In the SVG file the dashed line is just a regular plain one.

My toy example is:

    library(RSvgDevice)
    devSVG("myplot.svg", width=10, height=10)
    plot(1:10)
    abline(v=5, lty="dashed")
    dev.off()

Is there anything more (or different) I should do?
Many thanks in advance,
Adrian
[R] dashed lines and SVG files
Sorry for duplicating the message, the previous one had an unintended subject line...

Dear helpers,

I have a question about the SVG device. It works fine, the SVG file is indeed produced, only the graphic differs from the R window. In the SVG file the dashed line is just a regular plain one.

My toy example is:

    library(RSvgDevice)
    devSVG("myplot.svg", width=10, height=10)
    plot(1:10)
    abline(v=5, lty="dashed")
    dev.off()

Is there anything more (or different) I should do?
Many thanks in advance,
Adrian
Re: [R] how to replace some objects?
On Tuesday 19 December 2006 13:49, Michael Kubovy wrote:
> On Dec 19, 2006, at 3:05 AM, Zhang Jian wrote:
>> I want to replace some objects in one row or column. For example,
>> One column: a,b,a,c,b,b,a,a,c.
>> I want to replace a with 1, b with 2, and c with 3.
>> Like this: 1,2,1,3,2,2,1,1,3.
>
> let <- c('a', 'b', 'a', 'c', 'b', 'b', 'a', 'a', 'c')
> library(car)
> num <- recode(let, " 'a' = 1; 'b' = 2; else = 3 ")

Or, since the initial vector has letters only:

    as.numeric(factor(let))
    [1] 1 2 1 3 2 2 1 1 3

Hth,
Adrian
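The factor trick above relies on the letters sorting into the desired order. When the mapping is arbitrary, a named lookup vector is a common base-R alternative that needs no extra package. A sketch with both approaches side by side; the names codes, num_lookup and num_factor are added here for illustration.

```r
let <- c("a", "b", "a", "c", "b", "b", "a", "a", "c")

# Explicit mapping: works for any codes, in any order
codes <- c(a = 1, b = 2, c = 3)
num_lookup <- unname(codes[let])

# Factor shortcut: codes follow the sorted order of the unique values
num_factor <- as.numeric(factor(let))

num_lookup
num_factor
```

With the lookup vector, changing the mapping (say, c = 10) only requires editing codes, whereas the factor approach always yields 1, 2, 3, ... in sorted order.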
Re: [R] Count cases by indicator
Dear Serguei and Andy,

I was away for a few days, so I apologize for this late reply. I cannot help noticing that Serguei's problem is somewhat suited to the QCA package. For example, the 2^k combinations can be created simply with:

    library(QCA)
    cmat <- createMatrix(9)

As to the problem itself, the solution would be:

    m <- as.data.frame(matrix(df$x, ncol=9, byrow=TRUE))
    rownames(m) <- levels(df$case)
    m$OUT <- 0
    truthTable(m, outcome="OUT", show.cases=TRUE)

The result seconds Andy's result. If Serguei wants the entire matrix, then use the "inside" argument:

    truthTable(df, outcome="OUT", inside=TRUE)

I hope it helps,
Adrian

On Monday 04 December 2006 15:24, Liaw, Andy wrote:
> I might be missing something, but the data you showed don't seem to
> match your expectation. Firstly, 111111111 in binary is 511 in decimal,
> so your coordinates are off by 1. Secondly, for the data you've shown,
> the matrix equivalent looks like:
>
> m <- matrix(df$x, ncol=9, byrow=TRUE)
> rownames(m) <- levels(df$case)
> print(m)
>          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
> 093/0188    0    0    1    0    1    1    1    1    1
> 093/0206    0    0    0    0    0    0    0    0    0
> 093/0216    0    1    1    1    1    1    0    1    1
> 093/0305    0    1    1    1    1    1    1    1    1
> 093/0325    0    0    0    0    0    0    0    0    0
> 093/0449    0    0    0    0    0    0    0    0    0
> 093/0473    0    0    1    1    1    1    1    1    1
> 093/0499    0    0    1    1    1    1    1    1    1
>
> The counts of unique occurrences are:
>
> table(do.call(paste, c(as.data.frame(m), sep="")))
> 000000000 001011111 001111111 011111011 011111111
>         3         1         2         1         1
>
> which do not agree with yours. If I understood what you wanted, I
> would do:
>
> table(rowSums(matrix(2^(0:8) * df$x, ncol=9, byrow=TRUE)))
>   0 446 500 508 510
>   3   1   1   2   1
>
> Andy
>
> From: Serguei Kaniovski
> > Hi,
> > In the data below, "case" represents cases, "x" binary states. Each
> > case has exactly 9 x, i.e. is a binary vector of length 9. There are
> > 2^9=512 possible combinations of binary states in a given case, i.e.
> > 512 possible vectors. I generate these in the order of the decimals
> > the vectors represent, as:
> >
> > cmat<-as.matrix(expand.grid(rep(list(0:1),9)))
> > cmat<-cmat[nrow(cmat):1,ncol(cmat):1]
> >
> > cmat contains the binary vectors as rows.
> >
> > QUESTION: I would like to know how often each of the 512 vectors
> > occurs in "case". With these data, the output should be a vector with
> > 2^9=512 coordinates, having 2,2,1,3 as, respectively, the coordinate
> > numbers 129, 193, 449, 512, and zeros in all other coordinates.
> >
> > Thank you for your help,
> > Serguei
> >
> > df<-read.delim("clipboard",sep=";")
> >
> > DATA:
> > case;x
> > 093/0188;0
> > 093/0188;0
> > 093/0188;1
> > 093/0188;0
> > 093/0188;1
> > 093/0188;1
> > 093/0188;1
> > 093/0188;1
> > 093/0188;1
> > 093/0206;0
> > 093/0206;0
> > 093/0206;0
> > 093/0206;0
> > 093/0206;0
> > 093/0206;0
> > 093/0206;0
> > 093/0206;0
> > 093/0206;0
> > 093/0216;0
> > 093/0216;1
> > 093/0216;1
> > 093/0216;1
> > 093/0216;1
> > 093/0216;1
> > 093/0216;0
> > 093/0216;1
> > 093/0216;1
> > 093/0305;0
> > 093/0305;1
> > 093/0305;1
> > 093/0305;1
> > 093/0305;1
> > 093/0305;1
> > 093/0305;1
> > 093/0305;1
> > 093/0305;1
> > 093/0325;0
> > 093/0325;0
> > 093/0325;0
> > 093/0325;0
> > 093/0325;0
> > 093/0325;0
> > 093/0325;0
> > 093/0325;0
> > 093/0325;0
> > 093/0449;0
> > 093/0449;0
> > 093/0449;0
> > 093/0449;0
> > 093/0449;0
> > 093/0449;0
> > 093/0449;0
> > 093/0449;0
> > 093/0449;0
> > 093/0473;0
> > 093/0473;0
> > 093/0473;1
> > 093/0473;1
> > 093/0473;1
> > 093/0473;1
> > 093/0473;1
> > 093/0473;1
> > 093/0473;1
> > 093/0499;0
> > 093/0499;0
> > 093/0499;1
> > 093/0499;1
> > 093/0499;1
> > 093/0499;1
> > 093/0499;1
> > 093/0499;1
> > 093/0499;1
> >
> > --
> > Austrian Institute of Economic Research (WIFO)
> > Name: Serguei Kaniovski
> > P.O.Box 91, Arsenal Objekt 20, 1103 Vienna, Austria
> > Tel.: +43-1-7982601-231 Fax: +43-1-7989386
> > Mail: [EMAIL PROTECTED]
> > http://www.wifo.ac.at/Serguei.Kaniovski
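For readers who want the full 512-coordinate count vector from this thread without any extra package, here is a minimal base R sketch. It assumes the coordinates are numbered in ascending order of the decimal value, with the first of the nine states as the most significant bit; Serguei's own cmat is built in the reverse order, so his coordinate numbers would differ. The vector x below simply retypes the x column of the posted data.

```r
# the x column of the posted data, nine states per case, in case order
x <- c(0,0,1,0,1,1,1,1,1,  rep(0, 9),
       0,1,1,1,1,1,0,1,1,  0,1,1,1,1,1,1,1,1,
       rep(0, 9),          rep(0, 9),
       0,0,1,1,1,1,1,1,1,  0,0,1,1,1,1,1,1,1)
m <- matrix(x, ncol = 9, byrow = TRUE)

# decimal value of each row, first column = most significant bit
idx <- as.vector(m %*% 2^(8:0))

# coordinate k holds the number of cases whose vector has decimal value k - 1
counts <- tabulate(idx + 1, nbins = 512)

which(counts > 0)      # occupied coordinates
counts[counts > 0]     # how many cases fall on each of them
```

The non-zero counts are 3, 1, 2, 1, 1, which is the same pattern Andy's table() call reports, only placed inside the complete vector.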
[R] weight cases?
Dear all,

This is probably a stupid question for which I have a solution, which unfortunately is not as straightforward as I'd like. I wonder if there's a simple way to apply a weighting variable to the cases of a data frame (well, I'm sure there is, I just cannot find it). My toy example:

    my.data <- data.frame(var1=c("c", "e", "a", "d", "b"),
                          var2=c("E", "B", "A", "C", "D"),
                          weight=c(1, 2, 1, 1, 1))
    table(my.data$var1, my.data$var2)

        A B C D E
      a 1 0 0 0 0
      b 0 0 0 1 0
      c 0 0 0 0 1
      d 0 0 1 0 0
      e 0 1 0 0 0

Applying the weight variable, the table should yield a value of 2 for the eB combination:

    table(my.data$var1, my.data$var2)

        A B C D E
      a 1 0 0 0 0
      b 0 0 0 1 0
      c 0 0 0 0 1
      d 0 0 1 0 0
      e 0 2 0 0 0

Thanks in advance,
Adrian
Re: [R] weight cases?
Thanks for this, Gabor.

Sometimes weights can take non-integer values, like 0.9:

    rep(letters[1:3], c(1, 0.9, 1.6))
    [1] "a" "c"

What if the weight variable were:

    my.data$weight <- c(0.4, 2, 1.3, 0.9, 1)

The way I found the solution was to compute the unweighted table, then find the weight for each unique combination and multiply it by the corresponding row-column entry in the table. The solution, though, is not very satisfactory:

    my.data$var1 <- as.factor(my.data$var1)
    my.data$var2 <- as.factor(my.data$var2)
    total <- expand.grid(levels(my.data$var1), levels(my.data$var2))
    rowsmy.data <- apply(unique(my.data[,1:2]), 1, paste, collapse="")
    rowstotal <- apply(total, 1, paste, collapse="")
    total$weight <- 0
    total$weight[sapply(rowsmy.data, function(x) which(rowstotal == x))] <- unique(my.data)[,3]
    (unweighted <- table(my.data$var1, my.data$var2))
    round(unweighted*total$weight, 0)

Yet another question: how would the weight variable be applied to correlate two numerical variables?

Best,
Adrian

On Saturday 14 October 2006 16:00, Gabor Grothendieck wrote:
> Try this:
>
> table(lapply(my.data, rep, my.data$weight)[1:2])
>
> On 10/14/06, Adrian Dusa [EMAIL PROTECTED] wrote:
> > [...]
Re: [R] weight cases?
On Saturday 14 October 2006 16:52, Gabor Grothendieck wrote:
> Try this (and round the result to make it comparable to your
> calculation):
>
> xtabs(weight ~ var1 + var2, my.data)

Oh yes... :) It was so simple. Thanks for the cov.wt() as well.

Regards,
Adrian
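The two suggestions in this thread, xtabs() for the weighted table and cov.wt() for a weighted correlation, can be tried together on the toy data. This is only a sketch: the numeric variables x and y are made up for illustration, and the weights are rescaled to sum to 1 before being passed to cov.wt(), which is harmless in current R and was required by older versions.

```r
my.data <- data.frame(var1 = c("c", "e", "a", "d", "b"),
                      var2 = c("E", "B", "A", "C", "D"),
                      weight = c(0.4, 2, 1.3, 0.9, 1))

# weighted cross-tabulation: each cell is the sum of the weights of its cases
wtab <- xtabs(weight ~ var1 + var2, data = my.data)
wtab

# hypothetical numeric variables, one value per case
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)

# weighted correlation matrix of x and y
w <- my.data$weight / sum(my.data$weight)
wcor <- cov.wt(cbind(x, y), wt = w, cor = TRUE)$cor
wcor["x", "y"]
```

Because xtabs() sums the weights directly, fractional weights need no rounding or row replication.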
Re: [R] How to repeat vectors ?
Maybe this one?

    MyMatrix <- matrix(1:4, nrow=2)
    MyMatrix
         [,1] [,2]
    [1,]    1    3
    [2,]    2    4
    MyMatrix[rep(seq(nrow(MyMatrix)), each=2), ]
         [,1] [,2]
    [1,]    1    3
    [2,]    1    3
    [3,]    2    4
    [4,]    2    4

HTH,
Adrian

On Saturday 30 September 2006 09:33, Tong Wang wrote:
> I just figured out a way to do this:
>
> rep.vec <- function(X,n) return(t(array(rep(X,n),c(length(X),n))))
>
> Then, apply(MyMatrix, 2, rep.vec, 2)
>
> Is there a better way? Is there an internal function to repeat a
> vector or matrix? Thanks a lot.
>
> ----- Original Message -----
> From: Tong Wang [EMAIL PROTECTED]
> Date: Friday, September 29, 2006 11:23 pm
> Subject: How to repeat vectors ?
> To: r-help@stat.math.ethz.ch
>
> > Hi,
> > If I have a matrix, say
> >     a11 a12
> >     a21 a22
> > is there a routine to get:
> >     a11 a12
> >     a11 a12
> >     a21 a22
> >     a21 a22
> > Thanks a lot for any help.
> > best
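The row-index trick above has two variants worth comparing, depending on whether each row should be repeated in place or the whole matrix should be stacked. A small sketch (the object names each2 and times2 are only illustrative):

```r
MyMatrix <- matrix(1:4, nrow = 2)
rows <- seq_len(nrow(MyMatrix))

# each = 2: row 1, row 1, row 2, row 2 (the layout asked for in the thread)
each2 <- MyMatrix[rep(rows, each = 2), ]

# times = 2: row 1, row 2, row 1, row 2 (the whole matrix stacked twice)
times2 <- MyMatrix[rep(rows, times = 2), ]

each2
times2
```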
Re: [R] delete a entire vector of a dataframe
This works too:

    t.d$V712 <- NULL

On Thursday 21 September 2006 22:28, Gavin Simpson wrote:
> On Thu, 2006-09-21 at 20:01 +0200, Thomas Preuth wrote:
> > Hello,
> > I want to delete a vector and tried rm(t.d$V712). This did not
> > work; the message was "could not find variable". I thought the $
> > defines the vector in a dataframe; when I just type t.d$V712, the
> > content of this vector is displayed.
> > Greetings, Thomas
>
> You can't do that, and that is not exactly what the error message
> said - which should have told you something was wrong with your
> thinking, as it also said:
>
> 1: remove: variable "$" was not found
>
> Instead, copy over the object, minus the column you want to delete:
>
> dat <- as.data.frame(matrix(rnorm(100), nrow = 10))
> names(dat) <- paste("Var", 1:10, sep = "_")
> dat
> # now we don't want column Var_6
> dat <- dat[, -6]
> # or, if we don't know which column it is, look it up by name
> not.want <- which(names(dat) %in% "Var_7") # now don't want Var_7
> dat <- dat[, -not.want]
> dat
>
> This can be extended to many variables:
>
> not.want <- which(names(dat) %in% c("Var_10", "Var_2", "Var_8"))
> dat <- dat[, -not.want]
> dat # only 1, 3, 4, 5, 9 left
>
> HTH
> G
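Both ways of dropping columns shown in this thread can be tried on a small data frame. The sketch below also illustrates one caveat of the negative which() index: when none of the names match, which() returns an empty vector and the subscript selects no columns at all. The names unwanted, none and empty are only illustrative.

```r
dat <- as.data.frame(matrix(1:30, nrow = 3))
names(dat) <- paste("Var", 1:10, sep = "_")

# drop a single column by assigning NULL to it
dat$Var_6 <- NULL

# drop several columns by name, keeping whatever is not listed
unwanted <- c("Var_10", "Var_2", "Var_8")
dat2 <- dat[, setdiff(names(dat), unwanted), drop = FALSE]
names(dat2)

# caveat: no match means an empty negative index, which drops every column
none <- which(names(dat2) %in% "no_such_column")
empty <- dat2[, -none, drop = FALSE]
ncol(empty)
```

The setdiff() form avoids that problem because a name that is not present simply removes nothing.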
[R] test the tcltk package
Dear all,

Could anybody using (K)ubuntu (or Linux in general) confirm whether this is a general problem or just my box? The problem relates to the Options window in the Rcmdr package (which looks fine in John Fox's Quantian). The last option (Default font) is stubborn and won't be set; it behaves strangely (e.g. I type 3 and 2 appears). This option should have a slider to drag left and right, but in my configuration it is missing; at this link you can see what it looks like:

http://www.roda.ro/Options.png

What should I do to test whether the R tcltk package works fine? I have R 2.3.1, tcl and tk version 8.4 (dev packages installed as well).

Thanks in advance,
Adrian
Re: [R] R Site Search directly from Firefox's address bar
On Friday 18 August 2006 10:08, Romain Francois wrote:
> Le 17.08.2006 20:56, Adrian Dusa a écrit :
> > [...]
> It also breaks every usage of the Google Feeling Lucky default
> behaviour, which is really useful I think. There are R-related Firefox
> search plugins on Mycroft. Find more info on that on the wiki:
> http://wiki.r-project.org/rwiki/doku.php?id=tips:misc:firefox-search-plugins

Thanks, very nice! I have already added a couple of search plugins. Well, it's just a matter of taste what one uses Firefox's address bar for: Google Feeling Lucky vs. R Site Search (I prefer the latter).

All the best,
Adrian
[R] R Site Search directly from Firefox's address bar
Dear list,

For all those interested who use Firefox as their main browser, here is a quick way to make R-related searches: type about:config in the address bar, search for keyword.url and modify it to

    http://finzi.psych.upenn.edu/cgi-bin/namazu.cgi?idxname=functions&idxname=docs&idxname=Rhelp02a&query=

From now on, every keyword you type in the address bar will take you directly to the first page of hits at http://finzi.psych.upenn.edu

I found this very helpful.

Best,
Adrian
Re: [R] R Site Search directly from Firefox's address bar
On Thursday 17 August 2006 21:41, Peter Dalgaard wrote:
> [...]
> Breaks the feature that you get to www.r-project.org just by typing
> "r", though...

Oh, this is very simple to fix. I created a bookmark named R with the above location and assigned it the keyword "r". Now, every time I type r in the address bar it takes me to www.r-project.org :)

--
Adrian Dusa
Re: [R] RE : tcltk library on linux
On Friday 11 August 2006 10:31, Yohan CHOUKROUN wrote:
> Thank you for your answer, but I already use the .deb package. I have
> also compiled the source code, but the result is the same... I still
> get the same error. I'm going crazy ;-)
> Has anyone had the same problem (and found the solution!)?
> Thanks in advance
> Yohan

Only a week ago there was a similar thread, and it was solved by installing the .deb package from CRAN. Assuming you use the latest version, Dapper (you didn't specify), add this line to your sources.list:

    deb http://cran.R-project.org/bin/linux/ubuntu/ dapper/

Then:

    $ sudo apt-get update
    $ sudo apt-get install r-cran-rcmdr

(as Dirk Eddelbuettel advised). It _should_ work flawlessly.

HTH,
Adrian
[R] missing documentation entries
Dear list,

When creating a package, there are always many little utility functions that belong to the internal kitchen of the main, documented functions. Now, when checking the sources with R CMD check, I get a warning for those little functions that are not documented. I have two questions:

- is it mandatory to document _all_ functions (will the source package be rejected by CRAN otherwise)?
- if not, is there a way to tell R which functions I don't want to document?

Thanks,
Adrian
Re: [R] missing documentation entries
On Wednesday 09 August 2006 14:37, Prof Brian Ripley wrote:
> This is discussed in `Writing R Extensions', which both points you to
> the 'internal' keyword, and (in the current version) mentions using
> name spaces to hide such functions.
>
> This really was a question for R-devel: please do study the posting
> guide. `R-devel is intended for questions and discussion about code
> development in R.'

Thank you very much for your reply, I'll post to R-devel from now on. It seems to me that name spaces are the solution for my problem.

Adrian
[R] objects and environments
Dear list,

I have two functions created in the same environment, fun1 and fun2. fun2 is called by fun1, but fun2 should use an object which is created in fun1:

    fun1 <- function(x) {
        ifelse(somecondition, bb <- "o", bb <- "*")
        ## mymatrix is created, then
        myresult <- apply(mymatrix, 1, fun2)
    }

    fun2 <- function(idx) {
        if (bb == "o") {
            # do something with idx
        } else {
            # do something else with idx
        }
    }

What should I do to have bb available in fun2? I tried everything I could with sys.parent(), sys.frame(), parent.env(), but it just doesn't want to work. I have three solutions, but none of them is satisfactory; inside fun1:

1. assign("bb", aa, .GlobalEnv) # don't want to do that, do I?
2. assign("bb", aa, 1) # for some reason aa appears in the .GlobalEnv anyway
3. pass bb as an argument to fun2, but this would require:
   apply(mymatrix, 1, function(idx) fun2(idx, bb)) # which is not elegant

I played further with assign and get, but there's something I'm missing:

    fun1 <- function() {
        e2 <- new.env()
        assign("bb", 4, e2)
        fun2()
    }

    fun2 <- function(idx) {
        get("bb", e2)
    }

    fun1()
    Error in get("bb", e2) : object "e2" not found

Any hint would be highly appreciated,
Adrian
Re: [R] objects and environments
On Wednesday 09 August 2006 18:31, Dimitris Rizopoulos wrote:
> try this:
>
> fun1 <- function(x) {
>     environment(fun2) <- environment()
>     ifelse(somecondition, bb <- "o", bb <- "*")
>     ## mymatrix is created, then
>     myresult <- apply(mymatrix, 1, fun2)
> }

Beautiful :) Thanks very much Dimitris, I was out of energy after several hours of struggling with this.

Best,
Adrian
Re: [R] objects and environments
On Wednesday 09 August 2006 19:14, Gabor Grothendieck wrote:
> Dimitris has already provided the solution, but I just thought I
> would mention that your third alternative can be written:
>
> apply(mymatrix, 1, fun2, bb = bb)
>
> (assuming fun2 has arguments idx and bb), which is not nearly so ugly,
> so you might reconsider whether it's OK for you to just pass bb.

Aah-aaa!! :) So that's the way to do it... I don't know how many times I read the help for apply and I missed it every time. Well, I learned many things today, I feel much better now :)

Thank you,
Adrian
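The two remedies from this thread can be compared side by side in a runnable form. This is only a sketch: the bodies of the functions (sum versus product of a row) and the argument use.sum are made up, since the original code only says "do something with idx".

```r
# Variant 1: pass bb explicitly through the "..." argument of apply()
fun2 <- function(idx, bb) {
  if (bb == "o") sum(idx) else prod(idx)
}
fun1 <- function(m, use.sum = TRUE) {
  bb <- if (use.sum) "o" else "*"
  apply(m, 1, fun2, bb = bb)
}

# Variant 2: define the helper inside the caller, so that lexical
# scoping finds bb in the enclosing function without any assign()/get()
fun1b <- function(m, use.sum = TRUE) {
  bb <- if (use.sum) "o" else "*"
  helper <- function(idx) if (bb == "o") sum(idx) else prod(idx)
  apply(m, 1, helper)
}

m <- matrix(1:6, nrow = 2)   # rows are (1, 3, 5) and (2, 4, 6)
fun1(m)                      # row sums
fun1(m, use.sum = FALSE)     # row products
fun1b(m)                     # same as fun1(m)
```

The get("bb", e2) attempt in the question fails for the same reason variant 2 works: a function looks up free variables in the environment where it was defined, not in the one it is called from.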
Re: [R] Tcltk package
On Tuesday 01 August 2006 19:24, John McHenry wrote:
> [...]
> Yes, I built R myself. I couldn't find a debian package for R 2.3.1.
> The latest available is 2.2.1.

Oh, but there is... right on CRAN. For Dapper just add this line to your sources.list:

    deb http://cran.R-project.org/bin/linux/ubuntu/ dapper/

This repository has lots of other packages compiled for Ubuntu, feel free to take a look.

HTH,
Adrian
Re: [R] Extracting from a matrix w/o for-loop
Hi,

On Friday 28 July 2006 20:21, Horace Tso wrote:
> Unless there is another level of complexity that I didn't see here,
> wouldn't it be a simple application of sapply, as follows:
>
> sapply( 1:dim(M2)[[1]], function(x) M1[M2[x,1], M2[x,2]] )

Andy's previous answer involving matrix indexing (M1[M2]) is simpler, but just for the sake of it: since we're dealing with matrices, it is a case not of sapply but of _apply_:

    apply(M2, 1, function(x) M1[x[1], x[2]])

My 2c,
Adrian
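The three approaches mentioned in this thread give the same answer, which is easy to check on a small example. M1 and M2 below are made-up stand-ins, since the original matrices are not shown: M2 holds one (row, column) pair per row.

```r
M1 <- matrix(1:12, nrow = 3)            # M1[i, j] equals i + 3 * (j - 1)
M2 <- cbind(c(1, 3, 2), c(2, 4, 1))     # pairs (1,2), (3,4), (2,1)

# matrix indexing: a two-column index matrix picks one element per row
by.index <- M1[M2]

# the apply() and sapply() versions from the thread
by.apply  <- apply(M2, 1, function(x) M1[x[1], x[2]])
by.sapply <- sapply(1:dim(M2)[[1]], function(x) M1[M2[x, 1], M2[x, 2]])

by.index
```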
Re: [R] Getting rid of for loops
Hi Kevin,

Regarding your first question, try this:

    library(combinat)
    all.pairs <- combn2(5:40)
    marker1 <- as.matrix(names(qtl)[all.pairs[, 1]])
    marker1 <- as.matrix(names(qtl)[all.pairs[, 2]])
    myfun <- function(idx) {
        summary(aov(qtl$CPP ~ qtl[,idx[1]] * qtl[,idx[2]]))[[1]]$"Pr(>F)"[3]
    }
    p.interaction <- as.matrix(apply(all.pairs, 1, myfun))

HTH,
Adrian

On Monday 17 July 2006 05:18, Kevin J Emerson wrote:
> Hello R-users!
>
> I have a style question. I know that for loops are somewhat frowned
> upon in R, and I was trying to figure out a nice way to do something
> without using loops, but figured that I could get it done quickly
> using them. I am now looking to see what kind of tricks I can use to
> make this code a bit more aesthetically appealing to other R users
> (and learn something about R along the way...).
>
> Here's the problem. I have a data.frame with 4 columns of dependent
> variables and then ~35 columns of predictor variables (factors) [for
> those interested, it is a qtl problem, where the predictors are
> genotypes at DNA markers and the dependent variable is a biological
> trait]. I want to go through all pairwise combinations of predictor
> variables and perform an anova with two predictors and their
> interaction on a given dependent variable. I then want to store the
> p-value of the interaction term, along with the predictor variable
> information. So I want to end up with a dataframe at the end with the
> two variable names and the interaction p-value in each row, for all
> pairwise combinations of predictors.
>
> I used the following code:
>
> # qtl is the original data.frame, and my dependent var in this case is
> # qtl$CPP.
> marker1 <- NULL
> marker2 <- NULL
> p.interaction <- NULL
> for ( i in 5:40) { # cols 5 - 41 are the predictor factors
>     for (j in (i+1):41) {
>         marker1 <- rbind(marker1,names(qtl)[i])
>         marker2 <- rbind(marker2,names(qtl)[j])
>         tmp2 <- summary(aov(tmp$CPP ~ tmp[,i] * tmp[,j]))[[1]]
>         p.interaction <- rbind(p.interaction, tmp2$"Pr(>F)"[3])
>     }
> }
>
> I have two questions:
> (1) is there a nicer way to do this without having to invoke for
> loops?
> (2) my other dependent variables are categorical in nature. I need
> basically the same information - I am looking for information
> regarding the interaction of predictors on a categorical variable. Any
> ideas on what tests to use? (I am new to analysis of all-categorical
> data).
>
> Thanks in advance!
> Kevin
>
> --
> Kevin Emerson
> Center for Ecology and Evolutionary Biology
> 1210 University of Oregon
> Eugene, OR 97403 USA
> [EMAIL PROTECTED]
Re: [R] Getting rid of for loops
Hi again,

There is a slight error there; it should have been marker2 on the fourth line:

    all.pairs <- combn2(5:40)
    marker1 <- names(qtl)[all.pairs[, 1]]
    marker2 <- names(qtl)[all.pairs[, 2]]
    myfun <- function(idx) {
        summary(aov(qtl$CPP ~ qtl[,idx[1]] * qtl[,idx[2]]))[[1]]$"Pr(>F)"[3]
    }
    p.interaction <- apply(all.pairs, 1, myfun)

Actually, you don't need as.matrix there; just cbind all your vectors to obtain the final dataframe:

    finally <- as.data.frame(cbind(marker1, marker2, p.interaction))

Adrian

On Monday 17 July 2006 18:40, Adrian DUSA wrote:
> [...]
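The loop-free idea in this thread can also be written with base R's combn(), without the combinat package. The sketch below is not the code from the thread: qtl here is a small made-up stand-in with one trait and three marker factors, and the names markers and interaction.p are illustrative. With the real data, markers would be the names of all predictor columns (Kevin's loop covers columns 5 to 41).

```r
# hypothetical stand-in for the qtl data frame
set.seed(42)
qtl <- data.frame(
  CPP = rnorm(60),
  m1  = factor(rep(c("A", "B"), times = 30)),
  m2  = factor(rep(c("A", "A", "B", "B"), times = 15)),
  m3  = factor(rep(c("A", "A", "A", "B", "B", "B"), times = 10))
)
markers <- c("m1", "m2", "m3")

# combn() returns one pair per column; transpose to one pair per row
all.pairs <- t(combn(markers, 2))

# p-value of the interaction term, the third row of the ANOVA table
interaction.p <- function(pair) {
  fml <- as.formula(paste("CPP ~", pair[1], "*", pair[2]))
  summary(aov(fml, data = qtl))[[1]][["Pr(>F)"]][3]
}

result <- data.frame(marker1 = all.pairs[, 1],
                     marker2 = all.pairs[, 2],
                     p.interaction = apply(all.pairs, 1, interaction.p))
result
```

Building the result with data.frame() rather than cbind() keeps p.interaction numeric; cbind() on character and numeric vectors would turn the p-values into text.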
[R] [R-pkgs] new package QCAGUI
Dear list members,

I'm pleased to let you know there's a new package on CRAN called QCAGUI, a graphical user interface for the QCA package. This is a stripped down version of John Fox's Rcmdr package, plus a couple of menus for QCA. My thanks to John Fox for his encouragement and advice.

Regards,
Adrian
Re: [R] checking package dependencies
On Saturday 20 May 2006 20:14, Uwe Ligges wrote:
> Richard M. Heiberger wrote:
> > [...]
> 1. cygwin is not supported.
> 2. "Access is denied" suggests this is not an R problem but a problem
> of your (OS/cygwin?) setup.
> Uwe Ligges

I also thank you for the answer. The problem seems to have vanished (and I haven't done anything in particular). When it didn't work, I remember I checked the permissions and everything was OK. I really have no idea what went wrong, but now it works.

Best,
Adrian
[R] checking package dependencies
Dear all,

I seem to be unable to check a source package since I upgraded R to 2.3.0 (Ubuntu Linux 5.1). I get this:

    * checking package dependencies ... ERROR
    tools:::.check_package_depends("/home/adi/Work/QCAGUI")

I have even tried with R-patched, same result. My Renviron does specify the path to the installed packages (and all depending packages are installed):

    R_LIBS=${R_LIBS-'/home/adi/Installed/R/site-library:/usr/local/lib/R/site-library:/usr/local/lib/R/library'}

Has something changed about defining R_LIBS?

Thank you in advance,
Adrian
[R] choosing a particular object
Hello all,

I'd like to create a function which would do some analysis on a particular object, which should be specified in advance. Something like:

    ls()
    [1] "aa" "bb" "cc"

    Object <- "bb"
    var.name <- "q2"
    testfunction <- function(obj.name, var.name) {
        temp <- give.me.the.object.called(Object)
        table(temp[, var.name])
    }

This should perform the same thing as:

    table(bb$q2)

Is this possible?

TIA,
Adrian
Re: [R] choosing a particular object
Thanks, it's exactly what I want. Adrian

On Friday 31 March 2006 12:59, Adaikalavan Ramasamy wrote:
> Try
> test.fn <- function(obj.name, var.name="q2"){
>     stopifnot( is.character(obj.name) & is.character(var.name) )
>     x <- subset( get(obj.name), select=var.name )
>     table(x)
> }
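For readers following the thread, here is a minimal sketch of the get() approach suggested above. The data frame bb and the helper name tabulate_by_name are made up for illustration; only get() and table() come from the thread.

```r
# Hypothetical data frame standing in for 'bb' from the question
bb <- data.frame(q2 = c("yes", "no", "yes"), stringsAsFactors = FALSE)

# Look an object up by its name (a character string) and tabulate one column.
# 'tabulate_by_name' is an illustrative name, not a function from any package.
tabulate_by_name <- function(obj.name, var.name, envir = parent.frame()) {
    stopifnot(is.character(obj.name), is.character(var.name))
    obj <- get(obj.name, envir = envir)   # get() returns the object with that name
    table(obj[[var.name]])
}

res <- tabulate_by_name("bb", "q2")       # same counts as table(bb$q2)
```

Passing the environment explicitly keeps the lookup predictable when the helper is called from inside another function.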
[R] ifelse problem
Dear all,

There is something I'm missing in order to understand the following behavior:

    aa <- c("test", "name")
    ifelse(any(nchar(aa) < 3), aa[-which(nchar(aa) < 3)], aa)
    [1] "test"
    any(nchar(aa) < 3)
    [1] FALSE

Shouldn't the ifelse function return the whole aa vector? Using if and else separately, I get the correct result...

    if (any(nchar(aa) < 3)) {
        aa[-which(nchar(aa) < 3)]
    } else {
        aa
    }
    [1] "test" "name"

Thanks in advance, Adrian
Re: [R] ifelse problem
On Friday 10 March 2006 16:31, Uwe Ligges wrote:
> [...]
> aa[!(nchar(aa) < 3)]

Thanks very much, I got it now.

All the best, Adrian
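A short sketch of why the two forms differ, using the vector from the question. ifelse() returns a result shaped like its test, so a length-one test gives a length-one answer, while if/else returns whichever whole branch is chosen. The variable names short, full and kept are only for illustration.

```r
aa <- c("test", "name")

# ifelse() is vectorised over its test: a single FALSE picks only the
# first element of the 'no' argument
short <- ifelse(any(nchar(aa) < 3), aa[-which(nchar(aa) < 3)], aa)

# if/else evaluates one branch and returns it whole
full <- if (any(nchar(aa) < 3)) aa[-which(nchar(aa) < 3)] else aa

# Logical indexing needs no test at all; it also avoids the trap that
# aa[-which(...)] returns an empty vector when which() finds nothing
kept <- aa[!(nchar(aa) < 3)]
```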
Re: [R] QCA adn Fuzzy
Dear Prof. Gott,

On Monday 06 March 2006 14:37, R Gott wrote:
> Does anybody know of anything that will help me do Quantitative Comparative Analysis (QCA) and/or Fuzzy set analysis?? Or failing that Quine?
> ta rg
> Prof R Gott, Durham University, UK

There is a package called QCA which (in its first release) performs only crisp set analysis. I am currently adapting a Graphical User Interface, but the functions are nevertheless useful. For fuzzy set analysis, please consider Charles Ragin's web site http://www.u.arizona.edu/%7Ecragin/fsQCA/index.shtml which offers software (still not complete, though). Also worth considering is a good program called Tosmana (http://www.tosmana.org/), which does multi-value QCA. I am considering writing the inclusion algorithms in the next releases of my package, but it is going to take a little while. Any contributions and/or feedback are more than welcome.

I hope this helps you, Adrian
[R] [R-pkgs] new package QCA
Dear list members,

I am pleased to let you know that R has met QCA, Qualitative Comparative Analysis. This package has a few functions that implement the Quine-McCluskey algorithm, adapted to the social sciences by Charles Ragin (as described in his 1987 book, The Comparative Method). Future versions of this package will have more functions to address fuzzy-set minimization problems as well.

Big thanks to the r-help list members, supportive as ever, especially to Gabor Grothendieck and Martin Maechler for excellent ideas in the key parts of the algorithm.

Adrian
Re: [R] yet another vectorization question
On Tuesday 31 January 2006 06:55, Gabor Grothendieck wrote:
> On 1/30/06, Patricia J. Hawkins [EMAIL PROTECTED] wrote:
> > [...snip...]
> > #which generalizes to:
> > bb <- matrix(1:50, ncol=10, nrow=5, byrow=TRUE)
> > bv <- as.vector(bb)
> > ai <- as.vector(aa) + rep((1:nrow(aa)-1)*10, each=3)
> > bv[ai] <- c(0,1,0)
> > bb <- matrix(bv, ncol=10, nrow=5, byrow=TRUE)
> > bb
> Try this:
> bb <- matrix(NA, ncol=10, nrow=5)
> bb[cbind(c(col(aa)), c(aa))] <- c(0,1,0)

Thank you both very much, I had a very good time exercising your solutions. The col function especially is useful (and insightful). I wrote a working solution based on this type of matrix indexing, which is... unfortunately... _slower_ than the for loop, especially in large loops. It seems that creating the necessary row and column indexes to cbind is much slower than copying chunks of data at certain columns:

    library(combinat) # for the combn function
    system.time(all.expr(LETTERS[1:12]))
    [1] 6.12 0.39 6.54 0.00 0.00
    system.time(all.expr2(LETTERS[1:12]))
    [1] 8.62 0.27 8.91 0.00 0.00

If anyone is interested, I uploaded both functions here: http://www.roda.ro/all.expr.R

Thank you, Adrian
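To make the matrix-indexing idea concrete, the sketch below runs the loop from the simplified example in this thread next to the cbind() version and stores both results for comparison. The way aa is generated here (three distinct positions per column) is an assumption made so the example is unambiguous; it is not the poster's data.

```r
set.seed(5)
# 3 x 5 matrix of column positions; each column holds three distinct values in 1..10
aa <- sapply(1:5, function(i) sample(10, 3))

# Loop version: row i of bb receives c(0, 1, 0) at the positions in column i of aa
bb_loop <- matrix(NA, ncol = 10, nrow = 5)
for (i in 1:ncol(aa)) bb_loop[i, aa[, i]] <- c(0, 1, 0)

# Matrix-indexing version: a two-column index matrix gives one (row, column)
# pair per element of aa; c(0, 1, 0) is recycled along those pairs
bb_vec <- matrix(NA, ncol = 10, nrow = 5)
bb_vec[cbind(c(col(aa)), c(aa))] <- c(0, 1, 0)
```

Both objects hold the same values, so which form to prefer is a matter of timing on the actual data, as the message above points out.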
[R] yet another vectorization question
Dear R-helpers,

I'm trying to develop a function which specifies all possible expressions that can be formed using a certain number of variables. For example, with three variables A, B and C we can have
- presence/absence of A, B and C taken one at a time
- presence/absence of combinations of two of them
- presence/absence of all three

        A B C
    1   0
    2   1
    3     0
    4     1
    5       0
    6       1
    7   0 0
    8   0 1
    9   1 0
    10  1 1
    11  0   0
    12  0   1
    13  1   0
    14  1   1
    15    0 0
    16    0 1
    17    1 0
    18    1 1
    19  0 0 0
    20  0 0 1
    21  0 1 0
    22  0 1 1
    23  1 0 0
    24  1 0 1
    25  1 1 0
    26  1 1 1

My function (pasted below), while producing the desired result, still needs some more vectorizing; in particular, I can't figure out how one could modify the elements of a matrix using apply on a different matrix... To produce the above outcome, I use:

    all.expr(LETTERS[1:3])

    all.expr <- function(column.names) {
        ncolumns <- length(column.names)
        return.matrix <- matrix(NA, nrow=(3^ncolumns - 1), ncol=ncolumns)
        colnames(return.matrix) <- column.names
        rownames(return.matrix) <- 1:nrow(return.matrix)
        start.row <- 1
        all.combn <- sapply(1:ncolumns, function(idx) {
            as.matrix(combn(ncolumns, idx))
        }, simplify=FALSE)
        for (j in 1:length(all.combn)) {
            idk <- all.combn[[j]]
            tt <- matrix(NA, ncol=nrow(idk), nrow=2^nrow(idk))
            for (i in 1:nrow(idk)) {
                tt[,i] <- c(rep(0, 2^(nrow(idk) - i)), rep(1, 2^(nrow(idk) - i)))
            }
            ## This is the _slow_ part, where I don't know how to vectorize:
            for (k in 1:ncol(idk)) {
                end.row <- start.row + nrow(tt) - 1
                return.matrix[start.row:end.row, idk[ , k]] <- tt
                start.row <- end.row + 1
            }
            ## How can one modify return.matrix using apply on idk?
        }
        return.matrix[is.na(return.matrix)] <- ""
        return.matrix
    }

Thank you in advance, Adrian
Re: [R] yet another vectorization question
Adrian DUSA adi at roda.ro writes:
> I'm trying to develop a function [...snip...]

Sorry for the traffic, I forgot to say that I'm using library(combinat) for the combn function...

Thank you, Adrian
Re: [R] yet another vectorization question
On Monday 30 January 2006 14:40, Philippe Grosjean wrote:
> Hello, Not exactly the same. By the way, why do you use do.call()? Couldn't you do simply:
> expand.grid(split(t(replicate(3, c(0, 1, NA))), 1:3))
> Best, Philippe Grosjean
>
> Jacques VESLOT wrote:
> > this looks similar:
> > do.call(expand.grid,split(t(replicate(3,c(0,1,NA))),1:3))

Sigh, what a pity. It is indeed not the same... So close to a one-liner though. I come back to my original question: is it possible to modify the content of a matrix, using apply on a different matrix? In my original function, the slow part is:

    ## ...
    for (k in 1:ncol(idk)) {
        end.row <- start.row + nrow(tt) - 1
        return.matrix[start.row:end.row, idk[ , k]] <- tt
        start.row <- end.row + 1
    }
    ## ...

I'd like to use apply on the idk matrix (to get rid of the for loop) and write the contents of tt into return.matrix...

Best, Adrian
Re: [R] yet another vectorization question
On Monday 30 January 2006 14:40, Philippe Grosjean wrote:
> Hello, Not exactly the same. By the way, why do you use do.call()? Couldn't you do simply:
> expand.grid(split(t(replicate(3, c(0, 1, NA))), 1:3))

Just for the sake of it, the above can be made even simpler with:

    expand.grid(lapply(1:3, function(x) c(0, 1, NA)))

Best, Adrian
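A small sketch of how far the expand.grid() one-liner gets. It yields 27 rows for three variables, one more than the 26 expressions wanted, because it includes the row where every variable is NA. Dropping that row and sorting by the number of conditions present comes close to the all.expr() layout, although the row order inside each group is not claimed to match.

```r
grid <- expand.grid(lapply(1:3, function(x) c(0, 1, NA)))
names(grid) <- LETTERS[1:3]

# Remove the single row in which all three variables are missing
grid <- grid[rowSums(!is.na(grid)) > 0, ]

# Sort by how many conditions take part in each expression (1, then 2, then 3)
grid <- grid[order(rowSums(!is.na(grid))), ]

n_expr <- nrow(grid)   # 3^3 - 1 expressions
```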
Re: [R] yet another vectorization question
On Monday 30 January 2006 21:44, Patrick Burns wrote:
> I tried to let this pass, but failed:
> lapply(1:3, function(x) c(0, 1, NA))
> might more clearly be written as
> rep(list(c(0, 1, NA)), 3)

Indeed! Excellent, thanks :)

Hmm, I was just thinking perhaps my first example was too cluttered to spot an immediate solution. With your permission, I came up with a simpler example (I hope I don't upset anybody by being too persistent):

    set.seed(5)
    aa <- matrix(sample(10, 15, replace=TRUE), ncol=5)
    bb <- matrix(NA, ncol=10, nrow=5)
    for (i in 1:ncol(aa)) bb[i, aa[, i]] <- c(0, 1, 0)

Is there any possibility to vectorize this for loop? (sometimes I have hundreds of columns in the aa matrix)

Many big thanks in advance, Adrian
Re: [R] R newbie example code question
Ted.Harding at nessie.mcc.ac.uk writes:
> [...] The solution I finally opted for, and still use, is based (in a Linux environment) on including the following code in your .Rprofile file:
>
> .xthelp <- function() {
>     tdir <- tempdir()
>     pgr <- paste(tdir, "/pgr", sep="")
>     con <- file(pgr, "w")
>     cat("#! /bin/bash\n", file=con)
>     cat("export HLPFIL=`mktemp ", tdir, "/R_hlp.XX`\n", sep="", file=con)
>     cat("cat > $HLPFIL\nxterm -e less $HLPFIL &\n", file=con)
>     close(con)
>     system(paste("chmod 755 ", pgr, sep=""))
>     options(pager=pgr)
> }
> .xthelp()
> rm(.xthelp)
>
> (and it's also specific to the 'bash' shell because of the "#! /bin/bash\n", but you should be able to change this appropriately). The above was posted by Roger Bivand on 27 May. [...]

I also like the function, it's beautiful. I wonder if anyone could help me with the correct syntax for my bash shell (I assume this is the problem) because I get this error when starting R and when installing a new package:

    Error in rm(.xthelp) : cannot remove variables from base namespace

Thank you, Adrian
Re: [R] R newbie example code question
On Friday 13 January 2006 17:45, Ted Harding wrote:
> On 13-Jan-06 Michael Friendly wrote:
> > Ted: Your .xthelp is extremely useful, help on Linux being otherwise quite awkward to use since a pager in the same window make it hard to cut/paste examples --- where 'more' or 'less' really means 'instead of' :-)
> Glad you found it useful. I find it indispensable! For the record: this is not my code but Roger Bivand's, it being the one out of several suggestions on that thread which I decided to adopt. I still admire the neat way he wrapped it all up. [...snip...]

I also like the function very much, but I get an annoying error when starting R or when installing a new package:

    Error in rm(.xthelp) : cannot remove variables from base namespace

I assume it has something to do with my bash shell, but I have no idea what to do. I run R inside Kubuntu 5.10 Breezy (compiled from source).

    R.Version()
    $platform
    [1] "i686-pc-linux-gnu"
    $arch
    [1] "i686"
    $os
    [1] "linux-gnu"
    $system
    [1] "i686, linux-gnu"
    $status
    [1] ""
    $major
    [1] "2"
    $minor
    [1] "2.1"
    $year
    [1] "2005"
    $month
    [1] "12"
    $day
    [1] "20"
    $"svn rev"
    [1] "36812"
    $language
    [1] "R"

Thank you, Adrian
[R] more on the daisy function
Dear R-helpers,

First of all, a happy new year to everyone! I successfully used the daisy function (from package cluster) to find which two rows from a dataframe differ by only one value, and I now want to come up with a simpler way to find _which_ value makes the difference between any such pair of rows. Consider a very small example (the actual data has thousands of rows):

    input <- matrix(letters[c(1,2,1,2,2,3,2,1,1,2,2,2)], ncol=3)
    input
      X1 X2 X3
    1  a  b  a
    2  b  c  b
    3  a  b  b
    4  b  a  b

I am interested in the rows which differ by one value only; I easily do that with:

    library(cluster)
    distance <- daisy(as.data.frame(input))*ncol(input)
    distance
    Dissimilarities :
      1 2 3
    2 3
    3 1 2
    4 3 1 2
    Metric : mixed ; Types = N, N, N
    Number of objects : 4

The first and the third rows differ only with respect to variable V3, and the second and the fourth rows differ only with respect to variable V2. Now I want to replace the differing values by an "x"; currently my code is:

    distance <- as.matrix(distance)
    distance[!upper.tri(distance)] <- NA
    to.be.compared <- as.matrix(which(distance == 1, arr.ind=TRUE))
    logical.result <- t(apply(to.be.compared, 1, function(idx) {input[idx[1], ] == input[idx[2], ]}))
    result <- t(sapply(1:nrow(to.be.compared), function(idx) {input[to.be.compared[idx, 1], ]}))
    result[!logical.result] <- "x"
    as.data.frame(result)
      V1 V2 V3
    1  a  b  x
    2  b  x  b

I wonder if the daisy function could be persuaded to output an object similar to the dissimilarities one; it would be fantastic to also get something like:

    First.difference.found:
      1 2 3
    2 1
    3 3 1
    4 1 2 1

Here, 3 means the third variable (V3), on which the first and third rows differ. I could try to do that myself, but I don't know where to find the Fortran code daisy uses.

Thanks for any hint, Adrian
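As a hedged illustration of the question above, the differing column can be located in base R without touching daisy itself. The sketch compares every pair of rows directly, so it is meant for small inputs only; the names pairs, diffs and which_col are invented for this example.

```r
input <- matrix(letters[c(1,2,1,2,2,3,2,1,1,2,2,2)], ncol = 3)

# All pairs of row numbers, one pair per row of 'pairs'
pairs <- t(combn(nrow(input), 2))

# TRUE where the two rows of a pair disagree, one row of 'diffs' per pair
diffs <- input[pairs[, 1], , drop = FALSE] != input[pairs[, 2], , drop = FALSE]

# Keep the pairs that differ in exactly one position
one_off <- rowSums(diffs) == 1

# Column number of the single TRUE in each kept row
which_col <- max.col(diffs[one_off, , drop = FALSE] * 1, ties.method = "first")

out <- cbind(pairs[one_off, , drop = FALSE], which_col)
```

For the toy data, out lists rows 1 and 3 with column 3, and rows 2 and 4 with column 2, matching the description in the message.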
Re: [R] millions of comparisons, speed wanted
The daisy function is _very_ good! I have been able to use it for nominal variables as well, simply by:

    daisy(input)*ncol(input)

Now, for a very large number of rows (say 5000), daisy works for about 3 minutes using the swap space. I probably need more RAM (only 512 MB on my computer). But at least I get a result... :) For relatively small input matrices, it increased the speed by a factor of 3. Way to go!

Best, Adrian

On 12/16/05, Martin Maechler [EMAIL PROTECTED] wrote:
> I have not taken the time to look into this example, but daisy() from the (recommended, hence part of R) package 'cluster' is more flexible than dist(), particularly in the case of NAs and for (a mixture of continuous and) categorical variables. It uses a version of Gower's formula in order to deal with NAs and asymmetric binary variables. The example below looks well matched to this problem.
> Regards, Martin Maechler, ETH Zurich
[R] millions of comparisons, speed wanted
Dear all,

I have a 10-column matrix which has 2^10=1024 unique rows, with values 0 and 1. What I would like to do is a little complicated; in short, for a subset (say 1000 rows) I want to perform pairwise comparisons between all rows, find out which rows differ by only one value, replace that value by "x", get rid of the comparisons that differ by more than one value and repeat the algorithm until no further minimization is possible. Any row that hasn't been minimized is kept for the next iteration.

For 1,000 rows, there are almost 500,000 pairs, but in the next iterations I could get some 5,000 rows, which generates something like 12.5 million pairs, and that takes a _very_ long time. The code I have created (10 lines, below) is super fast (using vectorization) but only for a reasonable number of rows. I am searching for:
- ways to improve my code (if possible)
- ideas: create C code for the slow parts? use MySQL? other ways?

As a toy example, having an input matrix called "input", my algorithm looks like this:

    ## code start
    ncolumns <- 6
    input <- bincombinations(ncolumns) # from package e1071
    # subset, let's say 97% of rows
    input <- input[sample(2^ncolumns, round(2^ncolumns*0.97, 0)), ]
    minimized <- 1
    while (sum(minimized) > 0) {
        minimized <- logical(nrow(input))
        to.be.compared <- combn2(1:nrow(input)) # from package combinat
        # the following line takes _a lot_ of time, for millions of comparisons
        logical.result <- apply(to.be.compared, 1, function(idx) input[idx[1], ] == input[idx[2], ])
        compare.minimized <- which(colSums(!logical.result) == 1)
        logical.result <- logical.result[, compare.minimized]
        result <- sapply(compare.minimized, function(idx) input[to.be.compared[idx, 1], ])
        result[!logical.result] <- "x"
        minimized[unique(as.vector(to.be.compared[compare.minimized, ]))] <- TRUE
        if (sum(minimized) > 0) {
            input <- rbind(input[!minimized, ], unique(t(result)))
        }
    }
    ## code end

Any suggestion is welcome, thank you very much in advance.

Adrian
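One possible direction for the slow pairwise step, sketched under the assumption that the matrix still holds only 0 and 1 (so it applies to the first pass, before any "x" has been written): the number of positions in which two rows differ can be computed for all pairs at once with two matrix products. The input here is the full 4-column truth table built with expand.grid(), not the poster's data.

```r
# All 16 rows of a 4-column 0/1 table (illustrative input)
input <- as.matrix(expand.grid(rep(list(0:1), 4)))

# Entry [i, j] counts positions where row i has 1 and row j has 0, plus the
# positions where row i has 0 and row j has 1: the number of differing values
hamming <- input %*% t(1 - input) + (1 - input) %*% t(input)

# Pairs of rows differing in exactly one position, each pair listed once
pairs <- which(hamming == 1 & upper.tri(hamming), arr.ind = TRUE)
```

The products build an n by n matrix, so memory grows with the square of the number of rows; for several thousand rows that is still far smaller than millions of apply() calls, but it is a trade-off rather than a free gain.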
Re: [R] Help on a matrix task
On Tuesday 06 December 2005 14:41, JeeBee wrote:
> [...]
> N = 4
> input_numbers = seq((2^N)-1, 0, -1)
> # convert to binary matrix
> input_mat = NULL
> for(i in seq(N-1,0,-1)) {
>     new_col = input_numbers %% 2
>     input_mat = cbind(new_col, input_mat)
>     input_numbers = (input_numbers - new_col) / 2
> }
> colnames(input_mat) = NULL

A little late, but wouldn't it be simpler to create input_mat with:

    N <- 4
    input_mat <- matrix(NA, ncol=N, nrow=2^N)
    for (i in 1:N) input_mat[,i] <- c(rep(0, 2^(N - i)), rep(1, 2^(N - i)))

HTH, Adrian
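A quick check of what the shorter loop produces: each column is a block of 0s followed by a block of 1s, recycled down the column, so the rows count upwards in binary from 0 to 2^N - 1. The quoted code starts from seq((2^N)-1, 0, -1), so its rows run in the opposite order. The name as_number is only for illustration.

```r
N <- 4
input_mat <- matrix(NA, ncol = N, nrow = 2^N)
for (i in 1:N) {
    # block of 0s then block of 1s, recycled to fill all 2^N rows
    input_mat[, i] <- c(rep(0, 2^(N - i)), rep(1, 2^(N - i)))
}

# Read each row as a binary number, most significant bit in column 1
as_number <- as.vector(input_mat %*% 2^((N - 1):0))

# Reversing the rows gives the descending order used in the quoted code
descending <- input_mat[nrow(input_mat):1, ]
```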
[R] attributes of a data.frame
Dear all,

I noticed that a data.frame has four attributes:
- names
- row.names
- class
- variable.labels

While one can use the first three (i.e. names(foo) or class(foo)), the fourth one can only be used via:

    attributes(foo)$variable.labels

(which is kind of a tedious thing to type). Is it, or would it be, possible to simply use:

    variable.labels(foo)

like the first three attributes? I tried:

    varlab <- function(x) attributes(x)$variable.labels

but then I cannot use this to assign a specific label:

    varlab(foo)[1] <- "some string"
    Error: couldn't find function "varlab<-"

Thank you, Adrian
Re: [R] attributes of a data.frame
On Monday 21 November 2005 22:41, Duncan Murdoch wrote:
> [...snip...]
> Not all dataframes have the variable.labels attribute. I'm guessing you've installed some contributed package to add them, or are importing an SPSS datafile using read.spss. So don't expect varlab() or variable.labels() function to be a standard R function.

Aa-haa... of course you are right: I read them via read.spss. I understand. Now, just for the sake of it, would it be wrong to make it standard? Is there a special reason not to?

> If you want to define it, definitions like this should work (but I can't test them):
>
> varlab <- function(foo) attr(foo, "variable.labels")
> "varlab<-" <- function(foo, label, value) {
>     attr(foo, "variable.labels")[label] <- value
>     foo
> }
>
> Use them like this:
>
> varlab(x) # to see the labels
> varlab(x, "varname") <- "label" # to set one
>
> Duncan Murdoch

Thank you for the tip; I'll certainly use it.

Adrian
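The definitions above can be tried on a small data frame. In this sketch the data frame foo and its labels are invented, and the attribute is attached by hand rather than by read.spss; the getter and the replacement function follow the pattern suggested in the message.

```r
varlab <- function(x) attr(x, "variable.labels")

# A replacement function: R rewrites  varlab(x, "name") <- v  as
# x <- `varlab<-`(x, "name", value = v), so the last argument must be 'value'
`varlab<-` <- function(x, label, value) {
    labels <- attr(x, "variable.labels")
    labels[label] <- value
    attr(x, "variable.labels") <- labels
    x
}

# Hypothetical data frame with a hand-made variable.labels attribute
foo <- data.frame(age = c(24, 35, 28), gender = c("Male", "Female", "Male"))
attr(foo, "variable.labels") <- c(age = "Age", gender = "Gendr")

varlab(foo, "gender") <- "Gender"   # set one label by name
```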
Re: [R] attributes of a data.frame
On Monday 21 November 2005 23:00, Duncan Murdoch wrote:
> [...snip...]
> I think it's just that the R core developers don't see the need for them. If something is worth documenting, then you should write an .Rd file or a vignette about it, and that gives you more flexibility than a one line label. I think there are definitely developers out there who disagree with this point of view, and I'm pretty sure I've seen a contributed package that offered support for this, but I can't remember which one right now. So that's another reason why it's not in the base: it doesn't need to be, you can just go find and install that contributed package!
> Duncan Murdoch

I got it, it's logical. Well, one could always use Hmisc, which does these things very well.

Thank you again, Adrian
[R] help with apply, please
Dear list,

I have a problem with a toy example:

    mtrx <- matrix(c(1,1,0,1,1,1,0,1,1,0,0,1), nrow=3)
    rownames(mtrx) <- letters[1:3]

I would like to determine the minimum combination of rows that covers all columns with at least a 1. None of the rows covers all columns; all three rows together clearly cover all columns, but there are simpler combinations (1st and 3rd, or 2nd and 3rd) which also cover all columns. I solved this problem by creating a second logical matrix which contains all possible combinations of rows:

    tt <- matrix(as.logical(c(1,0,0,0,1,0,0,0,1,1,1,0,1,0,1,0,1,1,1,1,1)), nrow=3)

and then subsetting the first matrix and checking if all columns are covered. This solution, though, is highly inefficient and I am certain that a combination of apply or something will do.

    ###
    possibles <- NULL
    length.possibles <- NULL
    ## I guess the minimum solution has half the number of rows
    guesstimate <- floor(nrow(tt)/2) + nrow(tt) %% 2
    checked <- logical(nrow(tt))
    repeat {
        ifelse(checked[guesstimate], break, checked[guesstimate] <- TRUE)
        partials <- as.matrix(tt[, colSums(tt) == guesstimate])
        layer.solution <- logical(ncol(partials))
        for (j in 1:ncol(partials)) {
            if (length(which(colSums(mtrx[partials[, j], ]) > 0)) == ncol(mtrx)) {
                layer.solution[j] <- TRUE
            }
        }
        if (sum(layer.solution) == 0) {
            if (!is.null(possibles)) break
            guesstimate <- guesstimate + 1
        } else {
            for (j in which(layer.solution)) {
                possible.solution <- rownames(mtrx)[partials[, j]]
                possibles[[length(possibles) + 1]] <- possible.solution
                length.possibles <- c(length.possibles, length(possible.solution))
            }
            guesstimate <- guesstimate - 1
        }
    }
    final.solution <- possibles[which(length.possibles == min(length.possibles))]
    ###

More explicitly (if useful), it is about reducing a prime implicants chart in a Quine-McCluskey boolean minimisation algorithm. I tried following the original algorithm, applying row dominance and column dominance, but (as I am not a computer scientist) I am unable to apply it.

If you have a better solution for this, I would be grateful if you'd share it.

Thank you in advance, Adrian
Re: [R] help with apply, please
On Saturday 19 November 2005 17:24, Gabor Grothendieck wrote:
> [...snip...]
> Although the above is not wrong I should have removed the rbind which is no longer needed and simplifying it further, as it seems that lp will do the rep for you itself for certain arguments, gives:
>
> lp("min", rep(1,3), t(mtrx), ">=", 1)$solution # 1 0 1

Thank you Gabor, this solution is superb (you never stop amazing me :) Now... it only finds _one_ of the multiple minimum solutions. In the toy example, there are two minimum solutions, hence I reckon the output should have been a list with:

    [[1]]
    [1] 1 0 1
    [[2]]
    [1] 0 1 1

Also, thanks to Duncan, and yes, I do very much care about finding the smallest possible solutions (if I correctly understand your question). It seems that the lp function is very promising, but can I use it to find _all_ minimum solutions?

Adrian
Re: [R] help with apply, please
On Saturday 19 November 2005 19:17, Patrick Burns wrote:
> [snip...]
> One cheat would be to do the LP problem multiple times with the rows of your matrix randomly permuted. Assuming you keep track of the real rows, you could then get a sense of how many solutions there might be.

Thanks for the answer. The trick does work (i.e. it finds all minimum solutions) provided that I permute the rows a sufficient number of times. And I have to compare each solution to the existing (unique) ones, which takes a lot of time... In your experience, what would be the definition of "multiple times" for large matrices?

My (dumb) solution is guaranteed to find all possible minimums, because it checks every possible combination. For large matrices, though, this would be really slow. I wonder if that could be vectorized in some way; before the LP function, I was thinking there might be a more efficient way to loop over all possible columns (using perhaps the apply family).

Thanks again, Adrian
Re: [R] help with apply, please
Dear Ted,

On Saturday 19 November 2005 20:51, Ted Harding wrote:
> [...snip...]
> There is bound to be a good algorithm out there somewhere for finding a minimal covering set but I don't know it! Comments?
> Best wishes to all, Ted.

My case is probably a subset of your general algorithm. Peeking at the computer science web pages on the Quine-McCluskey algorithm, I learned that there are ways to simplify a matrix (prime implicants chart) before trying to find the minimum solutions. For example:

1. Row dominance

    0 0 1 1 0 0
    0 1 1 1 0 0

The second row contains all the elements that the first row contains, therefore the first row (dominated) can be dropped.

2. Column dominance

    0 1
    0 1
    1 1
    1 1
    0 0

The second column dominates the first column, therefore we can drop the second (dominating) column.

In a Quine-McCluskey algorithm, the number of rows will always be much lower than the number of columns, and applying the two principles above will make the matrix even simpler. There are algorithms written in other languages (like Java) freely available on the Internet, but I have no idea how to adapt them to R.

Best, Adrian
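The two reduction rules described above can be sketched in R as follows. This is an illustration of the rules as stated in the message, not code from the QCA package; the function names are made up, ties between identical rows or columns are resolved by keeping the first one, and the column rule assumes every column contains at least one 1.

```r
# Row dominance: drop row i if some other row j has a 1 wherever row i has one
drop_dominated_rows <- function(m) {
    n <- nrow(m)
    dominated <- logical(n)
    for (i in seq_len(n)) for (j in seq_len(n)) {
        if (i == j) next
        covers   <- all(m[j, ] >= m[i, ])
        strictly <- any(m[j, ] > m[i, ])
        if (covers && (strictly || j < i)) dominated[i] <- TRUE
    }
    m[!dominated, , drop = FALSE]
}

# Column dominance: drop column j if it has a 1 wherever some other column i does,
# because any row chosen to cover column i then covers column j as well
drop_dominating_cols <- function(m) {
    p <- ncol(m)
    dominating <- logical(p)
    for (j in seq_len(p)) for (i in seq_len(p)) {
        if (i == j) next
        covers   <- all(m[, j] >= m[, i])
        strictly <- any(m[, j] > m[, i])
        if (covers && (strictly || i < j)) dominating[j] <- TRUE
    }
    m[, !dominating, drop = FALSE]
}

# The two small charts from the message
rows_chart <- rbind(c(0, 0, 1, 1, 0, 0), c(0, 1, 1, 1, 0, 0))
cols_chart <- cbind(c(0, 0, 1, 1, 0), c(1, 1, 1, 1, 0))
reduced_rows <- drop_dominated_rows(rows_chart)    # keeps only the second row
reduced_cols <- drop_dominating_cols(cols_chart)   # keeps only the first column
```

In practice the two rules are usually applied repeatedly, one after the other, until the chart stops shrinking.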
Re: [R] help with apply, please
On Saturday 19 November 2005 20:51, Ted Harding wrote:
> [..snip...]
> There is bound to be a good algorithm out there somewhere for finding a minimal covering set but I don't know it!
> Best wishes to all, Ted.

I found this presentation very explicit: http://www.cs.ualberta.ca/~amaral/courses/329/webslides/Topic5-QuineMcCluskey/sld079.htm

Best wishes, Adrian
Re: [R] help with apply, please
On Saturday 19 November 2005 22:09, Gabor Grothendieck wrote:
> Getting back to your original question of using apply, solving the LP gives us the number of components in any minimal solution and exhaustive search of all solutions with that many components can be done using combinations from gtools and apply like this:
>
> library(gtools) # needed for combinations
> soln <- lp("min", rep(1,3), rbind(t(mtrx)), rep(">=", 4), rep(1,4))$solution
> k <- sum(soln)
> m <- nrow(mtrx)
> combos <- combinations(m,k)
> combos[apply(combos, 1, function(idx) all(colSums(mtrx[idx,]))),]
>
> In the example we get:
>
>      [,1] [,2]
> [1,]    1    3
> [2,]    2    3
>
> which says that rows 1 and 3 of mtrx form one solution and rows 2 and 3 of mtrx form another solution.

I'm speechless. It is exactly what I needed. A billion thanks!

Adrian
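For readers without the lpSolve and gtools packages, the same answer can be reached for small charts with base R alone. The sketch below swaps the LP step for a plain search over increasing subset sizes using combn(), so it is an alternative to the code above and not a copy of it; min_covers is a hypothetical name, and the search is exhaustive and therefore only sensible for a modest number of rows.

```r
mtrx <- matrix(c(1,1,0,1,1,1,0,1,1,0,0,1), nrow = 3)
rownames(mtrx) <- letters[1:3]

# Try subsets of 1 row, then 2 rows, and so on; stop at the first size
# for which at least one subset covers every column
min_covers <- function(m) {
    for (k in seq_len(nrow(m))) {
        combos <- combn(nrow(m), k)          # one subset of rows per column
        ok <- apply(combos, 2, function(idx) {
            all(colSums(m[idx, , drop = FALSE]) > 0)
        })
        if (any(ok)) return(combos[, ok, drop = FALSE])
    }
    NULL                                     # some column is never covered
}

sol <- min_covers(mtrx)   # each column of 'sol' is one minimal set of rows
```

For the toy matrix, sol has two columns, (1, 3) and (2, 3), the same two solutions reported in the message.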
[R] returning a modified fix()-ed dataframe
Dear all,

In order to ease the transition from SPSS to R for some of my colleagues, I am trying to create a function which would show the variables and their labels (if those exist), using the function label in package Hmisc. A toy example would be this:

    my.data <- data.frame(age=c(24,35,28), gender=c("Male", "Female", "Male"))
    require(Hmisc)
    label(my.data$age) <- "Respondent's age"
    label(my.data$gender) <- "Responent's gender"

    variables <- function(x) {
        dataf <- data.frame(variable=NA, label=NA)
        varlab <- NA
        for (i in 1:length(names(x))) {
            dataf[i,1] <- names(x)[i]
            dataf[i,2] <- label(x[,i])
            varlab[i] <- label(x[,i])
        }
        fix(dataf) # I assume this would return a modified dataf
        for (i in which(varlab != dataf[,2])) {
            label(x[,i]) <- dataf[i,2]
        }
    }

Now, say during fix() one modified "Responent's gender" into "Respondent's gender" (the previous one missed a "d"). The trouble I'm having is returning the modified object, with the modified labels. It should be easy, I feel it, but I just can't get it.

Thank you in advance, Adrian
Re: [R] returning a modified fix()-ed dataframe
On Friday 07 October 2005 20:55, Sundar Dorai-Raj wrote:
> Adrian DUSA wrote:
> > [...snip...]
>
> Hi, Adrian,
> You need to assign fix(dataf) to something:
>
>     my.data <- data.frame(age=c(24,35,28), gender=c("Male", "Female", "Male"))
>     require(Hmisc)
>     label(my.data$age) <- "Respondent's age"
>     label(my.data$gender) <- "Responent's gender"
>
>     variables <- function(x) {
>         dataf <- data.frame(variable=NA, label=NA)
>         varlab <- NA
>         for (i in 1:length(names(x))) {
>             dataf[i,1] <- names(x)[i]
>             dataf[i,2] <- label(x[,i])
>             varlab[i] <- label(x[,i])
>         }
>         dataf <- fix(dataf) # I assume this would return a modified dataf
>         for (i in which(varlab != dataf[,2])) {
>             label(x[,i]) <- dataf[i,2]
>         }
>         # don't forget to return dataf
>         dataf
>     }
>
>     variables(my.data)
>
> HTH,
> --sundar

Hi Sundar,

Hm... the new function correctly returns (and prints) dataf, but I need to return my.data. I also thought about returning my.data, but if it were a large data frame, printing it wouldn't be so nice. Basically, I would need to somehow silently return the (modified) input data frame.

Also, I now have two more related questions:
1. Is it possible to edit the contents of a cell when using fix()? In order to change a letter one would have to retype the whole string.
2. Is it possible to change the width of a column?

Thank you,
Adrian
-- Adrian DUSA Romanian Social Data Archive 1, Schitu Magureanu Bd 050025 Bucharest sector 5 Romania Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
Re: [R] returning a modified fix()-ed dataframe
On Friday 07 October 2005 21:22, Sundar Dorai-Raj wrote:
> [...snip...]
> My guess is you want to have your function fix my.data without having to reassign it. The answer to that question is most likely a road you do not want to travel. Otherwise, try searching the archives for assign reference for some clues.

Unfortunately, this is the road I was looking for :) I'll have a look in the archives for that, thanks for the tip.

> > Also, I now have another related two questions:
> > 1. is it possible to edit the contents of a cell when using fix() ? In order to change a letter one would have to change the whole string.
>
> Double-click on the field.
>
> > 2. is it possible to change the width of a column?
>
> Right-click on the column then select auto-size column. Or left-click on a column line and drag to desired width.

Oh, silly me. I knew this works under Windows, but I should have specified that I run R under Linux (Kubuntu 5.04, KDE). I also knew that the Linux interface is not as developed as the Windows one, so it's probably not possible yet.

Thanks for everything,
Adrian
-- Adrian DUSA Romanian Social Data Archive 1, Schitu Magureanu Bd 050025 Bucharest sector 5 Romania Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
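One way to get close to what this thread asks for, without reference semantics, is to return the modified data frame invisibly and reassign it at the call site. A minimal sketch under stated assumptions: relabel is a hypothetical name, labels are kept in a plain "label" attribute rather than set through Hmisc, and the interactive fix() step is replaced by a new_labels argument so the example runs unattended.

```r
# Hypothetical helper: set a "label" attribute on each column and
# return the data frame without printing it.
relabel <- function(x, new_labels) {
  stopifnot(length(new_labels) == ncol(x))
  for (i in seq_along(x)) {
    attr(x[[i]], "label") <- new_labels[i]
  }
  invisible(x)  # returned, but not auto-printed at the console
}

my.data <- data.frame(age = c(24, 35, 28),
                      gender = c("Male", "Female", "Male"))

# Reassigning is what makes the change stick: the function works on a copy.
my.data <- relabel(my.data, c("Respondent's age", "Respondent's gender"))
attr(my.data$gender, "label")
```

Calling relabel(my.data, ...) without the assignment prints nothing and leaves my.data untouched, which is why the reassignment is part of the pattern.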
[R] measurement unit
Dear R-list,

Could anybody tell me where to find information about changing the measurement unit from inches to centimeters? I read the help for X11, I read R-intro and I did some searching in the R archives, but I couldn't find the answer. For example, I would like to produce a plot of a certain width and height:

    X11(width=10, height=5)

and I would like these to be centimeters, rather than inches.

Thank you,
Adrian
-- Adrian Dusa Romanian Social Data Archive 1, Schitu Magureanu Bd 050025 Bucharest Romania Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
Re: [R] measurement unit
On Friday 09 September 2005 13:03, Christophe Declercq wrote:
> Adrian,
> 1 inch = 2.54 cm
> So you could try what I do for that:
>
>     X11(width=10/2.54, height=5/2.54)
>
> Or:
>
>     cm2in <- function(x) x/2.54
>     X11(width=cm2in(10), height=cm2in(5))
>
> HTH
> Christophe

Thank you Christophe,

This solves it. I also thought about converting, but I was curious whether there is a built-in argument for it.

Best,
Adrian
-- Adrian Dusa Romanian Social Data Archive 1, Schitu Magureanu Bd 050025 Bucharest Romania Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
Re: [R] producing SVG files
Adrian DUSA <dusa.adrian at gmail.com> writes:
> I am trying to use the RSvgDevice package to produce some SVG graphs which I want to edit with Inkscape 0.42.
> [...snip...]

Argh, a minute after posting I found the solution here: http://www.stat.auckland.ac.nz/~paul/Talks/gridSVG/slide8.html

It works brilliantly.
Adrian
[R] producing SVG files
I am trying to use the RSvgDevice package to produce some SVG graphs which I want to edit with Inkscape 0.42. Under Linux (Kubuntu 5.04) I use the following:

    library(RSvgDevice)
    plot(1:10, 1:10)
    devSVG(file = "/home/adi/Rplots.svg", width = 10, height = 8, bg = "white", fg = "black", onefile=TRUE, xmlHeader=TRUE)

but when I tried to load the file into Inkscape it complained about finding an empty file. Then I tried the example:

    devSVG()
    plot(1:11,(-5:5)^2, type='b', main="Simple Example Plot")
    dev.off()

and then again

    devSVG(file = "/home/adi/Rplots.svg", width = 10, height = 8, bg = "white", fg = "black", onefile=TRUE, xmlHeader=TRUE)

with the same result. Could you please point me in the right direction?

Thank you in advance,
Adrian
-- Adrian Dusa Romanian Social Data Archive 1, Schitu Magureanu Bd 050025 Bucharest Romania Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
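A likely cause of the empty file, offered as an assumption rather than a diagnosis confirmed in the thread: in the first attempt the plot was drawn before the SVG device was opened, and the device opened on /home/adi/Rplots.svg was never closed with dev.off(). The sketch below shows the open, draw, close order using the svg() device from base R's grDevices (not RSvgDevice), which needs cairo support; the output path is a temporary file chosen for illustration.

```r
out <- tempfile(fileext = ".svg")  # illustrative path

if (capabilities("cairo")) {
  svg(filename = out, width = 10, height = 8)      # 1. open the file device
  plot(1:10, 1:10, main = "Simple Example Plot")   # 2. draw onto it
  dev.off()                                        # 3. close it to flush the file
  file.info(out)$size                              # non-zero once the device is closed
}
```

The same three-step order applies to other file devices such as pdf() or png().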
Re: [R] texture in barplots?
On Thursday 14 July 2005 00:51, Peter Dalgaard wrote:
> Adrian Dusa [EMAIL PROTECTED] writes:
> > ...snip...
>
> This comes up every now and then, and while it seems that everyone thinks fill patterns would be nice to have, I suspect that every attempt to actually implement it has gotten killed in infancy. The thing that is tricky to design right is the cross-device issues. Only some devices support this at all, and when they do, the patterns tend to be device dependent too. Probably not impossible -- there are other bits of the device drivers that deal with missing capabilities, like string rotation and clipping -- just, well, tricky.

All clear then; I'll try to find some colors that are different even in BW, or maybe some grays.

> Um... Romania, I suppose? What city?

It's Bucharest; I have a different signature in English but I do forget to use it when sending e-mails abroad (bad habit).

Thank you,
Adrian
-- Adrian Dusa Romanian Social Data Archive University of Bucharest, Romania 1, Schitu Magureanu Bd. Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
[R] texture in barplots?
Dear R list,

For some reason I am unable to access either search.r-project.org or http://finzi.psych.upenn.edu/ so I cannot search the archives for a possible answer (I Googled for this but didn't find anything).

Is it possible to draw barplots using a texture instead of colors, for a black and white printer?

TIA,
Adrian
-- Adrian Dusa Arhiva Romana de Date Sociale Bd. Schitu Magureanu nr.1 Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
Re: [R] texture in barplots?
On Wednesday 13 July 2005 17:36, Knut Krueger wrote:
> Adrian Dusa wrote:
> > Is it possible to draw barplots using a texture instead of colors, for a black and white printer?
>
>     barplot(height, ..., density=c(4,6,8,10), ...)
>
> for each bar one number - this example is for a barplot with 4 bars.
> with regards
> Knut Krueger
> http://www.biostatistic.de

Thank you, I read about density but it only seems to draw diagonal lines (differing in the number of lines per inch). I am looking for different *types* of texture (i.e. maybe I could reverse the shading lines, or use cross-lines or something like that).

All the best,
Adrian
-- Adrian Dusa Arhiva Romana de Date Sociale Bd. Schitu Magureanu nr.1 Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
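The direction of the shading lines can be varied with barplot's angle argument, which works alongside density, and cross-lines can be approximated by drawing the same bars a second time with add = TRUE. A minimal sketch with made-up bar heights, drawn to a null PDF device so that it runs without a display:

```r
heights <- c(3, 5, 2, 4)  # made-up data for four bars

pdf(NULL)  # off-screen device; drop this line to draw on screen

# One hatch direction and line density per bar.
mids <- barplot(heights,
                density = c(10, 10, 20, 20),
                angle   = c(45, 135, 0, 90),
                col     = "black")

# Cross-hatching: draw the bars at 45 degrees, then overlay 135 degrees.
barplot(heights, density = 10, angle = 45,  col = "black")
barplot(heights, density = 10, angle = 135, col = "black", add = TRUE)

invisible(dev.off())
```

This still yields line patterns only; true fill textures remain device dependent, as discussed elsewhere in the thread.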
[R] not suppressing leading zeros when reading a table?
Dear R list,

I have a dataset with a column which should be read as character, like this:

      name surname answer
    1 xx   yyy     00100
    2 rrr  hhh     01

When reading this dataset with read.table, I get

    1 xx   yyy     100
    2 rrr  hhh     1

The string column consists of answers to multiple choice questions, not all having the same number of answers. I could format the answers using formatC, but there are over a hundred different questions in there. I tried quote="\'" without any luck. Googling for this took me nowhere either. It should be simple but I seem to be missing it... Can anybody point me in the right direction?

TIA,
Adrian
Re: [R] not suppressing leading zeros when reading a table?
On 7/10/05, alejandro munoz [EMAIL PROTECTED] wrote:
> Adrian,
> To prevent coercion to numeric, try:
>
>     mydata <- read.table(myfile, colClasses="character")
>
> HTH.
> alejandro
>
> On 7/10/05, Adrian Dusa [EMAIL PROTECTED] wrote:
> > Dear R list,
> > [...snip...]

Thank you all, I got it. This is my favourite, super fast, ever helpful help list (gosh, I didn't even expect an answer on a Sunday at 10 pm!).

Best,
Adrian
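The effect of colClasses can be checked on a small inline copy of the data. This is a sketch only: the text is passed through read.table's text argument instead of a file, and the row numbers of the original listing are left out.

```r
raw <- "name surname answer
xx yyy 00100
rrr hhh 01"

# Default type guessing turns the answers into numbers and drops the zeros.
guessed <- read.table(text = raw, header = TRUE)

# Forcing every column to character keeps the strings as typed.
kept <- read.table(text = raw, header = TRUE, colClasses = "character")

guessed$answer
kept$answer
```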
[R] [spam] t.test confidence interval
Hi,

I'm using R for some undergraduate lectures, and we have reached the t tests. No matter what conf.level one specifies in the syntax, the output always shows the 95 percent confidence interval. Is it possible to alter the function somehow, to report the CL percent confidence interval?

TIA,
Adrian
-- Adrian Dusa Romanian Social Data Archive Bd. Schitu Magureanu nr.1 Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
Re: [R] [spam] t.test confidence interval
On 15 Apr 2005 13:53:32 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote:
> Adrian Dusa [EMAIL PROTECTED] writes:
> > Hi, I'm using R for some undergraduate lectures, reaching the t tests. No matter what conf.level one specifies in the syntax, the output always shows the 95 percent confidence interval.
>
> That's just not true in my version of R.

You're right of course. I just typed cl instead of conf or conf.level and it took the predefined 0.95. Sorry for bothering you.

>     > t.test(extra ~ group, data = sleep, conf=.8)
>
>             Welch Two Sample t-test
>
>     data:  extra by group
>     t = -1.8608, df = 17.776, p-value = 0.0794
>     alternative hypothesis: true difference in means is not equal to 0
>     80 percent confidence interval:
>      -2.7101645 -0.4498355
>     sample estimates:
>     mean in group 1 mean in group 2
>                0.75            2.33
>
> > Is it possible to alter the function somehow, to report the CL percent confidence interval?
>
> --
>    O__  Peter Dalgaard             Blegdamsvej 3
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~ - ([EMAIL PROTECTED])             FAX: (+45) 35327907
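The level actually used can be read back from the result, which makes a mistyped argument easy to spot. A small sketch using the sleep data that ships with R; as found in the thread, a name such as cl is not matched to conf.level, so the default of 0.95 applies.

```r
# Same test at two confidence levels.
res95 <- t.test(extra ~ group, data = sleep)                     # default 0.95
res80 <- t.test(extra ~ group, data = sleep, conf.level = 0.80)

attr(res95$conf.int, "conf.level")
attr(res80$conf.int, "conf.level")
res80$conf.int
```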
[R] z and p
Dear useRs,

I use pnorm to calculate the area under the normal curve to the left of z. Now, is there a function which gives the z value for a given area? I wrote a function which finds it in about 20 iterations, but it does not seem to me the best solution; I'm just curious whether there is a built-in function for this.

Regards,
Adrian
-- Adrian Dusa Romanian Social Data Archive Bd. Schitu Magureanu nr.1 Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
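The built-in inverse of pnorm is qnorm, the normal quantile function. A short sketch of the round trip, with probabilities chosen only for illustration:

```r
p <- 0.975
z <- qnorm(p)   # quantile function: inverse of pnorm
z
pnorm(z)        # recovers p

# For a non-standard normal, pass mean and sd, as with pnorm.
q <- qnorm(0.5, mean = 100, sd = 15)
q
```

The same naming pattern holds for the other distributions in R (for example pt and qt for Student's t).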
Re: [R] Graphics
John Dougherty <jwd at surewest.net> writes:
> On Thursday 24 February 2005 04:13, Adrian Dusa wrote:
> > ...
>
> You need to check your font installation. Be sure the X-11 fonts are installed.
>
>     XFree86-fonts-75dpi-4.3.99.902-30
>     XFree86-fonts-100dpi-4.3.99.902-30
>
> should both be on your system. If they aren't, bring up the YaST control center and select Install and Remove Software. You can use the search option to filter for packages that have fonts in their description. Install any that aren't. SuSE seems to be a little funny about the X-11 fonts. Peter Dalgaard just let me know about that a short time ago.
>
> JWDougherty

Thank you for the info; I found that thread as well, but I seem to have both the 75 and 100 dpi packages installed:

    xorg-x11-fonts-100dpi version 6.8.1-15
    xorg-x11-fonts-75dpi version 6.8.1-15

Googling around, I wasn't able to find any XFree86 .rpm fonts package. I am also playing with the UTF-8 locales; that may have something to do with this.

Regards,
Adrian
Re: [R] Graphics
Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:
> On Tue, 22 Feb 2005 Cedric.Ginestet at tvu.ac.uk wrote:
> > The R platform that I installed on my Windows XP crashes everytime that I try to run some sophisticated graphics (e.g. Demo Graphics). Is that to do with the configuration? Shall I reinstall it?
>
> Please consult the rw-FAQ. It is likely to be a problem with your Windows installation, as R runs on literally thousands (maybe tens of thousands) of Windows XP machines.
>
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
> which points you at the rw-FAQ.

I have a similar problem; I am sure there's something I should do on my machine but I just cannot figure out what. On:

    demo(graphics)

after two enters, I get:

    > title(main = "January Pie Sales", cex.main = 1.8, font.main = 1)
    Error in title(main = "January Pie Sales", cex.main = 1.8, font.main = 1) :
            X11 font at size 22 could not be loaded

I read the R Installation and Administration manual, I recompiled R using all the options (e.g. --with-x), and I have all the required packages...

My system: SuSE 9.2 Professional

    > version
             _
    platform i686-pc-linux-gnu
    arch     i686
    os       linux-gnu
    system   i686, linux-gnu
    status
    major    2
    minor    0.1
    year     2004
    month    11
    day      15
    language R

Any hint would be highly appreciated,
Adrian
[R] file 'attributes'
Dear R-list,

I have many files on many CDs (probably the same as many of you) and I would like to create a database containing only a few columns:
- the name of the file
- its extension
- space occupied
- date when it was created
- its path
- CD number (this should be typed in manually)

Is it possible to read a CD/structure of folders in such a way?

Thank you for any suggestion,
Adrian
-- Adrian Dusa Romanian Social Data Archive 1, Schitu Magureanu Bd. Bucharest sector 5 Romania Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
[R] further issues with install.packages
Hi again,

I run R under SuSE 9.2 Professional (installed via rpm) and I am trying to install some packages from CRAN. The trouble is, after successful installations, my destdir directory is deleted...! My command:

    install.packages("Rcmdr", "/usr/lib/R/library", CRAN="http://cran.r-project.org", destdir="/home/adi/Kituri/R.packages", dependencies=TRUE)

After each package I get:

    WARNING: UTF-8 locales are not currently supported

Package rgl gives:

    ERROR: compilation failed for package 'rgl'
    ** Removing '/usr/lib/R/library/rgl'
    ** Restoring previous '/usr/lib/R/library/rgl'
    Warning message:
    Installation of package rgl had non-zero exit status in: install.packages("rgl", "/usr/lib/R/library", CRAN = "http://cran.r-project.org",

Trying to run Rcmdr I get:

    > library(Rcmdr)
    Loading required package: tcltk
    Error in fun(...) : this isn't a Tk applicationcouldn't connect to display homelinux.roda.local:0.0
    Error: .onLoad failed in loadNamespace for 'tcltk'
    Error: package 'tcltk' could not be loaded

I have the tcl and tk packages installed under SuSE. The command whereis g77 gives:

    g77: /usr/bin/g77 /usr/share/man/man1/g77.1.gz

    > R.version
             _
    platform i686-pc-linux-gnu
    arch     i686
    os       linux-gnu
    system   i686, linux-gnu
    status
    major    2
    minor    0.1
    year     2004
    month    11
    day      15
    language R

Could you please advise?

Thank you,
Adrian
-- Adrian Dusa Arhiva Romana de Date Sociale Bd. Schitu Magureanu nr.1 Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
Re: [R] file 'attributes'
Liaw, Andy <andy_liaw at merck.com> writes:
> file.info() should help with some of those items.
> Andy
>
> > From: Adrian Dusa
> >
> > Dear R-list,
> > I have many files on many CDs (probably same as many of you) and I would like to create a database containing only a few columns:
> > - the name of the file
> > - its extension
> > - space occupied
> > - date when it was created
> > - its path
> > - CD number (this should be typed in manually)
> > Is it possible to read a CD/structure of folders in such a way?
> > Thank you for any suggestion,
> > Adrian
> > -- Adrian Dusa Romanian Social Data Archive 1, Schitu Magureanu Bd. Bucharest sector 5 Romania Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101

Brilliant, Andy. This is exactly what I needed. Now, if I could find a way to scan a structure of folders, I could store the information about every file in every folder in a table or a database...

Using ?file.info, I found another bunch of useful functions, like file.path, list.files etc. It looks very promising.

I know there is a command in Linux called 'file', which determines the file type (a substitute for the extension). I wonder if I could use that command via R.

Thanks again,
Adrian
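Putting the functions named above together, a folder tree can be scanned into a data frame in a few lines. A minimal sketch under assumptions: catalogue is a hypothetical helper name, the folder is a throwaway tree built in tempdir() rather than a real CD, and the modification time stands in for the creation date, which file.info does not report reliably on every platform.

```r
# Build a small throwaway folder tree so the example is self-contained.
root <- file.path(tempdir(), "cd_demo")
dir.create(file.path(root, "docs"), recursive = TRUE, showWarnings = FALSE)
writeLines("hello", file.path(root, "readme.txt"))
writeLines(c("a,b", "1,2"), file.path(root, "docs", "data.csv"))

# Hypothetical helper: one row per file found under `path`.
catalogue <- function(path, cd) {
  files <- list.files(path, recursive = TRUE, full.names = TRUE)
  info  <- file.info(files)
  data.frame(name      = basename(files),
             extension = tools::file_ext(files),
             size      = info$size,     # bytes
             modified  = info$mtime,    # last modification time
             path      = dirname(files),
             cd        = cd,            # typed in by hand
             stringsAsFactors = FALSE)
}

cat1 <- catalogue(root, cd = 1)
cat1[, c("name", "extension", "size", "cd")]
```

Catalogues from several CDs can then be stacked with rbind before being written to a file or a database.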
RE: [R] Tuning string matching
Oh this is excellent, Stephen. Now I can write code to break up the initial strings:

    name1 <- "Harry Harrington"
    name2 <- "Harington Harry"
    str1 <- unlist(strsplit(name1, " "))
    str2 <- unlist(strsplit(name2, " "))
    > str1
    [1] "Harry"      "Harrington"
    > str2
    [1] "Harington" "Harry"

and compare the words using your function. Brilliant.

Cheers,
Adrian

Quoting Stephen Upton [EMAIL PROTECTED]:
> Adrian,
> As an exercise, I took the pseudocode on the wiki pages for the Levenshtein distance and programmed it in R. The code is below. I tested it for just 2 strings, so I'm not claiming that it *really* works, but it seems to. As you can see, I didn't add any error checking, and there are likely some cool R shortcuts that could be added.
> As to your problem, I'd also suggest that you might want to apply the function below to possible combinations of words rather than attempting to apply it to a complete name; that should alleviate the first.name last.name, last.name first.name problem.
>
>     > levenshtein.distance("Harrington","Harington")
>     [1] 1
>     > levenshtein.distance("Harrington Harry","Harry Harington")
>     [1] 11
>
> HTH
> Steve
>
>     levenshtein.distance <- function(string.1, string.2, subst.cost=1) {
>         c1 <- strsplit(string.1,split="")[[1]]
>         c2 <- strsplit(string.2,split="")[[1]]
>         n <- length(c1)
>         m <- length(c2)
>         d <- array(0,dim=c(n+1,m+1))
>         d[,1] <- 0:n   # deleting the first i characters of string.1
>         d[1,] <- 0:m   # inserting the first j characters of string.2
>         for (i in 2:(n+1)) {
>             for (j in 2:(m+1)) {
>                 if (c1[i-1] == c2[j-1]) cost <- 0 else cost <- subst.cost
>                 d[i,j] <- min(d[i-1,j] + 1,      # deletion
>                               d[i,j-1] + 1,      # insertion
>                               d[i-1,j-1] + cost) # substitution
>             }
>         }
>         d[n+1,m+1]
>     }
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:r-help-[EMAIL PROTECTED]] On Behalf Of [EMAIL PROTECTED]
> > Sent: Thursday, January 06, 2005 4:32 AM
> > To: Jonathan Baron
> > Cc: McGehee, Robert; bogdan romocea; r-help@stat.math.ethz.ch
> > Subject: Re: [R] Tuning string matching
> >
> > Thank you all for your replies. Indeed I used: http://finzi.psych.upenn.edu/nmz.html as a search site. I used string match, instead of fuzzy string match.
> > Fuzzy matching seems to me a rather complicated matter, whereas my initial idea about solving this problem was a bit simpler:
> > - check all characters in both strings (a 2 dimensional matrix of characters)
> > - if 90% (or any other percent) of the characters in both strings are similar (in terms of distances between each character from the first string to all characters from the second string), then the two strings will be declared as a match
> > I just found out that this algorithm is called the Levenshtein distance, and I know there is a PHP function called levenshtein (I thought it might already have been implemented in R). For anyone who has a clue on how to read this stuff: http://ro.php.net/levenshtein
> > I tried to use agrep:
> >
> >     > agrep("Harry Harrington", "Harry Harrington")
> >     [1] 1
> >     > agrep("Harry Harrington", "Harrington Harry")
> >     numeric(0)
> >
> > So it seems not to be what I'm looking for (I'll try harder with edit distance, though)
> > Best regards,
> > Adrian
> >
> > Quoting Jonathan Baron [EMAIL PROTECTED]:
> > > Sorry for joining late, but I wanted to see if my search page could help. (I don't know which search archive you looked at.) I entered "fuzzy string match*" and got a few things that look relevant, including the agrep function.
> > > As for the second part of the question, that seems to be a coding problem that is dependent on the current form of your data. Write me off the list and I'll send you an R script I use for similar things (making PayPal payments).
> > > Jon
> > >
> > > > Dear list,
> > > > I spent about two hours searching the message archive, to no avail. I have a list of people that have to pass an on-line test, but only a fraction of them do it. Moreover, as they input their names, the resulting strings do not always match the names I have in my database. I would like to do two things: 1.
> > > > Match any strings that are 90% the same. Example:
> > > >
> > > >     name1 <- "Harry Harrington"
> > > >     name2 <- "Harry Harington"
> > > >
> > > > I need a function that would declare those strings as a match (ideally having an argument that would allow introducing 80% instead of 90%)
> > >
> > > -- Jonathan Baron, Professor of Psychology, University of Pennsylvania Home page: http://www.sas.upenn.edu/~baron R search page: http://finzi.psych.upenn.edu/
> >
> > ~~ Adrian Dusa ([EMAIL PROTECTED]) Romanian Social Data Archive (www.roda.ro) 1, Schitu Magureanu Bd. 010181 Bucharest sector 5 Romania Tel./Fax: +40 (21
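Current versions of R ship an edit-distance function, adist(), in the utils package, so the word-by-word comparison suggested in this thread can be sketched without a hand-written Levenshtein routine. Everything below is an illustration under assumptions: name_similarity is a hypothetical helper, and its score (the share of characters in the first name that needed no editing) is just one possible reading of "90% the same".

```r
name1 <- "Harry Harrington"
name2 <- "Harington Harry"

# Generalized Levenshtein distance between two strings.
adist("Harrington", "Harington")

# Hypothetical helper: compare names word by word, ignoring word order.
# Each word of `a` is paired with its closest word in `b`; the score is
# the share of characters in `a` that did not need editing.
name_similarity <- function(a, b) {
  wa <- strsplit(a, " ", fixed = TRUE)[[1]]
  wb <- strsplit(b, " ", fixed = TRUE)[[1]]
  d  <- adist(wa, wb)          # words of a in rows, words of b in columns
  best <- apply(d, 1, min)     # cheapest partner for each word of a
  1 - sum(best) / sum(nchar(wa))
}

name_similarity(name1, name2)
name_similarity(name1, name2) >= 0.90   # threshold from the original question
```

Because each word simply takes its nearest partner, two words of the first name may pick the same word of the second; a stricter version would enforce a one-to-one pairing.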
RE: [R] console under Mandrake
Thank you both for your answers,

As you might have guessed, I am initiating myself in the Linux wizardry. I have absolutely nothing against the command line, I was just heavily used to the Windows console mode; a terminal window is just fine. Actually, I only used the console to install packages from CRAN anyway; I'm sure I'll find the commands for this. I read the manuals; it probably didn't work for me because I do not use GNOME but KDE. So command line it is... and most probably ESS.

Best regards,
Adrian

-----Original Message-----
From: Peter Dalgaard [mailto:[EMAIL PROTECTED]]
Sent: 22 October 2004 19:11
To: Adrian Dusa; [EMAIL PROTECTED]
Subject: Re: [R] console under Mandrake

Prof Brian Ripley [EMAIL PROTECTED] writes:
> On Fri, 22 Oct 2004, Adrian Dusa wrote:
> > I recently compiled R 2.0.0 under Mandrake 9, but it won't run unless in a terminal; is there a way to run it in a console, like in Windows?
>
> Yes. Did you read the manual the INSTALL file pointed you to? See appendix B.6 in the version I am looking at. Or see `An Introduction to R' appendix B.1 and look for --gui. [To run the GNOME console I think you need R-patched, not 2.0.0 as distributed.]
> Another approach is to run John Fox's Rcmdr package that provides a console.

Actually, that one is more of a script submission device. (It has nowhere to just type and press Enter, you need to edit, select, and press Submit. Which is a good thing for some modes of operation.)

You'll find that the Linux consoles are not nearly as developed as the Windows one and hardly anyone is using --gui. There are two good reasons:
1) It's really not that horrible to use the command line in a terminal window on Linux.
2) Many people like to use ESS (see the FAQ) and run everything from Emacs.
and of course the bad reason: that there isn't much there. However, had there been a real need, someone would likely have put in the relevant improvements.

--
   O__  Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED])             FAX: (+45) 35327907
RE: [R] one more Rcmdr problem
Brilliant. With R 2.0.0 patched and Rcmdr 0.99-12 all problems are gone. Thank you all, and a special thank you for the R Commander; it saves a lot of effort for teaching purposes.

Best regards,
Adrian

-----Original Message-----
From: Erich Neuwirth [mailto:[EMAIL PROTECTED]]
Sent: 15 October 2004 18:27
To: Adrian Dusa
Cc: [EMAIL PROTECTED]
Subject: Re: [R] one more Rcmdr problem

I did experience the same problem. After installing R 2.0.0 patched, downloading the source for Rcmdr_0.99-12 from John Fox's Web page http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/ and recompiling the package, it now works. Ctrl-C, Ctrl-X, and Ctrl-V in the text boxes of Rcmdr now work.

Adrian Dusa wrote:
> Hello,
> I'm using R 2.0.0 with the latest Rcmdr package installed from CRAN, on Windows XP Professional. When trying to copy some commands or results, either from the upper or lower text window, this causes Rcmdr to crash: "R for Windows GUI front-end has encountered a problem and needs to close"
> Did anyone have the same problem? I don't think it's my system, as I happened to reinstall Windows just a few days ago, and the same problem occurred in the former installation.
> Regards,
> Adrian
> Adrian Dusa Romanian Social Data Archive 1 Schitu Magureanu Bd. 050025 Bucharest sector 5 Tel. +40 21 3126618\ +40 21 3153122/ int.101

-- Erich Neuwirth, Computer Supported Didactics Working Group Visit our SunSITE at http://sunsite.univie.ac.at Phone: +43-1-4277-38624 Fax: +43-1-4277-9386