Re: [R] non-linear integer optimization?
darckeen darckeen at hotmail.com writes:

> This is an example of the type of problem, and how I'm currently using optim() to solve it.

I would classify this as an integer programming (IP) problem; it is not even non-linear, as there are no continuous variables. Your approach with optim() will not work, especially not with method 'L-BFGS-B', which is meant for smooth numerical optimization. Maybe it's worth trying method 'SANN'. Integer programming needs special techniques. If you don't know more about your function, it may boil down to an exhaustive discrete search, which will be slow in any case.

Hans Werner

Example: with

    set.seed(8232); mydata <- runif(500, -1, 1)  # fix these parameters

the optim() approach finds a minimum of 0.9887148 with 'L-BFGS-B' and 0.9139314 with method 'SANN', while the global minimum in your constraint area is 0.7360688. DEoptim (in package 'DEoptim') returns 0.750277. The minimum is more difficult to find in this example as it lies on the boundary.

> mydata <- runif(500, -1, 1)
> myfunc <- function(p, d) {
>     print(p <- floor(p))
>     ws  <- function(i, n, x) sum(x[(i-n+1):i])
>     ws1 <- c(rep(NA, p[1]-1), sapply(p[1]:NROW(d), ws, p[1], d))
>     ws2 <- c(rep(NA, p[2]-1), sapply(p[2]:NROW(d), ws, p[2], d))
>     ws3 <- c(rep(NA, p[3]-1), sapply(p[3]:NROW(d), ws, p[3], d))
>     var(ws1 + ws2 + ws3, na.rm=TRUE)
> }
> opt <- optim(c(25,50,150), myfunc, method="L-BFGS-B",
>              control=list(fnscale=-1, parscale=c(1,1,1), factr=1, ndeps=c(5,5,5)),
>              lower=c(1,51,101), upper=c(50,100,200), d=mydata)
> print(floor(opt$par))
> print(myfunc(opt$par, mydata))
>
> So the parameters of the function to be optimized are themselves parameters of functions that only accept integer values. All of the parameters to be optimized are integers subject to upper/lower bound constraints. This was the solution I came up with after checking CRAN, searching Nabble, etc. It runs, but not very efficiently, and it does not seem to really explore the sample space before focusing on a local minimum.
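To illustrate the 'SANN' suggestion above, here is a minimal, hedged sketch (with a made-up quadratic objective and the poster's bounds, not the moving-window function): round the parameters to integers and clamp them to the bounds inside the objective, so every evaluation corresponds to a feasible integer point.

```r
# Sketch: integer-constrained minimization with optim(method = "SANN").
# The objective is a toy function with known minimizer c(7, 63, 140);
# the bounds c(1,51,101)..c(50,100,200) are taken from the post.
f <- function(p) {
  p <- pmin(pmax(round(p), c(1, 51, 101)), c(50, 100, 200))  # integer + bounds
  sum((p - c(7, 63, 140))^2)
}
set.seed(1)
opt <- optim(c(25, 50, 150), f, method = "SANN", control = list(maxit = 20000))
round(opt$par)  # should be at or very near c(7, 63, 140)
```

Rounding inside the objective keeps the search space effectively discrete while still letting SANN propose continuous moves.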
> I also looked at the constrOptim() function, but I couldn't figure out how to implement two-sided constraints with it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] bivariate vector numerical integration with infinite range
baptiste auguie baptiste.auguie at googlemail.com writes:

> Dear list, I'm seeking some advice regarding a particular numerical integration I wish to perform. The integrand f takes two real arguments x and y and returns a vector of constant length N. The range of integration is [0, Inf) for x and [a, b] (finite) for y. Since the integrand has values in R^N, I did not find a built-in function to perform the numerical quadrature, so I wrote my own after some inspiration from a post in R-help.

The function adaptIntegrate() in the 'cubature' package integrates multi-valued functions over n-dimensional finite hypercubes, as do the functions in 'R2Cuba'. If the hypercube is (partly) infinite, a transformation such as x -> 1/x per infinite axis (as in Numerical Recipes) has to be applied.

For two dimensions, another approach could be to apply the integrate() function twice, because this one-dimensional integration function can handle infinite intervals. Hint: first integrate over the finite dimension(s).

Example: integrate sin(x)/exp(y) for 0 <= x <= pi, 0 <= y < Inf (value: 2)

    f1 <- function(y) 1/exp(y)
    f2 <- function(x) sin(x) * integrate(f1, 0, Inf)$value
    integrate(f2, 0, pi)
    # 2 with absolute error < 2.2e-14

Note that the reported absolute error is not correct, as the first integration has its own error term. You have to do your own error estimation.

Hans Werner

> [...] So it seems to work. I wonder though if I may have missed an easier (and more reliable) way to perform such an integration using base functions or an add-on package that I may have overlooked. Best regards, baptiste
Re: [R] bivariate vector numerical integration with infinite range
baptiste auguie baptiste.auguie at googlemail.com writes:

> Thanks, adaptIntegrate() seems perfectly suited; I'll just need to figure out a transformation rule for the infinite limit. The suggestion of x -> 1/x does not seem to work here because it also maps the limit 0 to infinity. I think exp(pi*sinh(x)) could be a better choice, according to Numerical Recipes.

Yes, that's one way. But you can also split the integral into two parts, one from 0 to 1 and the other from 1 to Inf. The first part is over a finite hypercube, and the second can be transformed with x -> 1/x into [0, 1]. I usually prefer this second approach for higher-dimensional applications, as the Jacobian is simpler. In the literature you will find discussions of how far out the finite hypercube should reach to lower the absolute error.

Hans Werner

> Thanks, baptiste
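The splitting trick can be sketched in one dimension (my own toy integrand, not from the thread): with x = 1/t on [1, Inf), the tail becomes an integral over (0, 1] with Jacobian 1/t^2.

```r
# Sketch: evaluate integral_0^Inf exp(-x) dx = 1 by splitting at 1.
f <- function(x) exp(-x)
part1 <- integrate(f, 0, 1)$value                         # finite part [0, 1]
part2 <- integrate(function(t) f(1/t) / t^2, 0, 1)$value  # tail via x = 1/t
part1 + part2  # close to 1
```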
Re: [R] Prime Factorization
David Winsemius dwinsemius at comcast.net writes:

> On Sep 21, 2010, at 11:33 AM, Cliff Clive wrote:
>> Hi everyone, I have a very quick question: Is there a ready-made function in R or any R packages to find the prime factorization of an integer?
>
> Yes. At least two. The obvious search strategy with your favorite search tool should work well.

I don't know which prime factorization functions David means -- sometimes these search tips are quite a mystery to me.

You didn't tell us an important parameter, namely the size of the integers you want to factorize. For larger numbers there is just one such function that can be taken into consideration, i.e. factorize() in the multiple-precision package 'gmp':

    library(gmp)
    factorize(2^32 + 1)
    # [1] 641 6700417

And better do not use the 'schoolmath' package; it is of very low quality and probably should be removed from CRAN.

Hans Werner
Re: [R] bivariate vector numerical integration with infinite range
baptiste auguie baptiste.auguie at googlemail.com writes:

> Thanks. I am having trouble getting adaptIntegrate() to work with a multivalued integrand, though, and cannot find a working example. Has anyone had better luck with it?

The function to be integrated needs a single vector as input:

    f <- function(x) {
        res <- 1 / (sqrt(x[1]) * (1 + x[1]))
        c(res, res/2, 2*res)
    }
    adaptIntegrate(f, lowerLimit=c(0.1, 0), upperLimit=c(10, 1), fDim=3)
    # $integral
    # [1] 1.9164832 0.9582416 3.8329665
    # $error
    # [1] 1.265252e-05 6.326261e-06 2.530504e-05
    # $functionEvaluations
    # [1] 323
    # $returnCode
    # [1] 0

Hans Werner

> library(cubature)
> f <- function(x, y) {
>     res <- 1 / (sqrt(x) * (1 + x))
>     c(res, res/2, 2*res)
> }
> adaptIntegrate(f, lowerLimit=c(0.1, 0), upperLimit=c(10, 1), fDim=3)
> # adaptIntegrate: Error in evaluation function f(x) for x= ...
> # Error in adaptIntegrate(f, lowerLimit = c(0.1, 0), upperLimit = c(10, 1), :
> #   adaptIntegrate: Result f(x) is not numeric or has wrong dimension
>
> Best, baptiste
Re: [R] Prime Factorization
David Winsemius dwinsemius at comcast.net writes:

> A further citation that answers the question I raised (and inaccurately predicted no value for) regarding prime.sieve: http://finzi.psych.upenn.edu/R/Rhelp02/archive/49773.html This was found with Baron's search facility set for rhelp postings: [...]

Just being curious: and did you also find what is probably the most important and professional of all factorization routines in R, based on the Pollard rho algorithm -- factorize() in package 'gmp'?

Hans Werner
Re: [R] optimized value worse than starting Value
Barry Rowlingson b.rowlingson at lancaster.ac.uk writes:

> On Wed, Sep 8, 2010 at 1:35 PM, Michael Bernsteiner dethlef1 at hotmail.com wrote:
>> Dear all, I'm optimizing a relatively simple function. Using optimize() the optimized parameter value is worse than the starting one. Why?

I would like to stress here that finding a global minimum is not as much sorcery as this thread seems to suggest. A widely accepted procedure to provably identify a global minimum goes roughly as follows (see Chapter 4 in [1]):

  - Make sure the global minimum does not lie 'infinitely' far out.
  - Provide estimates for the derivatives/gradients.
  - Define a grid fine enough to capture or exclude minima.
  - Search the grid cells that come into consideration and compare.

This can be applied to two- and higher-dimensional problems, but of course it may require enormous effort. In science and engineering applications it is at times necessary to actually carry out this approach.

Hans Werner

[1] F. Bornemann et al., The SIAM 100-Digit Challenge, 2004, p. 79:
    "In fact, a slightly finer grid search will succeed in locating the proper
    minimum; several teams used such a search together with estimates based on
    the partial derivatives of f to show that the search was fine enough to
    guarantee capture of the answer."

> This looks familiar. Is this some 1-d version of the Rosenbrock Banana Function? http://en.wikipedia.org/wiki/Rosenbrock_function It's designed to be hard to find the minimum. In the real world one would hope that things would not have such pathological behaviour. Numerical optimisations are best done using as many methods as possible -- see optimise, nlm, optim, nlminb and the whole shelf of library books devoted to it.
>
> Barry
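In one dimension the procedure can be sketched in a few lines (my own example with a standard multi-modal test function, not the function from this thread): scan a grid, then polish the best grid point locally.

```r
# Sketch: coarse grid scan plus local refinement with optimize().
f <- function(x) sin(x) + sin(10 * x / 3)   # multi-modal test function
xs <- seq(2.7, 7.5, by = 0.01)              # grid fine enough for this f
x0 <- xs[which.min(f(xs))]                  # best grid point
opt <- optimize(f, c(x0 - 0.01, x0 + 0.01)) # local refinement
opt$minimum  # about 5.1457, the global minimum on [2.7, 7.5]
```

The grid spacing must be chosen with the function's variation in mind; that is exactly what the derivative estimates in the procedure above are for.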
Re: [R] calculating area between plot lines
A. Marcia BARBOSA anamarciabarbosa at gmail.com writes:

> Hi everyone. I have these data:
>
>     probClass <- seq(0, 0.9, 0.1)
>     prob1 <- c(0.0070,0.0911,0.1973,0.2949,0.3936,0.5030,0.5985,0.6869,0.7820,0.8822)
>     prob2 <- c(0.0066,0.0791,0.2358,0.3478,0.3714,0.3860,0.6667,0.6400,0.7000,1.0)
>
> which I'm plotting as follows:
>
>     plot(probClass, prob1, xlim=c(0,1), ylim=c(0,1), xaxs='i', yaxs='i', type='n')
>     lines(probClass, prob1)
>     lines(probClass, prob2)
>     polygon(c(probClass,rev(probClass)), c(prob2,rev(prob1)), col="red", border=NA)
>
> Given that the total area of the plot is 1, how can I calculate the area between the plotted lines (red polygon)? I have only found the areapl() function in the 'splancs' package, but it doesn't work for self-intersecting polygons. Any help will be gratefully received. Cheers, Márcia

Remember that the area between two curves is the same as the integral of the absolute value of the difference between these two curves; thus it is easy to compute the area directly:

    f1 <- approxfun(probClass, prob1 - prob2)  # piecewise linear function
    f2 <- function(x) abs(f1(x))               # take the absolute value
    integrate(f2, 0, 0.9)
    # 0.03548023 with absolute error < 1.6e-06

where the true value is 0.03547927 (using the triangle/trapezoid area formula).

Hans Werner
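The 'true value' above can be reproduced exactly, since the difference of the two curves is piecewise linear: each grid interval contributes a trapezoid where the sign is constant and two triangles where the difference crosses zero. A short sketch (my own vectorized helper, not from the thread):

```r
# Exact area between two piecewise-linear curves given on a common grid.
probClass <- seq(0, 0.9, 0.1)
prob1 <- c(0.0070,0.0911,0.1973,0.2949,0.3936,0.5030,0.5985,0.6869,0.7820,0.8822)
prob2 <- c(0.0066,0.0791,0.2358,0.3478,0.3714,0.3860,0.6667,0.6400,0.7000,1.0)
d  <- prob1 - prob2
h  <- diff(probClass)
d1 <- head(d, -1); d2 <- tail(d, -1)
area <- sum(ifelse(d1 * d2 >= 0,
                   h * (abs(d1) + abs(d2)) / 2,                      # trapezoid
                   h * (d1^2 + d2^2) / (2 * (abs(d1) + abs(d2)))))   # two triangles
area  # about 0.03547927
```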
Re: [R] Regarding naive baysian classifier in R
Vishnampettai akron_aadhithya at yahoo.com writes:

> Hi, I have a small doubt regarding naive Bayes. I am able to classify the data properly, but I am stuck on how to get the probability values from naive Bayes. In the case of SVM we have the attr() function that helps in displaying the probability values. Is there any function similar to attr() in naive Bayes that can be used for displaying the probability values?

Some time ago I wrote my own naive Bayes classifier, as I was unsatisfied with the speed of other implementations on larger data sets. My predict function returns an 'accuracy' with every input item, based on the standard calculation. If you like I can send it to you, as I am also interested in more test cases, though I did some tests comparing it with results from 'e1071::naiveBayes'.

Hans Werner

P.S.: What are you doing with decision.values=TRUE, probability=TRUE? I can't find those in the documentation for naiveBayes(). I assume you don't get an error because the '...' catches all these unused parameters.

> my code is given below:
>
>     library(e1071)
>     train <- read.table(..., header=TRUE)
>     test <- read.table(..., header=TRUE)
>     length(test)
>     cl <- c(rep("ALL",10), rep("AML",10))
>     cl <- factor(cl)
>     model <- naiveBayes(t(train), cl)
>     pred <- predict(model, t(test), decision.values=TRUE, probability=TRUE)
>     attr(pred, "probabilities")
>     table(pred, t(cl))
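For what it's worth (my note, not part of the original exchange): the predict method for e1071's naiveBayes objects has a type argument, and type = "raw" returns the class posterior probabilities directly, so no attr() trick is needed. A minimal sketch on a built-in data set:

```r
# Sketch: class probabilities from e1071::naiveBayes via predict(type = "raw").
library(e1071)
model <- naiveBayes(iris[, -5], iris$Species)
post <- predict(model, iris[, -5], type = "raw")  # matrix of posteriors
head(post)                                        # one row per observation
table(predict(model, iris[, -5]), iris$Species)   # confusion table
```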
Re: [R] LSAP and the Hungarian algorithm [was: R-help]
venkatesh bandaru venkatesh.bandaru at gmail.com writes:

If you had searched for it, you would have easily found the 'clue' package:

    "In package 'clue', solve_LSAP() enables the user to solve the linear sum
     assignment problem (LSAP) using an efficient C implementation of the
     Hungarian algorithm."

Hans Werner

> Respected R Help Team Members, I am Venkatesh B., doing an M.Tech. at the University of Hyderabad. I want to know: is there any package that contains the Hungarian algorithm, which solves the linear assignment problem? Thanking you.
Re: [R] LSAP and the Hungarian algorithm [was: R-help]
Ravi Varadhan rvaradhan at jhmi.edu writes:

> The 'clue' package has the solve_LSAP() function (as pointed out by Hans Werner) that solves the linear sum assignment problem, and it uses the Hungarian algorithm that was requested by you. The lp.assign() function in the 'lpSolve' package (as pointed out by Girish) also solves the linear sum assignment problem, but I do not know which algorithm is used; this is not documented in the help page.

lp.assign() solves the linear assignment problem as a standard linear program with binary variables; see the "Assignment problem" article in Wikipedia. With 'lpSolve' this will be slower than the Hungarian algorithm in higher dimensions.

But remember our thread "A combinatorial optimization problem: finding the best permutation of a complex vector" from Nov. 2009: highly efficient software for integer programming -- such as ILOG's commercial CPLEX program -- will be much faster than the specialized Hungarian algorithm when the problem size grows considerably (as was shown in Erwin Kalvelagen's blog).

Hans Werner

> Best, Ravi.
Re: [R] Puzzle
Ben Holt BHolt at bio.ku.dk writes:

> I have data similar to this:
>
>     Location  Surveyor  Result
>     A         1         83
>     A         2         76
>     A         3         45
>     B         1         71
>     B         4         67
>     C         2         23
>     C         5         12
>     D         3         34
>     E         4         75
>     F         4         46
>     G         5         90
>
> etc. (5 million records in total)
>
> I need to divide the data into many subsets, then for each subset randomly select 5 different locations and 5 different surveyors (one at each of the 5 randomly selected locations). The function I have written basically picks five locations and then one surveyor in each location, checks that there are five different surveyors, and if there aren't, tries again. The problem is that for some subsets this doesn't work. Some subsets don't have enough locations or surveyors (or both), but this can be checked for easily. The problem subsets do have enough locations and surveyors but still cannot produce 5 locations each with a different surveyor. The matrix below demonstrates such a subset (1 = surveyor present at that location):
>
>                   Locations
>                   A  B  C  D  E
>     Surveyors  1  1  0  0  0  0
>                2  1  0  0  0  0
>                3  1  0  0  0  0
>                4  1  0  0  0  0
>                5  1  1  1  1  1
>
> I cannot think of a way to check for such a situation, and therefore I have simply programmed the function to give up after 100 attempts if it can't find a solution. This is not very satisfactory, however, as the analysis takes a very long time to run, and it would also be very useful for me to know how many suitable solutions there are.

If I understand you correctly: the task of finding a maximal number of 'independent' pairs (of surveyors and locations) is called the generalized (or: maximum) linear assignment problem, see

    http://en.wikipedia.org/wiki/Generalized_assignment_problem

and there appears to exist no efficient algorithm for it. One can think of procedures to find at least 5 such pairs (or to report infeasibility), but I think that with millions of records in one of those subsets any approach using loops will be intolerably slow. Perhaps you can rethink your original intention or try to provide additional information during the process of generating the data.

Of course, I would be glad to hear more positive news, as this kind of problem does show up in many discrete optimization tasks.

Hans Werner

> I reckon some of you clever folk out there must be able to think of a better solution. Any advice appreciated, Ben
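As a hedged suggestion going beyond the thread: for one fixed subset, the feasibility question itself can be answered with an ordinary assignment solver. Treating the 0/1 surveyor-by-location matrix as a reward matrix, solve_LSAP() from the 'clue' package with maximum = TRUE finds the assignment with the largest number of valid pairs; the subset is feasible exactly when that maximum is 5. For the problem matrix above it is only 2:

```r
# Sketch: check whether 5 disjoint surveyor-location pairs exist.
library(clue)
M <- rbind(c(1,0,0,0,0),   # rows: surveyors 1..5
           c(1,0,0,0,0),   # cols: locations A..E
           c(1,0,0,0,0),
           c(1,0,0,0,0),
           c(1,1,1,1,1))
sol <- solve_LSAP(M, maximum = TRUE)
npairs <- sum(M[cbind(seq_len(nrow(M)), sol)])  # valid pairs in best assignment
npairs  # 2 -- so no selection of 5 distinct surveyors is possible here
```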
Re: [R] Problem to compute a function with very large numbers
Nan Zhao nzhao at student.ethz.ch writes:

> Thank you Dennis for your explanations! The results you found are the same as mine: first an infinite result, followed by NaN. It seems that when a number becomes too small, R must round it to 0. Hence I was wondering if there might be a way to increase the number of decimal digits for extreme computations. I have tried to use double variables, but this didn't have much more success. There is no doubt about the function; it originates from a Kendall plot for measuring dependencies. Thank you in advance, Best Regards, Nan

See the thread "Problem with the precision of numbers" from January this year. For example, in

    http://finzi.psych.upenn.edu/Rhelp10/2010-January/225640.html

you will also find a hint on how to compute binomial coefficients with 'Rmpfr'. Integration will be a problem here, which you may have to replace with your own method of integration, such as Simpson's adaptive formula.

Regards, Hans Werner
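To illustrate the hint (my own example, not from the linked post): the 'Rmpfr' package provides chooseMpfr() for exact binomial coefficients where double precision overflows.

```r
# Sketch: big binomial coefficients with multiple-precision arithmetic.
library(Rmpfr)
choose(1500, 750)      # Inf -- overflows double precision
chooseMpfr(1500, 750)  # exact arbitrary-precision value
```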
Re: [R] Games
Silvano silvano at uel.br writes:

> Hi, I want to build the table of a football league with 11 teams, where all play each other, so there will be 55 games. Since there is an odd number of teams, in each round one team will not play.

The easy solution is moving around a table, with one team pausing:

    # Playing schedule for an odd number of teams
    n <- 5
    noTeams <- 2*n + 1
    noGames <- n * noTeams
    teams <- paste("T", 1:noTeams, sep="")
    rounds <- numeric(noGames)
    team1 <- team2 <- character(noGames)
    for (i in 1:noTeams) {
        for (j in 1:n) {
            k <- n*(i-1) + j
            rounds[k] <- i
            team1[k] <- teams[j+1]
            team2[k] <- teams[noTeams-j+1]
        }
        teams <- c(teams[2:noTeams], teams[1])
    }
    schedule <- data.frame(rounds=rounds, team1=team1, team2=team2)

Hans Werner

> The games will be:
>
>     games = urnsamples(1:11, x=c('A','B','C','D','E','F','G','H','I','J','K'),
>                        size=2, replace=FALSE, ordered=FALSE)
>     games
>
> As there will be five games per round, how can I build a table with all the championship rounds automatically? I thought about something like:
>
>     game1 = c(sample(11,2), sample(11,2), sample(11,2), sample(11,2), sample(11,2))
>
> but it doesn't work very well. Any suggestions?
>
> Silvano Cesar da Costa
> Departamento de Estatística
> Universidade Estadual de Londrina
> Fone: 3371-4346
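A quick sanity check of the rotation scheme above (my addition): regenerate the schedule and verify that every pair of teams meets exactly once.

```r
# Sketch: round-robin schedule for 2n+1 teams, then check all pairs occur once.
n <- 5; noTeams <- 2*n + 1
teams <- paste0("T", 1:noTeams)
team1 <- team2 <- character(0)
for (i in 1:noTeams) {
  team1 <- c(team1, teams[2:(n+1)])          # positions 2..n+1
  team2 <- c(team2, teams[noTeams:(n+2)])    # positions 2n+1..n+2
  teams <- c(teams[-1], teams[1])            # rotate; position 1 pauses
}
pairs <- apply(cbind(team1, team2), 1, function(p) paste(sort(p), collapse="-"))
length(pairs)            # 55 games
all(table(pairs) == 1)   # TRUE: each pairing appears exactly once
```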
Re: [R] Why use numFold in evaluate_Weka_classifier of RWeka
s0300851 s0300851 at tp.edu.tw writes:

> Hi everyone, I have a question about using the RWeka package. We know that make_Weka_classifier() can help us build a model, and evaluate_Weka_classifier() can help us evaluate the performance of the model on new data. But I have a question about how to use the parameter numFolds in evaluate_Weka_classifier(). Cross-validation means using some parts of the data to train and some parts to test, but it should be used in the step of building the model, not in evaluation. I try to think of numFolds = n in evaluate_Weka_classifier() as: randomly (but in proportion) select data from the dataset, redo this n times, then evaluate the performance. Is this correct?

No. It's preferable to learn about Weka directly from the Weka manual. About the number of folds ('numFolds') it says:

    "A more elaborate method is cross-validation. Here, a number of folds n is
     specified. The dataset is randomly reordered and then split into n folds
     of equal size. In each iteration, one fold is used for testing and the
     other n-1 folds are used for training the classifier. The test results
     are collected and averaged over all folds. This gives the cross-validation
     estimate of the accuracy."

Hans Werner

> Thanks. Best regards, Hsiao
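The quoted procedure is easy to mirror in plain R (a toy sketch of k-fold cross-validation with a majority-class "classifier", just to make the mechanics concrete; it is not RWeka code):

```r
# Sketch: 10-fold cross-validation by hand on the iris data.
set.seed(42)
k <- 10
folds <- sample(rep(1:k, length.out = nrow(iris)))  # random split into k folds
acc <- sapply(1:k, function(i) {
  train <- iris[folds != i, ]
  test  <- iris[folds == i, ]
  pred  <- names(which.max(table(train$Species)))   # majority class (toy model)
  mean(test$Species == pred)                        # accuracy on held-out fold
})
mean(acc)  # cross-validation estimate of the accuracy
```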
Re: [R] Optimization problem with nonlinear constraint
Uli Kleinwechter u.kleinwechter at uni-hohenheim.de writes:

> Dear Ravi, as I've already written to you, the problem indeed is to find a solution to the transcendental equation y = x * T^(x-1), given y and T; the optimization problem below was only a workaround.

I don't think optimization is the right approach for simply inverting a simple function. The inverse of the function x -> x * e^x is the Lambert W function. Substituting u = x*log(T) turns y = x * T^(x-1) into u * e^u = y*T*log(T), so the solution in your case is

    W(log(T)*y*T) / log(T)   # hope I transformed it correctly

Now, how to compute Lambert's W? Well, look into the 'gsl' package and, voila, there is the function lambert_W0(). Your example:

    y <- 3
    T <- 123
    library(gsl)
    lambert_W0(log(T)*y*T) / log(T)
    # [1] 1.191830

Regards, Hans Werner

> John C. Nash has been so kind as to help me out here. In case anyone faces a similar problem in the future, the solution looks as follows:
>
>     func1 <- function(y, x, T) {
>         out <- x * T^(x-1) - y
>         return(out)
>     }
>     # Assign the known values to y and T:
>     y <- 3
>     T <- 123
>     root <- uniroot(func1, c(-10, 100), y=y, T=T)
>     print(root)
>
> Somewhat simpler than I thought. Thanks again! Uli
Re: [R] Hurst Exponent Estimation
Lorenzo Isella lorenzo.isella at gmail.com writes:

> Dear All, I am a novice when it comes to time-series analysis, and at the moment I am interested in calculating the Hurst exponent of a time series.

Some time ago I tested some of the classical chaotic time series (such as the logistic map and others, no financial time series) with the available functions in R and Matlab. In my experience, Peng's method (realized in R as fArma::pengFit) works reasonably reliably and is more accurate than most of the others on these series. Unfortunately, the available R and Matlab implementations of this same method -- both referring back to the same literature article -- can give quite different results, with varying success on both sides. AFAIK, there is no function in TISEAN estimating the Hurst exponent.

Regards, Hans Werner

> This question has already been asked quite some time ago (http://bit.ly/98dZsi), and I trust some progress has been made since. I was able to find some functions in the packages
>
>     http://cran.r-project.org/web/packages/Rwave/index.html
>     http://cran.r-project.org/web/packages/fArma/index.html
>
> Allegedly, there should also be functions for this in the RTisean package,
>
>     http://cran.r-project.org/web/packages/RTisean/index.html
>
> but I have not been able to find them. Bottom line: if you have a time series (a list of empirical data of varying length, not necessarily sampled on a uniform time grid), what R tool would you use to estimate its Hurst exponent? Any suggestion is appreciated. Cheers, Lorenzo
Re: [R] Hurst Exponent Estimation
Spencer Graves spencer.graves at structuremonitoring.com writes:

> Have you tried something like the following:
>
>     library(sos)
>     H <- ???Hurst
>     summary(H)
>     H
>
> This identified 50 links in 15 packages and displayed the results in a table in a web browser, with links to the best match in the best package first. The installPackages() and writeFindFn2xls() functions are designed to produce an Excel file with a summary page that can help you identify the package that seems to be most actively maintained among the relevant packages, as explained in a vignette. Hope this helps. Spencer Graves

Dear Spencer,

sometimes I am a bit worried about answers that simply redirect the requester to searching the Internet or R sites. It fuels the impression that experience is not so much asked for nowadays.

Anyway, I looked into the 'fractal' package that comes up in your search. What I said before applies here as well: for none of the test series was the result of computing H nearly as accurate as what pengFit() returned. I really would be interested to hear from others about their experiences with financial time series.

Best, Hans Werner

> On 7/22/2010 12:25 AM, Hans W Borchers wrote: [...]
Re: [R] Double Integration
Bogaso Christofer bogaso.christofer at gmail.com writes:

> Hi Ravi, your suggestion helped me a lot as well. If I look into that function, I see that it calls another function:
>
>     .Call("doCubature", as.integer(fDim), body(f.check), as.double(lowerLimit),
>           as.double(upperLimit), as.integer(maxEval), as.double(absError),
>           as.double(tol), new.env(), PACKAGE = "cubature")
>
> How can I see the interior of this doCubature?

Find the original code for the 'cubature' package at

    http://ab-initio.mit.edu/wiki/index.php/Cubature

plus information on why the 'adapt' package had to be abandoned, and on the fact that 'cubature' is based on the same original algorithm of Genz and Malik, but using free and GPL'ed software. We should not bemoan the loss of the 'adapt' package; 'cubature' and 'R2Cuba' are worthy successors for adaptive quadrature.

Hans Werner
Re: [R] Double Integration
Sarah Sanchez sarah_sanchez09 at yahoo.com writes:

> Dear R helpers, I am working on bivariate normal distribution probabilities. I need to double integrate the following function (actually a simplified form of the bivariate normal density):
>
>     f(x, y) = exp[-0.549451 * (x^2 + y^2 - 0.6*x*y)]
>
> where 2.696 <= x <= 3.54 and -1.51 <= y <= 1.98, i.e. I need to solve something like
>
>     INTEGRAL (2.696 to 3.54) dx  INTEGRAL (-1.51 to 1.98) f(x, y) dy
>
> I have referred to stats::integrate, but it deals with only one variable. This example appears in Internal Credit Risk Models by Michael Ong (page 160). Kindly guide. Regards, Sarah

There has been the same request last month; the answer is still valid. Note that the first limit vector collects the lower limits for x and y, the second the upper limits:

    library(cubature)
    fun <- function(x) exp(-0.549451 * (x[1]^2 + x[2]^2 - 0.6*x[1]*x[2]))
    adaptIntegrate(fun, lowerLimit=c(2.696, -1.51), upperLimit=c(3.54, 1.98),
                   tol=1e-8)$integral
    # about 0.0172

Hans Werner
Re: [R] double integral
suman dhara suman.dhara89 at gmail.com writes:

> Sir, I want to calculate a double integral in R. Is there any function to do this?

If your domain of integration is a hypercube, try the packages 'cubature' or 'R2Cuba'. Otherwise, you will have to provide more information about your specific problem and/or function.

Hans Werner

> Regards, Suman Dhara
Re: [R] logarithmic integrals in R?
Oliver Kullmann O.Kullmann at swansea.ac.uk writes:

> Thanks for the information. What I meant were formulas like
>
>     \int 1/\log(t)^2 dt = -t/\log(t) + li(t)
>     \int 1/\log(t)^3 dt = 1/2 * ( -t/\log(t)^2 - t/\log(t) + li(t) )
>
> and higher forms that can be expressed through the Gamma function. I am certain I've seen them in Abramowitz and Stegun's handbook (where else?), but I cannot remember in which chapter or on which page.

Which logarithmic integrals do you really need, and on what range?

Regards, Hans Werner

> On Sat, May 29, 2010 at 01:15:29PM +0000, Hans W. Borchers wrote:
>> Oliver Kullmann O.Kullmann at swansea.ac.uk writes:
>>> Hello, I couldn't find information on whether the logarithmic integrals
>>>
>>>     Li_m(x) = integral_0^x log(t)^(-m) dt,  x >= 0,
>>>
>>> are available in R? [...] I found gsl at
>>> http://cran.r-project.org/web/packages/gsl/index.html
>>
>> The exponential integral is part of the 'gsl' package, so
>>
>>     library('gsl')
>>     x <- seq(2, 10, by=0.5)
>>     y <- expint_Ei(log(x))
>>     y
>>
>> See e.g. the Handbook of Mathematical Functions for how to reduce higher
>> logarithmic integrals.
>
> However, here I wasn't successful: going through the chapter
> http://www.math.ucla.edu/~cbm/aands/page_228.htm I didn't find any mention
> of the higher logarithmic integrals. [...] Also a Google search on "higher
> logarithmic integrals", "logarithmic integrals", or "li_n(x)" doesn't
> reveal anything, so I would be thankful for a hint. Thanks again! Oliver
Re: [R] logarithmic integrals in R?
Oliver Kullmann O.Kullmann at swansea.ac.uk writes: Hello, I couldn't find information on whether the logarithmic integrals Li_m(x) = integral_0^x log(t)^(-m) dt for x >= 0 are available in R? I saw your request only this weekend. The first logarithmic integral can be computed using the exponential integral Ei(x) via li(x) = Ei(log(x)). Exponential and elliptic integrals are part of the 'gsl' package, so
library('gsl')
x <- seq(2, 10, by=0.5)
y <- expint_Ei(log(x))
y
See e.g. the Handbook of Mathematical Functions for how to reduce higher logarithmic integrals. Another possibility is to use the Web API of 'keisan', the calculation library of Casio. Regards Hans Werner Best wishes Oliver
Re: [R] best polynomial approximation
I guess you may be looking for the Remez algorithm. AFAIK there is no implementation in one of the R packages. You can find FORTRAN code in the Collected Algorithms of the ACM (no. 604) which probably could be called from R. There appears to exist a discrete, equi-distant(?) version as function 'remez' in the signal package, if that is of any help to you. I have never used it. Regards, Hans Werner P.S.: The Chebyshev polynomials do not compute the best polynomial approximation, but they provide a nice way to estimate the maximal distance to this best approximating polynomial. Patrizio Frederic wrote: Dear R-users, I learned today that there exists an interesting topic in numerical analysis named best polynomial approximation (BPA). Given a function f, the BPA of degree k, say pk, is the polynomial such that pk = arg inf_p sup |f - p|. Although, given some regularity conditions on f, pk is unique, pk IS NOT calculated with least squares. A quick Google tour shows a rich field of research and many algorithms proposed for computing such a task. I was wondering if some of you know about R implementations (packages) for computing BPA. Many thanks in advance, Patrizio (as usual I apologize for my fragmented English) -- | Patrizio Frederic, PhD | Assistant Professor, | Department of Economics, | University of Modena and Reggio Emilia, | Via Berengario 51, | 41100 Modena, Italy | tel: +39 059 205 6727 | fax: +39 059 205 6947 | mail: patrizio.frede...@unimore.it
Re: [R] Find a rectangle of maximal area
Hans W Borchers hwborchers at googlemail.com writes: For an application in image processing -- using R for statistical purposes -- I need to solve the following task: Given n (e.g. n = 100 or 200) points in the unit square, more or less randomly distributed. Find a rectangle of maximal area within the square that does not contain any of these points in its interior. If a, b are height and width of the rectangle, other constraints may have to be imposed, such as a, b <= 0.5 and/or 0.5 <= a/b <= 2.0. The rectangle is allowed to touch the border of the square. And yes, the sides of the rectangle shall be parallel to the sides of the enclosing unit square (which could be a rectangle of some size, too). snip Thanks in advance for any suggestions, Hans Werner Erwin Kalvelagen erwin.kalvelagen at gmail.com writes: I solved this with a simple-minded MINLP formulation using BARON (a global solver). This seems to produce solutions relatively quickly (somewhat slower for n=200). Actually this solved more easily than I expected. See: Dear Erwin, yes, it is possible to emulate an exhaustive search by applying binary variables and utilizing an MI(N)LP solver. What did you need the 'non-linearity' for? (I am asking as you did not disclose your model.) The examples on your blog do not take into account that the ratio of longer to shorter side length of the rectangle shall be smaller than 2. Would it be difficult to add this restriction to your model? Unfortunately, there is no free MINLP solver available. In the past I have called a Python program to utilize solvers at NEOS. Probably it would be possible to write a similar R function to do this. Still I believe that a clever approach might be possible, avoiding the need to call a commercial solver. I am getting this hope from one of Jon Bentley's articles in the series Programming Pearls.
Regards, Hans Werner P.S.: If you copy my request into your blog, wouldn't it be nice to add a pointer back to the R-help entry where this question has been asked? http://yetanothermathprogrammingconsultant.blogspot.com/2010/03/looks-difficult-to-me-2.html Erwin Kalvelagen Amsterdam Optimization Modeling Group erwin at amsterdamoptimization.com http://amsterdamoptimization.com
Re: [R] Find a rectangle of maximal area
Barry Rowlingson b.rowlingson at lancaster.ac.uk writes: On Mon, Mar 22, 2010 at 4:28 PM, Hans W Borchers hwborchers at googlemail.com wrote: Still I believe that a clever approach might be possible avoiding the need to call a commercial solver. I am getting this hope from one of Jon Bentley's articles in the series Programming Pearls. Is this the 'Largest Empty Rectangle' problem? http://en.wikipedia.org/wiki/Largest_empty_rectangle Dear Barry, thanks for this pointer. I never suspected this problem could have a name of its own. Rethinking the many possible applications makes it clear: I should have searched for it before. I looked in some of the references of the late 80s and found two algorithms that appear to be appropriate for implementation in R. The goal is to solve the problem for n=200 points in less than 10-15 secs. Thanks again, Hans Werner I had a look at some of the references from Wikipedia, but they all follow a similar pattern, one I have noticed in many computer science journal articles: 1. State a problem that looks tricky. 2. Say "We have an efficient algorithm for the problem stated in #1". 3. Proceed to derive, using much algebra and theory, the efficient algorithm. 4. Stop. The idea of actually producing some dirty, filthy, actual code to implement their shiny algorithms never seems to cross their minds. I also found a similar question from 2008 asked on the R-sig-geo mailing list. That didn't get much help either! Sorry. Barry
[R] Find a rectangle of maximal area
For an application in image processing -- using R for statistical purposes -- I need to solve the following task: Given n (e.g. n = 100 or 200) points in the unit square, more or less randomly distributed. Find a rectangle of maximal area within the square that does not contain any of these points in its interior. If a, b are height and width of the rectangle, other constraints may have to be imposed, such as a, b <= 0.5 and/or 0.5 <= a/b <= 2.0. The rectangle is allowed to touch the border of the square. For each new image the points will be identified by the application, like all stars of a certain brightness on an astronomical picture. So the task will have to be performed several times. I assume this problem is computationally hard. I would like to find a solution that is reasonably fast for n = 100..200 points. Exhaustive search along the x, y coordinates of the points will not be fast enough. I know this request is not about R syntax and does not contain 'repro-code'. But perhaps a somehow similar problem has been solved before. Thanks in advance for any suggestions, Hans Werner P.S.: Example -- Is this rectangle of maximal area?
n <- 100; set.seed(832)
x <- runif(n); y <- runif(n)
plot(c(0,1), c(0,1), type="n", axes=FALSE, asp=1, xlab="", ylab="",
     main="Rectangle Problem", sub="")
rect(0, 0, 1, 1, col="darkblue")
xl <- 100; yu <- 43; xr <- 40; yo <- 3
area <- (x[xr] - x[xl]) * (y[yo] - y[yu])
rect(x[xl], y[yu], x[xr], y[yo], lty=2, col="darkblue", border="red")
text((x[xl] + x[xr])/2, (y[yu] + y[yo])/2, paste("area =", round(area, 4)),
     cex=0.75, col="red")
points(x, y, pch=20, col="white")
Re: [R] Sorting
David Neu david at davidneu.com writes: Hi, I have a list of vectors (of varying lengths). I'd like to sort this list by applying a function to each pair of vectors in the list and returning information to the sorting routine that lets it know which one is larger. To solve problems like this in Common Lisp, the sort function accepts a function as an argument. The arguments to this function are two elements of the list which is being sorted. The writer of the function returns t (TRUE in R) when the first argument to the function is larger than the second and nil (FALSE in R) otherwise. I'm wondering if there is some way to accomplish this in R. Would the following function do what you want?
sortList <- function(L, fun) L[order(sapply(L, fun))]
Here is my test and my understanding of your request:
L <- list()   # define a list of vectors of varying length
for (i in 1:10) { n <- sample(1:10, 1); L[[i]] <- runif(n) }
Ls <- sortList(L, mean)
sapply(Ls, mean)   # increasing mean values
Hans Werner Many thanks for any help! Cheers, David
Re: [R] Sorting
David Neu david at davidneu.com writes: David Neu david at davidneu.com writes: Hi, I have a list of vectors (of varying lengths). I'd like to sort this list by applying a function to each pair of vectors in the list and returning information to the sorting routine that lets it know which one is larger. To solve problems like this in Common Lisp, the sort function accepts a function as an argument. The arguments to this function are two elements of the list which is being sorted. The writer of the function returns t (TRUE in R) when the first argument to the function is larger than the second and nil (FALSE in R) otherwise. I'm wondering if there is some way to accomplish this in R. I don't know whether there is a way to do it with the 'base::sort' function -- and I too would very much like to know for an application of my own -- but you can always define your own sorting, like here a simple bubble sort:
bubbleSort.list <- function(L, comp) {
  stopifnot(is.list(L))
  n <- length(L)
  if (n <= 1) return(L)
  for (i in 1:n) {
    for (j in 2:n) {
      b <- L[[j]]
      if (comp(L[[j]], L[[j-1]])) {
        L[[j]] <- L[[j-1]]
        L[[j-1]] <- b
      }
    }
  }
  return(L)
}
If your comparing function, for example, compares first length and then mean:
comp <- function(L1, L2) {
  if (length(L1) < length(L2) ||
      (length(L1) == length(L2) && mean(L1) < mean(L2)))
    return(TRUE)
  else
    return(FALSE)
}
then the following test example will turn out to be correct:
L <- list()
for (i in 1:100) L[[i]] <- runif(sample(1:20, 1))
Ls <- bubbleSort.list(L, comp)
is.unsorted(sapply(Ls, length))   # incl. mean for equal length
If bubblesort is too slow, implement your own heapsort or quicksort. Many thanks for any help! Cheers, David
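For an O(n log n) alternative, a comparator-based merge sort is equally easy to write in R. The following is a sketch of my own (the function name is made up, not from the thread) that works with the same comp() function, where comp(a, b) returns TRUE iff a must come before b:

```r
# Hedged sketch: stable merge sort for lists with a user-supplied comparator.
mergeSort.list <- function(L, comp) {
  n <- length(L)
  if (n <= 1) return(L)
  mid <- n %/% 2
  left  <- mergeSort.list(L[1:mid], comp)
  right <- mergeSort.list(L[(mid + 1):n], comp)
  out <- vector("list", n); i <- j <- 1
  for (k in 1:n) {
    # take from left if right is exhausted, or left is available and
    # right's head does not strictly precede left's head (keeps stability)
    if (j > length(right) ||
        (i <= length(left) && !comp(right[[j]], left[[i]]))) {
      out[[k]] <- left[[i]]; i <- i + 1
    } else {
      out[[k]] <- right[[j]]; j <- j + 1
    }
  }
  out
}
```

The same test as above, with mergeSort.list() substituted for bubbleSort.list(), should return FALSE from is.unsorted().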
Re: [R] Solving an optimization problem: selecting an "optimal" subset
Erwin Kalvelagen-2 wrote: Hans W Borchers hwborchers at googlemail.com writes:
# Prepare inputs for MILP solver
obj <- c(rep(0, n), 0, 1, 1, 0)
typ <- c(rep("B", n), "B", "C", "C", "B")
mat <- matrix(c(s, -z, -1, 1, 0,          # a = a_p - a_m
                rep(0, n), 1, 0, 0, 0,    # constant term
                rep(0, n), 0, 1, 0, -M,   # a_p <= M * d0
                rep(0, n), 0, 0, 1, M,    # a_m <= M * (1 - d0)
                rep(1, n), 0, 0, 0, 0),   # subset size <= k
              nrow=5, byrow=TRUE)
dir <- c("==", "==", "<=", "<=", "<=")
rhs <- c(0, 1, 0, M, k)
max <- FALSE
You can drop the binary variable d0. The condition "one of a_p, a_m is zero" holds automatically as we are minimizing a_p + a_m. Erwin Kalvelagen Right; for me adding M and d0 is kind of a reflex, but that is only necessary when a_p + a_m is used as an intermediate result. One more remark: I saw some spurious behavior with both solvers (Rsymphony, Rglpk) -- that is, slightly different results in different runs. It could relate to the tolerance with which these solvers compare and identify solutions. At the moment I don't know how to change the tolerance as a parameter. I guess this will not happen when using a more powerful solver such as CPLEX. Hans Werner
Re: [R] Solving an optimization problem: selecting an "optimal" subset
Dimitri Shvorob dimitri.shvorob at gmail.com writes: Given a vector of numbers x, I wish to select an n-subset with sum closest to a fixed value s. Can anyone advise me how to approach this, in R? I have considered the Rcplex package, which handles integer/binary linear/quadratic optimization problems, but have difficulty setting up the quadratic form for [sum(x) - s]^2. (Dynamic programming over [0, sum(x)]? A genetic algorithm? Can anyone contribute a binary GA optimization sample?) Here is a solution completely done with R and the MILP solver Symphony provided in the 'Rsymphony' package (but 'Rglpk' will work the same). It finds subsets of size <= k in a set s of size n. The absolute deviation from the target value z is modeled by two additional continuous variables (a_p and a_m) and one binary variable d0. To force positivity, the common Big-M trick is employed. Hans Werner
library(Rsymphony)
n <- 100; k <- 20               # Find k-subset in set of size n
z <- 2.0                        # whose elements sum up to z
set.seed(8232); s <- runif(n)   # reference example
M <- sum(s) + z                 # for the Big-M trick
# Prepare inputs for MILP solver
obj <- c(rep(0, n), 0, 1, 1, 0)
typ <- c(rep("B", n), "B", "C", "C", "B")
mat <- matrix(c(s, -z, -1, 1, 0,          # a = a_p - a_m
                rep(0, n), 1, 0, 0, 0,    # constant term
                rep(0, n), 0, 1, 0, -M,   # a_p <= M * d0
                rep(0, n), 0, 0, 1, M,    # a_m <= M * (1 - d0)
                rep(1, n), 0, 0, 0, 0),   # subset size <= k
              nrow=5, byrow=TRUE)
dir <- c("==", "==", "<=", "<=", "<=")
rhs <- c(0, 1, 0, M, k)
max <- FALSE
# Apply Symphony as our MILP solver
sol <- Rsymphony_solve_LP(obj, mat, dir, rhs, types=typ, max=max)
x <- sol$solution[1:n]          # Solution vector describing the k-subset
( i <- (1:n)[x == 1] )          #=> [1] 18 25 44 81 95
print(sum(s[i]), digits=10)     #=> [1] 2.00205
Re: [R] Solving an optimization problem: selecting an optimal subset
Dimitri Shvorob wrote: Same request to Hans: I am afraid I need a little more spoon-feeding following "I sent a GAMS script modeling this problem to the NEOS solvers". Thanks a lot! If you have access to CPLEX (I mean the commercial program, not Rcplex, which is just an interface to it), then I would suggest following Erwin's proposal. My GAMS file looks similar to the one Erwin posted. For instructions on how to send it to the NEOS solvers, see http://neos.mcs.anl.gov/neos/solvers/. To replace an absolute value by two binary variables is an old trick in optimization modeling. Here it will transform a subset sum problem into an MILP task, solvable in R with lpSolve, Rglpk or Rsymphony. But with 100 binary variables it may be too much for these packages. I will give it a try today. Hans Werner
Re: [R] Solving an optimization problem: selecting an quot;optimalquot; subset
Dimitri Shvorob dimitri.shvorob at gmail.com writes: Is it a subset of a vector containing 100 elements, or 1000s? I need to pick 2-40 elements out of a 50-200-element-long vector. A random number of elements that should be chosen, or the best 10 values which sum up to a defined value? The best 10 values. I still think that Rcplex is the way to go; what's missing is some linear-algebra expertise on my part to set up the problem as quadratic. This is a subset sum problem and has been discussed here in December for integers. For floats it is even more difficult, as there is no reliable stopping criterion. Can you settle for an approximate solution? If you insist on the true minimum, you will have to live with the combinatorial explosion of searching through all (appropriate) subsets. Rcplex: This is a combinatorial problem and cannot be formulated as a quadratic optimization problem. It is NP-hard and cannot be solved via Dynamic Programming. As a combinatorial problem, it also will not yield to GA approaches, at least not to find the minimum reliably. Hans Werner P.S.: Example (the solution maybe is the best possible)
set.seed(8232)
x <- runif(100)   # Find subset that sums up close to 2.0 !
i <- c(84,54,11,53,88,12,26,45,25,62,96,23,78,77,66,1)
sum(x[i])   #=> 2.000451
Re: [R] Solving an optimization problem: selecting an "optimal" subset
Dimitri Shvorob dimitri.shvorob at gmail.com writes: This is a subset sum problem and has been discussed here in December Thanks a lot! Will investigate. Can you settle for an approximate solution? Absolutely. You can use the script from the thread "subset sum problem" to find approximate solutions. The trick is to round and multiply the numbers in the set with some power of 10 and then solve as an integer problem. I did that for the example and found a solution 2.02 for target 2.0 with a subset of length 15. Rcplex: This is a combinatorial problem and cannot be formulated as a quadratic optimization problem. If the objective function can fit the pattern, we need to find the set of n coefficients, taking values 0 or 1, summing to m, for the m-out-of-n problem. The 'binary' version of Rcplex apparently would be able to handle that. Right, you can force CPLEX to perform the exhaustive search itself. I would not call this quadratic programming, but it might be quite fast. Taking my example set.seed(8232); x <- runif(100) with target value 2.0 and subsets of fixed length 10, I sent a GAMS script modeling this problem to the NEOS solvers and it returned the solution
i <- c(7,10,25,32,45,76,77,78,81,96)
sum(x[i])   # [1] 2.06
within 0.004 seconds of execution time for the BARON MINLP solver. It is NP-hard and cannot be solved via Dynamic Programming. Why not? Discretize the [0, sum(x)] range and solve an m-step DP problem. The value function would minimize the distance from s, and penalize too-short (m* < m) subsets. No, this will not work. Hans Werner Thanks again!
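The round-and-scale trick mentioned above can be sketched in a few lines of R (my own illustration, not the script from the "subset sum problem" thread): multiply by a power of 10, round to integers, then run the usual 0/1 dynamic program over reachable sums.

```r
# Hedged sketch: approximate subset sum for reals by scaling to integers.
# 'digits' controls the rounding precision (an assumption of this sketch).
approxSubsetSum <- function(x, target, digits = 2) {
  f <- 10^digits
  xi <- round(x * f); ti <- round(target * f)
  reach <- c(TRUE, rep(FALSE, ti))      # reach[s+1]: sum s is achievable
  for (v in xi[xi > 0 & xi <= ti]) {
    s <- which(reach) - 1 + v           # sums extended by element v
    reach[s[s <= ti] + 1] <- TRUE
  }
  (max(which(reach)) - 1) / f           # best achievable sum <= target
}
set.seed(8232)
approxSubsetSum(runif(100), 2.0)        # close to 2.0 from below
```

Larger 'digits' gives a tighter approximation at the cost of a longer reachability vector.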
Re: [R] problem with the precision of numbers
kayj kjaja27 at yahoo.com writes: Hi All, thank you all for your help. I have tried Bill's script and it works! So I am trying to think what the problem was, and it looks like it is the precision. So you defined a function of the precision and evaluated it at precision=500. Bill, I was wondering how did you choose 500? Is it arbitrary? Your question was "I was wondering if it is possible to increase the precision using R", but you never told us which accuracy you really need. As a rule of thumb, the number of correct decimal digits is about 0.3 * precision, so for 50 digits you could set the precision to 256 to be on the safe side. And if you are wondering how to compute the binomial coefficients with 'Rmpfr', here's one possibility to do that:
require(Rmpfr)
mpfrChoose <- function(a, b, prec=128) {
  m1 <- mpfr(1, prec=prec)
  # r <- gamma(m1+a) / (gamma(m1+b) * gamma(m1+a-b))
  # if (is.whole(r)) return(r) else return(round(r))
  gamma(m1+a) / (gamma(m1+b) * gamma(m1+a-b))
}
An advantage of using 'Rmpfr' is that the power of R can be applied, for example vectorization, so avoid 'for' loops if possible:
pr <- 256
m_1 <- mpfr(-1, pr)
m1 <- mpfr(1, pr)
i <- mpfr(0:80, pr)
s <- sum((m_1^i) * mpfrChoose(80, i, prec=pr) * (m1 - (i+1)*1/200)^200)
print(s, digits=50)
# 1 'mpfr' number of precision 256 bits
# [1] 6.6568524789662037699673275771182135789699510194000e-20
Hans Werner thanks again for your help
Re: [R] problem with the precision of numbers
Ted.Harding at manchester.ac.uk writes: [...] I suspect this is an invented computation -- the 3456 strikes me as unlikely (it reminds me of my habitual illustrative use of set.seed(54321)). There is a definite problem with the development given by kayj. When k=2000 and i=k, the formula requires evaluation of 3456*(2^3000); on a log10 scale this is log10(3456) + 3000*log10(2) = 906.6286. Since R gives up at 10^308.25471 = 1.79767e+308 (10^308.25472 = Inf), this algorithm is going to be tricky to evaluate! Just to finish this off: 'Rmpfr' works fine with this example case and computes the total sum in less than 2 seconds:
library(Rmpfr)
j <- mpfr(-1, 120)
i <- mpfr(0:2000, 120)
s <- sum(j^i * 3456 * (1 + i*1/2000)^3000)
s
# 1 'mpfr' number of precision 120 bits
# [1] 2.8876826480125324342665158348085465188e906
which is the same result that Maple returns. But Maple (I admit, mine is quite old) and 'bc' (couldn't wait for the answer) are much slower at it. It is said that 'gmp' (on which Rmpfr is based) is faster than any other multi-precision library on earth --- this seems to be true. Hans Werner I don't know how well Rmpfr copes with very large numbers (the available documentation seems cryptic). However, I can go along with the recommendation in the URL that Ben gives, to use 'bc' (Berkeley Calculator), available on unix[oid] systems since a long time ago. That is an old friend of mine, and works well (it can cope with exponents up to X^2147483647 in the version I have). It can eat for breakfast the task of checking whether Kate Bush can accurately sing pi to 117 significant figures: http://www.absolutelyrics.com/lyrics/view/kate_bush/pi (Try it in R). Ted.
E-Mail: (Ted Harding) Ted.Harding at manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 19-Jan-10 Time: 18:41:27 -- XFMail --
Re: [R] optimization problem
Ravi Varadhan rvaradhan at jhmi.edu writes: Interesting! Now, if I change the cost matrix, D, in the LSAP formulation slightly such that it is quadratic, it finds the best solution to your example: Dear Ravi, I thought your solution was ingenious, but after the discussion with Erwin Kalvelagen I found two things quite irritating: (1) Why is solve_LSAP(D) different from solve_LSAP(D^2) in Erwin's example? I believed just squaring the distance matrix would make no difference to solving the LSAP -- except for some numerical instability, which does not seem to be the case here. (2) If you change rows and sums in the definition of D, that is D[j, i] <- sqrt(sum((B[, j] - A[, i])^2)), then the solution to Erwin's example comes out right even when keeping the square root. Do you have explanations for these 'phenomena'? Otherwise, I think, there will remain some doubts about this approach. Very best Hans Werner
pMatrix.min <- function(A, B) {
  # finds the permutation P of A such that ||PA - B|| is minimum
  # in Frobenius norm
  # Uses the linear-sum assignment problem (LSAP) solver
  # in the clue package
  # Returns P%*%A and the permutation vector `pvec' such that
  # A[pvec, ] is the permutation of A closest to B
  n <- nrow(A)
  D <- matrix(NA, n, n)
  for (i in 1:n) {
    for (j in 1:n) {
      # D[j, i] <- sqrt(sum((B[j, ] - A[i, ])^2))
      D[j, i] <- sum((B[j, ] - A[i, ])^2)   # this is better
    }
  }
  vec <- c(solve_LSAP(D))
  list(A=A[vec, ], pvec=vec)
}
X <- pMatrix.min(A, B)
X$pvec
# [1] 6 1 3 2 4 5
dist(X$A, B)
# [1] 10.50172
This should be fine. Any counter-examples to this?! Best, Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvaradhan at jhmi.edu
Re: [R] optimization problem
Ravi Varadhan rvaradhan at jhmi.edu writes: Dear Hans, I agree with your comments. My intuition was that the quadratic form would be better behaved than the radical form (less nonlinear!?). So, I was hoping to see a change in behavior when the cost function was altered from a radical (i.e. sqrt) form to quadratic, but I was still surprised to actually see it. I don't have a good explanation for this. I came up with the idea of solving Klaus' problem as an LSAP problem. I did not know that the LSAP approach to this and similar problems had already been considered by Nick Higham. I asked Nick last night about this problem, thinking that he might know of more direct solutions (e.g. some kind of SVD or related factorizations). He said that I should take a look at the PhD thesis of one of his students. Thanks for pointing out the thesis. As I understand it, the (one-sided) Procrustes problem is finding an orthogonal matrix minimizing min! || A R - B || , t(R) R = I , with ||.|| the Frobenius norm, and there is a substantial theory behind it in numerical linear algebra. The basic literature appears to be Gower, J.C.; Dijksterhuis, G.B., Procrustes Problems, Oxford Statistical Science Series, Oxford University Press, 2004. The thesis itself will lead us astray, as it treats the two-sided Procrustes problem, which is not our main concern. The request on R-help only asked for permutation matrices P (from left or right) minimizing min! || P A - B || so I still think that a direct approach as you have suggested is possible, given this apparently much simpler problem. Take your definition of D with quadratic terms: the LSAP approach will find a permutation of the rows of B, say Bp, such that the (linear!) sum over D_{i,p(i)} is minimal, that is sum over rows i { sum_j (Bp - A)_{ij}^2 } = sum over all entries { (Bp - A)^2 }, which is exactly the square of the Frobenius norm.
If you instead apply your first definition with square roots, it is sum over rows i { sqrt( sum_j (Bp - A)_{ij}^2 ) }, and this cannot be expanded to the square of the Frobenius norm. Therefore, the quadratic approach is indeed correct and will lead to a true optimum, while the first variant will not. Hans Werner Take a look at Section 3.5 of the PhD thesis "Parallel Solution of SVD-Related Problems, With Applications": http://www.maths.manchester.ac.uk/~higham/misc/past-students.php This thesis proposed algorithms for the current problem and different versions of it under the heading of Procrustes-type problems. I have to read this and get a better handle on it. I would not be able to get to this for another two weeks. If you have any insights from this in the meanwhile, do share them with us. Best regards, Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvaradhan at jhmi.edu
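The argument is easy to check numerically with solve_LSAP() from the 'clue' package. The following is a sketch of my own (the example matrices are made up), mirroring the cost-matrix construction from pMatrix.min() above and comparing the LSAP permutation against random permutations:

```r
# Hedged sketch: LSAP on squared row distances minimizes ||A[p, ] - B||_F.
library(clue)
set.seed(1)
B <- matrix(rnorm(24), 6, 4)
A <- B[sample(6), ] + 0.01 * matrix(rnorm(24), 6, 4)  # noisy row shuffle of B
D <- matrix(NA, 6, 6)
for (i in 1:6) for (j in 1:6) D[j, i] <- sum((B[j, ] - A[i, ])^2)
vec <- c(solve_LSAP(D))              # same orientation as in pMatrix.min()
fnorm <- function(M) sqrt(sum(M^2))  # Frobenius norm
f0 <- fnorm(A[vec, ] - B)
# no random permutation should beat the LSAP solution
all(replicate(100, fnorm(A[sample(6), ] - B)) >= f0 - 1e-12)
```

Replacing the squared distances by their square roots in D and rerunning the check should occasionally produce a worse permutation, illustrating the point above.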
Re: [R] how to count the total number of (INCLUDING overlapping) occurrences of a substring within a string?
Gabor Grothendieck ggrothendieck at gmail.com writes: Use a zero-width lookahead expression. It will not consume its match. See ?regexp
gregexpr("a(?=a)", "aaa", perl = TRUE)
[[1]]
[1] 1 2
attr(,"match.length")
[1] 1 1
I wonder how you would count the number of occurrences of, for example, 'aba' or 'a.a' (*) in the string "ababacababab" using simple lookahead? In Perl, there is a modifier '/g' to do that, and in Python one could apply the function 'findall'. When I had this task, I wrote a small function findall(), see below, but I would be glad to see a solution with lookahead only. Regards Hans Werner (*) or anything more complex
findall <- function(apat, atxt) {
  stopifnot(length(apat) == 1, length(atxt) == 1)
  pos <- c()   # positions of matches
  i <- 1; n <- nchar(atxt)
  found <- regexpr(apat, substr(atxt, i, n), perl=TRUE)
  while (found > 0) {
    pos <- c(pos, i + found - 1)
    i <- i + found
    found <- regexpr(apat, substr(atxt, i, n), perl=TRUE)
  }
  return(pos)
}
On Sun, Dec 20, 2009 at 1:43 AM, Jonathan jonsleepy at gmail.com wrote: Last one for you guys: The command:
length(gregexpr('cus', 'hocus pocus')[[1]])
[1] 2
returns the number of times the substring 'cus' appears in 'hocus pocus' (which is two). It's returning the number of **disjoint** matches. So:
length(gregexpr('aa', 'aaa')[[1]])
[1] 1
returns 1. **What I want to do:** I'm looking for a way to count all occurrences of the substring, including overlapping sets (so 'aa' would be found in 'aaa' two times, because the middle 'a' gets counted twice). Any ideas would be much appreciated!! Signing off and thanks for all the great assistance, Jonathan
Re: [R] how to count the total number of (INCLUDING overl apping) occurrences of a substring within a string ?
Gabor Grothendieck ggrothendieck at gmail.com writes: Try this:
findall("aba", "ababacababab")
[1] 1 3 7 9
gregexpr("a(?=ba)", "ababacababab", perl = TRUE)
[[1]]
[1] 1 3 7 9
attr(,"match.length")
[1] 1 1 1 1
findall("a.a", "ababacababab")
[1] 1 3 5 7 9
gregexpr("a(?=.a)", "ababacababab", perl = TRUE)
[[1]]
[1] 1 3 5 7 9
attr(,"match.length")
[1] 1 1 1 1 1
Thanks --- somehow I did not realize that the expression in (?=...) can also be regular. My original problem was to find all three-character matches where the first and the last one are the same. With findall() it works like:
findall("(.).\\1", "ababacababab")
# [1] 1 2 3 5 7 8 9 10
I am still not able to reproduce this with lookahead. Attempts with
gregexpr("(.)(?=.\\1)", "ababacababab", perl = TRUE)
do not work, as the lookahead expression apparently does not know about the captured group from before. Regards Hans Werner Correction: I meant the '\G' metacharacter in Perl, not a modifier. On Sun, Dec 20, 2009 at 7:22 AM, Hans W Borchers hwborchers at googlemail.com wrote: Gabor Grothendieck ggrothendieck at gmail.com writes: [Sorry; Gmane forces me to delete more quoted text.]
Re: [R] Subset sum problem.
Geert Janssens janssens-geert at telenet.be writes: Hi, I'm quite new to the R-project. I was suggested to look into it because I am trying to solve the Subset sum problem, which basically is: Given a set of integers and an integer s, does any non-empty subset sum to s? (See http://en.wikipedia.org/wiki/Subset_sum_problem) I have been searching the web for quite some time now (which is how I eventually discovered that my problem is called subset sum), but I can't seem to find an easily applicable implementation. I did search the list archive, the R website and used the help.search and apropos functions. I'm afraid nothing obvious showed up for me. Has anybody tackled this issue before in R? If so, I would be very grateful if you could share your solution with me. Is it really true that you only want to see a Yes or No answer to this question whether a subset sums up to s --- without learning which numbers this subset is composed of (the pure SUBSET SUM problem)? Then the following procedure does that in a reasonable amount of time (returning 'TRUE' or 'FALSE' instead of Y-or-N):

# Exact algorithm for the SUBSET SUM problem
exactSubsetSum <- function(S, t) {
  S <- S[S <= t]
  if (sum(S) < t) return(FALSE)
  S <- sort(S, decreasing=TRUE)
  n <- length(S)
  L <- c(0)
  for (i in 1:n) {
    L <- unique(sort(c(L, L + S[i])))
    L <- L[L <= t]
    if (max(L) == t) return(TRUE)
  }
  return(FALSE)
}

# Example with a set of cardinality 64
amount <- 4748652
products <- c(30500,30500,30500,30500,42000,42000,42000,42000,
              42000,42000,42000,42000,42000,42000,71040,90900,
              76950,35100,71190,53730,456000,70740,70740,533600,
              83800,59500,27465,28000,28000,28000,28000,28000,
              26140,49600,77000,123289,27000,27000,27000,27000,
              27000,27000,8,33000,33000,55000,77382,48048,
              51186,4,35000,21716,63051,15025,15025,15025,
              15025,80,111,59700,25908,829350,1198000,1031655)

# Timing is not that bad
system.time( sol <- exactSubsetSum(products, amount) )
#   user  system elapsed
#  0.516   0.096   0.673
sol
# [1] TRUE

Thank you very much.
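The reachable-sums idea in exactSubsetSum() is language-agnostic; here is a compact sketch of the same decision procedure in Python (the function name exact_subset_sum and the toy data are mine; a set replaces the sorted vector L):

```python
def exact_subset_sum(S, t):
    """Decide whether some subset of S sums to t.

    Keeps the set of partial sums reachable so far; each element s
    extends it by {x + s}, pruned to values <= t, exactly as the
    vector L is grown and pruned in the R version.
    """
    S = [s for s in S if s <= t]
    if sum(S) < t:
        return False
    reachable = {0}
    for s in sorted(S, reverse=True):
        reachable |= {x + s for x in reachable if x + s <= t}
        if t in reachable:
            return True
    return t in reachable

print(exact_subset_sum([3, 34, 4, 12, 5, 2], 9))   # True  (4 + 5)
print(exact_subset_sum([3, 34, 4, 12, 5, 2], 30))  # False
```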
Geert __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Subset sum problem.
Geert Janssens janssens-geert at telenet.be writes: On Wednesday 9 December 2009, Hans W Borchers wrote: Geert Janssens janssens-geert at telenet.be writes: [ ... ] Has anybody tackled this issue before in R? If so, I would be very grateful if you could share your solution with me. Is it really true that you only want to see a Yes or No answer to this question whether a subset sums up to s --- without learning which numbers this subset is composed of (the pure SUBSET SUM problem)? Then the following procedure does that in a reasonable amount of time (returning 'TRUE' or 'FALSE' instead of Y-or-N): Unfortunately no. I do need the numbers in the subset. But thank you for presenting this code. Geert Okay then, here we go. But don't tell me later that your requirement was to generate _all_ subsets that add up to a certain amount; I will generate only one (with the largest elements). For simplicity I assume that the set is prepared such that it is decreasingly ordered, has no elements larger than the amount given, and has a total sum larger than this amount.
# Assume S decreasing, no elements > t, total sum >= t
solveSubsetSum <- function(S, t) {
  L <- c(0)
  inds <- NULL
  for (i in 1:length(S)) {
    # L <- unique(sort(c(L, L + S[i])))
    L <- c(L, L + S[i])
    L <- L[L <= t]
    if (max(L) == t) {
      inds <- c(i)
      t <- t - S[i]
      while (t > 0) {
        K <- c(0)
        for (j in 1:(i-1)) {
          K <- c(K, K + S[j])
          if (t %in% K) break
        }
        inds <- c(inds, j)
        t <- t - S[j]
      }
      break
    }
  }
  return(inds)
}

# former example
amount <- 4748652
products <- c(30500,30500,30500,30500,42000,42000,42000,42000,
              42000,42000,42000,42000,42000,42000,71040,90900,
              76950,35100,71190,53730,456000,70740,70740,533600,
              83800,59500,27465,28000,28000,28000,28000,28000,
              26140,49600,77000,123289,27000,27000,27000,27000,
              27000,27000,8,33000,33000,55000,77382,48048,
              51186,4,35000,21716,63051,15025,15025,15025,
              15025,80,111,59700,25908,829350,1198000,1031655)

# prepare set
prods <- products[products <= amount]   # no elements > amount
prods <- sort(prods, decreasing=TRUE)   # decreasing order

# now find one solution
system.time(is <- solveSubsetSum(prods, amount))
#   user  system elapsed
#  0.320   0.032   0.359
prods[is]
#  [1]  70740  70740  71190  76950  77382      8  83800
#  [8]  90900 456000 533600 829350    111 1198000
sum(prods[is]) == amount
# [1] TRUE

Note that running times and memory needs will be much higher when more summands are needed. To mention that too: I have not tested the code extensively. Regards Hans Werner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
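An alternative way to recover one subset, sketched in Python under the assumption that storing a predecessor for every reachable sum is acceptable memory-wise: this trades the inner re-computation loop of solveSubsetSum() for a parent table and a single backtracking pass (the function name and the small test data are mine):

```python
def solve_subset_sum(S, t):
    """Return indices (into S) of one subset summing to t, or None.

    parent[v] = (previous_sum, index_used) records one way to reach
    each attainable sum v <= t; backtracking from t yields a subset.
    """
    parent = {0: None}
    for i, s in enumerate(S):
        # iterate over a snapshot so each element is used at most once
        for v in list(parent):
            w = v + s
            if w <= t and w not in parent:
                parent[w] = (v, i)
    if t not in parent:
        return None
    indices = []
    v = t
    while parent[v] is not None:
        v, i = parent[v]
        indices.append(i)
    return sorted(indices)

S = [30500, 42000, 71040, 15025, 80]
idx = solve_subset_sum(S, 30500 + 15025 + 80)
print(idx, sum(S[i] for i in idx))  # [0, 3, 4] 45605
```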
Re: [R] Modula Generators
Sam K upperhalfplane at yahoo.co.uk writes: Hi all, Is there a function in R for calculating modular generators? For example for primes above 100, e.g. 157, I want to know which number generates the group under multiplication mod 157, i.e. I want to find an element whose order is 156. The problem I run into is that modular arithmetic becomes inaccurate when dealing with large numbers. In other words, you are looking for a 'primitive root' in the field F_p of prime p. By the way, there are phi(p-1) of them, where phi is Euler's function. There is no known efficient algorithm for this, unless the prime factorization of p-1 is given. And this algorithm is probabilistic, i.e., in some cases it may not terminate. In case your primes do not get too big, say between 100 and 1000, an exhaustive search may be possible, somewhat simplified by the Euler criterion. Or you locate and read in some tables listing primitive roots. You can find more information in Wikipedia or MathWorld. For exact modular arithmetic you may turn to the 'gmp' R package. Regards Hans Werner Thanks for any help given. Sam __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
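The exhaustive search with a known factorization of p-1 can be sketched as follows, in Python for the sake of a runnable example: g is a primitive root mod p exactly when g^((p-1)/q) != 1 (mod p) for every prime factor q of p-1 (function names are mine; this is fine for primes of modest size like 157, not for cryptographic ones):

```python
def prime_factors(n):
    # trial division; adequate for small n
    fs, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            fs.add(d)
            n //= d
        d += 1
    if n > 1:
        fs.add(n)
    return fs

def primitive_root(p):
    # g generates the multiplicative group mod p iff
    # g^((p-1)/q) != 1 (mod p) for every prime q dividing p-1
    qs = prime_factors(p - 1)
    for g in range(2, p):
        if all(pow(g, (p - 1) // q, p) != 1 for q in qs):
            return g

print(primitive_root(157))  # smallest generator of the group of order 156
```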
Re: [R] Solve linear program without objective function
Andreas Wittmann andreas_wittmann at gmx.de writes: Dear R-users, I try to solve the following linear program in R:

0 * x_1 + 2/3 * x_2 + 1/3 * x_3 + 1/3 * x_4 = 0.3
x_1 + x_2 + x_3 + x_4 = 1
x_1, x_2, x_3, x_4 > 0,  x_1, x_2, x_3, x_4 < 1

as you can see I have no objective function here; besides that I use the following code.

library(lpSolve)
f.obj <- c(1,1,1,1)
f.con <- matrix(c(0,2/3,1/3,1/3,
                  1,1,1,1,
                  1,0,0,0,
                  0,1,0,0,
                  0,0,1,0,
                  0,0,0,1), nrow=6, byrow=TRUE)
f.dir <- c("=", "=", ">=", ">=", ">=", ">=")
f.rhs <- c(0.3, 1, 0, 0, 0, 0)
lp("max", f.obj, f.con, f.dir, f.rhs)$solution

the problem is, the condition x_1, x_2, x_3, x_4 > 0 is not fulfilled. With strict inequalities x_i > 0 your problem will not have a solution. That is why in linear programming strict inequalities are replaced with non-strict inequalities, or as the lp_solve manual says: "The inequalities can be <=, >= or =. Because all numbers are real values, <= is the same as < and >= is the same as >." Try x_i >= eps > 0 with some eps appropriate for your problem. Hans Werner Any advice would be very helpful. best regards Andreas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Modular inverses
SJ Robson-Davis sr6827 at bristol.ac.uk writes: I want to find the inverse of an integer k mod p (prime). Is there a function that can do this for me? I know I could simply write (k^(p-2)) %% p, but I need to do this for large primes (above 100) and this gives the warning message "probable complete loss of accuracy in modulus", so this method does not work. Any ideas? Computing the inverse mod p by applying brute force may not be a very elegant approach. With a little knowledge from number theory it is clear that the (so-called) extended Euclidean algorithm will provide the modular inverse efficiently, even for very big numbers. In R the extended Euclidean algorithm is available in the gmp package, see the 'gcdex' function. The modular inverse can also directly be calculated with the 'inv' function, e.g. the inverse of 101 modulo 11 is:

library(gmp)
as.numeric(inv(as.bigz(101), as.bigz(11)))
# [1] 6

It's not even necessary that p is prime, only that k and p are relatively prime. The integers can be bigger than those handled by double-precision arithmetic. Regards Hans Werner Thanks, Samuel -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
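The extended Euclidean algorithm behind 'gcdex' and 'inv' is short enough to sketch directly; here in Python, whose integers are arbitrary-precision just like gmp's bigz (the function names are mine):

```python
def egcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

def mod_inverse(k, p):
    """Inverse of k modulo p; requires gcd(k, p) == 1 (p need not be prime)."""
    g, x, _ = egcd(k % p, p)
    if g != 1:
        raise ValueError("k and p are not coprime")
    return x % p

print(mod_inverse(101, 11))  # 6, since 101 = 2 (mod 11) and 2 * 6 = 12 = 1 (mod 11)
```

As noted above, the modulus need not be prime, only coprime to k, and the same code handles numbers far beyond double precision.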
Re: [R] optim or nlminb for minimization, which to believe?
Your function named 'gradient' is not the correct gradient. Take as an example the following point x0, very near to the true minimum:

x0 <- c(-0.2517964, 0.4898680, -0.2517962, 0.4898681, 0.7500995)

then you get

gradient(x0)
[1] -0.0372110470  0.0001816991 -0.0372102284  0.0001820976  0.0144292657

but the numerical gradient is different:

library(numDeriv)
grad(fn, x0)
[1] -6.151645e-07 -5.507219e-07  1.969143e-07 -1.563892e-07 -4.955502e-08

that is, the derivative is close to 0 in every direction -- as it should be at an optimum. No wonder optim et al. get confused when applying this 'gradient'. Regards Hans Werner Doran, Harold wrote: I have constructed the function mml2 (below) based on the likelihood function described in the minimal latex I have pasted below for anyone who wants to look at it. This function finds parameter estimates for a basic Rasch (IRT) model. Using the function without the gradient, using either nlminb or optim returns the correct parameter estimates and, in the case of optim, the correct standard errors. By correct, I mean they match another software program as well as the rasch() function in the ltm package. Now, when I pass the gradient to optim, I get a message of successful convergence, but the parameter estimates are not correct, but they are *very* close to being correct. But, when I use nlminb with the gradient, I get a message of false convergence and again, the estimates are off, but again very close to being correct.
This is best illustrated via the examples:

### Sample data set
set.seed(1234)
tmp <- data.frame(item1 = sample(c(0,1), 20, replace=TRUE),
                  item2 = sample(c(0,1), 20, replace=TRUE),
                  item3 = sample(c(0,1), 20, replace=TRUE),
                  item4 = sample(c(0,1), 20, replace=TRUE),
                  item5 = sample(c(0,1), 20, replace=TRUE))

## Use function mml2 (below) with optim with use of gradient
mml2(tmp,Q=10) $par [1] -0.2438733 0.4889333 -0.2438733 0.4889333 0.7464162 $value [1] 63.86376 $counts function gradient 456 $convergence [1] 0 $message NULL $hessian [,1] [,2] [,3] [,4] [,5] [1,] 4.095479 0.00 0.00 0.00 0.00 [2,] 0.00 3.986293 0.00 0.00 0.00 [3,] 0.00 0.00 4.095479 0.00 0.00 [4,] 0.00 0.00 0.00 3.986293 0.00 [5,] 0.00 0.00 0.00 0.00 3.800898

## Use same function but use nlminb with use of gradient
mml2(tmp,Q=10) $par [1] -0.2456398 0.4948889 -0.2456398 0.4948889 0.7516308 $objective [1] 63.86364 $convergence [1] 1 $message [1] false convergence (8) $iterations [1] 4 $evaluations function gradient 414

### use nlminb but turn off use of gradient
mml2(tmp,Q=10) $par [1] -0.2517961 0.4898682 -0.2517961 0.4898682 0.7500994 $objective [1] 63.8635 $convergence [1] 0 $message [1] relative convergence (4) $iterations [1] 8 $evaluations function gradient 11 64

### Use optim and turn off gradient
mml2(tmp,Q=10) $par [1] -0.2517990 0.4898676 -0.2517990 0.4898676 0.7500906 $value [1] 63.8635 $counts function gradient 227 $convergence [1] 0 $message NULL $hessian [,1] [,2] [,3] [,4] [,5] [1,] 3.6311153 -0.3992959 -0.4224747 -0.3992959 -0.3764526 [2,] -0.3992959 3.5338195 -0.3992959 -0.3960956 -0.3798141 [3,] -0.4224747 -0.3992959 3.6311153 -0.3992959 -0.3764526 [4,] -0.3992959 -0.3960956 -0.3992959 3.5338195 -0.3798141 [5,] -0.3764526 -0.3798141 -0.3764526 -0.3798141 3.3784816

The parameter estimates with and without the use of the gradient are so close that I am inclined to believe that the gradient is correct and maybe the problem is elsewhere.
It seems odd that optim seems to converge but nlminb does not with the use of the gradient. But, with the use of the gradient in either case, the parameter estimates differ from what I think are the correct values. So, at this point I am unclear if the problem is somewhere in the way the functions are used or how I am passing the gradient or if the problem lies in the way I have constructed the gradient itself. Below is the function and also some latex for those interested in looking at the likelihood function. Thanks for any reactions Harold sessionInfo() R version 2.10.0 (2009-10-26) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] ltm_0.9-2 polycor_0.7-7 sfsmisc_1.0-9 mvtnorm_0.9-8 msm_0.9.4 MASS_7.3-3 MiscPsycho_1.5 [8] statmod_1.4.1 loaded via a namespace (and not attached): [1] splines_2.10.0 survival_2.35-7 tools_2.10.0
Re: [R] Implementation of the Shuffled Complex Evolution (SCE-UA) Algorithm
Simon Seibert simon.seibert at mytum.de writes: Good evening list, I'm looking for an R implementation of the Shuffled Complex Evolution (SCE-UA) algorithm after Duan et al. (1993). Does anybody know if there is an extension/package existing that contains it? Thanks very much for your help! Cheers, Simon I am looking into stochastic global optimization routines, such as variants of genetic algorithms (GA), particle swarm optimization (PSO), and shuffled complex (SCE) or differential evolution (DE). At the moment I am testing a free Matlab implementation of SCE-UA. If that turns out to be interesting and effective, I will translate it into an R function, maybe within the next few weeks -- unless, of course, someone else does an implementation of a similar procedure first. Right now I am not totally convinced. Maybe you can try your problem with other approaches available in R (see e.g. the Optimization task view). Regards Hans Werner Duan QY, Gupta KV, Sorooshian S (1993) Shuffled Complex Evolution Approach for Effective and Efficient Global Minimization. Journal of Optimization Theory and Applications, Vol 76, 3, 501-521 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help using R's linprog for LP
Medha Atre medha.atre at gmail.com writes: Hi, I found the reason. By default it puts a condition x >= 0. Is there a way to get rid of this condition? The constraints x >= 0 are used in most linear programming implementations; some bounds from below are needed. The trick to circumvent the restriction is as follows: Assume you know x >= d, where some or all of the d_i can be negative. Replace x with y = x - d >= 0 and minimize c'y subject to Ay <= b - Ad. Your solution is then x = y + d, that is

solveLP(cvec, bvec - Amat %*% dvec, Amat)$solution + dvec

Of course this works with all linear programming packages in R. "[Good theorists] always make an even number of sign errors, and the bad ones always make an odd number of sign errors." -- Anthony Zee in: QFT in a Nutshell. Medha On Fri, Oct 23, 2009 at 2:34 PM, Medha Atre medha.atre at gmail.com wrote: Hi, I am using R in one of my courses. I am trying to use R's linprog package to formulate a 2-class classification problem as a linear programming problem. For my formulation, I need to set cvec to all 0s. I know the points are linearly separable, so an optimal solution x does exist which satisfies all the constraints. But given the constraints and setting cvec to all 0s, it simply gives me an x of all 0s. How can I fix this? Medha __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
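The substitution can be checked mechanically without any LP solver: with y = x - d one has A y = A x - A d, so A x <= b holds exactly when A y <= b - A d, and x >= d exactly when y >= 0, while c'x and c'y differ only by the constant c'd. A small numeric sanity check in Python on toy data (all names and numbers here are made up for illustration):

```python
# Toy instance: constraints A x <= b with lower bounds x >= d (d negative)
A = [[1.0, 2.0],
     [3.0, 1.0]]
b = [4.0, 5.0]
d = [-1.0, -2.0]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Shifted right-hand side: b - A d
b_shift = [bi - adi for bi, adi in zip(b, matvec(A, d))]

# A point feasible for the original problem maps to a feasible point
# of the shifted problem under y = x - d, and vice versa.
x = [0.5, -1.0]                        # satisfies A x <= b and x >= d
y = [xi - di for xi, di in zip(x, d)]  # its image under the shift, y >= 0
lhs_x = matvec(A, x)
lhs_y = matvec(A, y)
print(all(l <= r for l, r in zip(lhs_x, b)),
      all(l <= r for l, r in zip(lhs_y, b_shift)))  # True True
```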
Re: [R] Ubuntu, Revolutions, R
I updated to Ubuntu 9.10 Beta yesterday, and yes, I do see the same message, and I am a bit irritated. I don't want to read these 'marketing' lines every time I start up R. I simply deleted the lines from /etc/R/Rprofile.site for now, but I am still wondering who put that in. Is there any deeper reason I didn't get? Hans Werner gunksta wrote: For those who don't follow Ubuntu development carefully, the first Beta for the next Ubuntu was recently released, so I took my home system and upgraded to help out with filing bugs, etc. Just to be clear, I am not looking for help with the upgrade process. I've had R, and a few miscellaneous CRAN packages, installed on this computer for years. Today, when I loaded an R session I had developed before the upgrade, I saw something new in my R welcome message. R version 2.9.2 (2009-08-24) Copyright (C) 2009 The R Foundation for Statistical Computing ISBN 3-900051-07-0 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. This is REvolution R version 3.0.0: the optimized distribution of R from REvolution Computing. REvolution R enhancements Copyright (C) REvolution Computing, Inc. Checking for REvolution MKL: - REvolution R enhancements not installed. For improved performance and other extensions: apt-get install revolution-r The last part, about this being the enhanced version of R, was . . . unexpected. I have heard of this company before and now I've spent some time on their website. Looking at my installation, Ubuntu did not install any of the REvolution Computing components, although R now basically throws a warning every time I start it.
My question(s) for the community is this (pick any question(s) you like to answer: Should I install the REvolution Computing packages? Do these packages really make R faster? Are these packages stable? What are your experiences with REvolution Computing software? I am interested in hearing from members of the community, REvolution Computing employees/supporters (although please ID yourself as such) and most anyone else. I can see what they say on their website, but I'm interested in getting other opinions too. Thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Ubuntu%2C-Revolutions%2C-R-tp25744817p25749786.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Semi continous variable- define bounds using lpsolve
I played around a bit with the original 'lp_solve' program --- i.e., not the R package but the program to be downloaded from Sourceforge ---, at least version 5.5.0.15 through its IDE. I was not even able to reproduce the example on semi-continuous variables in the reference documentation at http://lpsolve.sourceforge.net/5.5/. It appears as if a bug has been introduced in newer versions of 'lp_solve' that prevents these semi-continuous variables from being handled correctly. So maybe it's not the fault of the R package 'lpSolve', and it can only be amended by the 'lp_solve' team itself. Hans Werner pragathichi wrote: How to define bounds for a semi-continuous variable in lp_solve: Min 5x1 + 9x2 + 7.15x3 + 0.1x4 subject to x1+x2+x3+x4 = 6.7, x1+x4 >= 6.5. And x3 can be 0 or greater than 3.6, hence x3 is a semi-continuous variable. How to define bounds as well as the semi-continuous property? Using set.semicont and set.bound simultaneously doesn't seem to work. Thanks in advance for the help -- View this message in context: http://www.nabble.com/Semi-continous-variable--define-bounds-using-lpsolve-tp25530668p25530901.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Semi continous variable- define bounds using lpsolve
But of course, it is always possible to emulate a semi-continuous variable by introducing a binary variable and using some big-M trick. That is, with a new binary variable b we add the following two conditions:

x3 - 3.6 * b >= 0   and   x3 - 10 * b <= 0   # Big-M trick, here M = 10

(If b = 0, then x3 = 0, and if b = 1, then x3 >= 3.6 !) As I do not trust 'lpSolve' too much anymore I used package 'Rglpk' with the following code:

#-- snip ---
library(Rglpk)
obj <- c(5, 9, 7.15, 0.1, 0)
mat <- matrix(c(1,1,1,1,   0,
                1,0,0,1,   0,
                0,0,1,0,-3.6,
                0,0,1,0, -10,
                0,0,0,0,   1), byrow=TRUE, ncol=5)
dir <- c("==", ">=", ">=", "<=", "<=")
rhs <- c(9, 6.55, 0, 0, 1)
types <- c("C", "C", "C", "C", "I")
max <- FALSE
Rglpk_solve_LP(obj, mat, dir, rhs, types, max = max)
# $optimum
# [1] 22.705
#
# $solution
# [1] 0.00 2.45 0.00 6.55 0.00
#
# $status
# [1] 0
#-- snap ---

Semi-continuous variables are sometimes preferred as with a good implementation the solution is reached much faster (that's why I suggested them), but they can always be modelled with binary variables. Hans Werner pragathichi wrote: How to define bounds for a semi-continuous variable in lp_solve: Min 5x1 + 9x2 + 7.15x3 + 0.1x4 subject to x1+x2+x3+x4 = 6.7, x1+x4 >= 6.5. And x3 can be 0 or greater than 3.6, hence x3 is a semi-continuous variable. How to define bounds as well as the semi-continuous property? Using set.semicont and set.bound simultaneously doesn't seem to work. Thanks in advance for the help -- View this message in context: http://www.nabble.com/Semi-continous-variable--define-bounds-using-lpsolve-tp25530668p25530929.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
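That the two linear constraints really encode the semi-continuous condition can be verified by brute force over a grid: for b = 0 the only feasible value is x3 = 0, and for b = 1 exactly the interval [3.6, 10] is feasible. A quick Python check (the helper name is mine; M = 10 as above):

```python
def feasible(x3, b, M=10.0, lower=3.6):
    # The big-M encoding: x3 - lower*b >= 0  and  x3 - M*b <= 0
    return x3 - lower * b >= 0 and x3 - M * b <= 0

# Evaluate on a grid of candidate x3 values for both binary settings.
grid = [i / 10 for i in range(0, 101)]
ok0 = [x for x in grid if feasible(x, 0)]
ok1 = [x for x in grid if feasible(x, 1)]
print(ok0)                  # [0.0]
print(min(ok1), max(ok1))   # 3.6 10.0
```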
Re: [R] Quadratic Constraints
The package lpSolve (that I have recommended before) supports so-called 'semi-continuous variables', that is: "Semi-continuous variables are variables that must take a value between their minimum and maximum or zero. So these variables are treated the same as regular variables, except that a value of zero is also accepted, even if a minimum bigger than zero is set on the variable." This is exactly how you want to handle your variable x3. For an example, see the documentation at lpsolve.sourceforge.net/5.5/. By the way, the minimum of your problem is 44.64 (manual calculation). vikrant S wrote: HI All, I am unable to solve an optimization problem; please help me out with it. The optimization problem is as follows: my objective function is linear and one of the constraints is quadratic.

Min z = 5 * X1 + 9 * X2 + 7.15 * X3 + 2 * X4
subject to
X1 + X2 + X3 + X4 = 9
X1 + X4 >= 6.55
X3 * (X3 - 3.5) >= 0
X1, X2, X3, X4 >= 0

Now the problem is how to solve this kind of problem. Which package should be used to handle such problems? Please explain with an example. Another problem is that I have two cases to solve in this problem. Case 1: X3 = 0. Case 2: if X3 > 0 then X3 >= 3.6. I want to handle both cases in one problem, which is why the quadratic constraint is written. The thing is that I want to evaluate my objective function for both cases, and whichever is optimum, that solution I need. Here I don't want to use an if-else condition and repeat the program. Is there any other better way in which I could solve this problem? If not, please try to provide me the solution for my original problem having a quadratic constraint. -- View this message in context: http://www.nabble.com/Quadratic-Constraints-tp25528480p25529374.html Sent from the R help mailing list archive at Nabble.com.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] computing the radius of an arc
Nair, Murlidharan T mnair at iusb.edu writes: Alex Brenning, the developer of the RSAGA package, told me, and I quote: "The RSAGA package (which uses functions from the free geographical information system [GIS] SAGA GIS) has a curvature function that is designed to calculate the curvature of surfaces, in particular raster (i.e. gridded) digital elevation models. I am not aware of a function in SAGA GIS or other GIS that would calculate curvatures along a line, especially not in 3D." I have difficulties seeing how you would go about doing this. A surface locally is characterized by two main curvatures --- this is Gauss' theorem for 2-dim geometry. The sphere has the same curvature, 1/r, in both directions. So maybe you can approximate the surface by an ellipsoid, but a strange one, as the main directions may not be orthogonal. I am not sure this is what you wanted to achieve. As a side remark: the circle approximating a curve in three dimensions is not uniquely determined. Regards Hans Werner I shall try to develop it and if I am successful I shall make it available. Cheers../Murli __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reading jpeg images?
I found the 'biOps' package for image and data analysis quite helpful. (I did some astronomical investigations with it --- counting galaxies in a Hubble picture --- and I do recommend this package.) Under Windows you have to unpack the 'libjpeg' and 'libtiff' libraries beforehand somewhere in your path. See finzi.psych.upenn.edu/Rhelp08/2009-April/194630.html for a link to download all necessary libraries as a zip-file. If this link to RapidShare does not work anymore, I think I still have that file somewhere and could send it to you by e-mail. Regards Hans Werner Robert Biddle wrote: Can someone advise me what the most sensible way is to read jpeg images? For our work with image analysis and eye-tracking we have been using the rimage package, on both Macs and Windows PCs. But while setting up a new Windows machine yesterday, I see that rimage is regarded as orphaned, and no Windows binary is available. I eventually found an old zip file for the package, so I am not stuck, but I wonder what the right way is to go forward. I did find the readimages package, but it also seemed problematic to install on both Windows and Mac, requiring extra software that it was itself unclear how to install. Is there some simpler solution I should be looking at? Are jpeg files so problematic I should be converting them to some other format and using a different package to read that? Thanks Robert Biddle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/reading-jpeg-images--tp24698332p24700340.html Sent from the R help mailing list archive at Nabble.com.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] computing the radius of an arc
Nair, Murlidharan T mnair at iusb.edu writes: Hi!! I am interested in computing the radius of an arc that best approximates a curve. Is there an R function that I can use to draw an arc? Nothing useful came up when I searched help.search. Does anyone have any suggestion to do this? Thanks ../Murli If you are looking for some formulas to calculate the center and radius of such a circle/arc, search for 'curvature' in Wikipedia or MathWorld. The formulas are a bit involved, and you will need to be able to compute first and second derivatives of the curve. In case the curve is given as a set of discrete points, you could apply some polynomial, spline, etc., interpolation. If you _then_ apply 'draw.arc', watch out to set the aspect ratio to 1, otherwise your circle may appear to be incorrect.

library(plotrix)
curve(sin(x), 0, pi, col="blue", asp=1)
draw.arc(pi/2, 0, 1, deg1=45, deg2=135, col="red")

The physical meaning of the curvature (and thus the approximation) is that an observer moving on the curve will feel the same centrifugal forces as if moving on the circle, at that moment. Regards Hans Werner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
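When the curve is only available as discrete points, a common shortcut for the radius of the osculating circle at a point is the circumradius of the point and its two neighbours, R = abc / (4K), with a, b, c the side lengths of the triangle and K its area. A sketch in Python (the function name is mine; for three points sampled from the unit circle the estimate is exactly 1 up to rounding):

```python
import math

def circumradius(p1, p2, p3):
    """Radius of the circle through three points (approximates 1/curvature)."""
    a = math.dist(p2, p3)
    b = math.dist(p1, p3)
    c = math.dist(p1, p2)
    # Twice the unsigned triangle area via the cross product of two edges;
    # R = a*b*c / (4*K) = a*b*c / (2 * area2)
    area2 = abs((p2[0] - p1[0]) * (p3[1] - p1[1]) -
                (p3[0] - p1[0]) * (p2[1] - p1[1]))
    return a * b * c / (2 * area2)

# Three points on the unit circle: the estimate should be close to 1
pts = [(math.cos(t), math.sin(t)) for t in (0.0, 0.3, 0.6)]
print(circumradius(*pts))
```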
[R] R interface to PSwarm
Dear list: Being a bit unsatisfied with global optimization approaches in R such as SANN (in 'optim') or DEoptim, I looked for alternatives on the Web, such as PSwarm http://www.norg.uminho.pt/aivaz/pswarm/ or PIKAIA. There is an R interface to PSwarm (version 1.4) on its home page which I was not able to make run under Windows, getting the error message Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'character' when running the example. Therefore my questions: - Has someone successfully tried out PSwarm with R ? - Are there any experiences with PSwarm in general, that is could you find solutions with PSwarm that you did not find with another GA or with Rdonlp2, for example ? Regards Hans Werner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Automatic differentiation in R
Having given a lecture on Numerical Derivatives just a short time ago, I would like to mention the following: Many functions, especially in engineering, are not available as formulas built simply from arithmetical operators and elementary functions. They are provided as intricate procedures, as results from simulation runs, or as return values from optimization routines. In all these cases, symbolic differentiation is not a possibility at all. Still, for further processing there will be a need to differentiate these functions. As I understand the literature, Automatic Differentiation (AD) is mostly meant for these kinds of applications(*). Another push for AD came from increased interest in the complex-step derivative approximation, where first an analytic continuation -- often nontrivial -- has to be established before the derivative can be computed. It may be different in statistical tasks. But a general automatic differentiation in R may have to take these kinds of applications into account. Regards Hans Werner (*) See the Community Portal for AD, www.autodiff.org, and the definition of automatic differentiation there as an example. nashjc wrote: [...] The clear issue in my mind is that users who need gradients/Jacobians for R want to be able to send a function X to some process that will return another function gradX or JacX that computes analytic derivatives. This has to be easy, which implies a very simple command or GUI interface. I am pretty certain the users have almost no interest in the mechanism, as long as it works. Currently, most use numerical derivatives, not realizing the very large time penalty and quite large loss in accuracy that can compromise some optimization and differential equation codes. I'll try to prepare a few examples to illustrate this and post them somewhere in the next few weeks. Time, as always, ... However, the topic does appear to be on the table.
JN
[R] r-project.org address blacklisted by anti-spam software
Dear List: An e-mail mentioning the r-project.org address and sent to a friend at a German university was considered spam by the local spam filter. Its reasoning: the URL r-project.org is blacklisted at uribl.swinog.ch and at antispam.imp.ch. I checked the list http://antispam.imp.ch/swinog-uri-rbl.txt [caution: long list] and indeed, there it was. Can anybody explain how or why the R project made its way onto the list? Is it reasonable to file a protest? --Hans Werner
Re: [R] R- NLP on R but ....
Rahul Varshney itsrahulvarshney at gmail.com writes: I'll appreciate the help on the following problem: I solved many nonlinear programming problems with nonlinear constraints. Rdonlp is working well, but I am unable to get INTEGER values with nonlinear constraints in Rdonlp. Is it possible to get integer values of parameters in any package of R with nonlinear constraints? If I understand you correctly, you have mixed types of variables, continuous and integer (or binary?) ones, a non-linear (how much?) objective function, and non-linear constraints (what kind?). Then your problem appears to be of MINLP type, and the short answer to your question is: No, there is no package providing such a solver. In the Optimization task view, under 'MIP' for mixed integer programs, you will find lpSolve, Rglpk, or Rsymphony, which are all linear solvers. (R)CPLEX can solve more general problems, but is commercial (and expensive). There is small hope that in the future free MIP solvers will be integrated as R packages, for instance from the COIN-OR initiative. --Hans Werner Rahul
Re: [R] Equivalent to Matlab's Ans
There is a discussion of this topic under the heading "A shorter version of .Last.value" on July 7, 2008; see for example http://www.nabble.com/A-shorter-version-of-%22.Last.value%22--to18322831.html#a18322831 --Hans Werner Stephane-18 wrote: Hi everyone, I was just wondering if there is an equivalent in R to the shortcut Ans in MATLAB whereby I can use the previous result for the current command? This could save a lot of time when hacking in R itself, not from an editor. Thanks, Stephane
Re: [R] Optimization and Linear Programming in R
Chris.Wilcox at csiro.au writes: Dear List, [...] We are looking for a solver that can deal with this nonlinear integer programming problem. We looked at a number of packages on the CRAN Task View: Optimization and Mathematical Programming; however, we have not been able to locate one that will suit our purposes. The essential features are: we need to be able to write a nonlinear function for the objective (hopefully a self-contained one, as we need to include some data in it), we need to be able to use a binary decision (or parameter) vector, and we need to be able to use a constraint. Any suggestions as to packages or other software that will work for our problem would be much appreciated. What you describe appears to be a 'Mixed Integer NonLinear Programming' (MINLP) problem. R may not be the right place to look for such kind of solvers. You can have a look into the NEOS Optimization Software Guide or into the Projects page of the COmputational INfrastructure for Operations Research (COIN-OR). The 'Bonmin' COIN-OR project could perhaps provide an appropriate solver for this problem. The RINO project on 'R-forge' plans to provide Rbonmin and Rlago packages, but this may take some time. Perhaps by reconsidering your problem approach you can (at least partly) linearize your target function, or see if it can be made convex, etc.; it did not sound too complex. --Hans Werner P.S.: As far as I know, 'solnp' does not solve mixed integer problems. Thanks, Chris Wilcox and Greg Thonier
Re: [R] newtons method
Dear Ravi: Thanks for pointing out the homotopy methods. Coming from Mathematics, I was always considering SINGULAR for such a task, which also provides results when the solution set is not isolated points but an algebraic variety. For single points, homotopy methods appear to be an effective approach. I am wondering whether it would be worthwhile to integrate Jan Verschelde's free PHCpack algorithm, see http://www.math.uic.edu/~jan/, as a package into R -- if there would be enough interest. Best regards, Hans Werner Borchers Ravi Varadhan wrote: Uwe, John's comment about the difficulties with finding polynomial roots is even more forceful for a system of polynomials. There are likely numerous roots, some possibly real, and some possibly multiple. Homotopy methods are currently the state-of-the-art for finding all the roots, but beware that they are very time-consuming. For locating the real roots, I have found that a relatively simple approach like multiple random starts works fairly well with a root-finder such as dfsane() in the BB package. However, I don't know of any tests to check whether I have found all the roots. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
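The multiple-random-starts approach with dfsane() that Ravi describes can be sketched as follows (toy polynomial system and number of starts are my own illustrative choices):

```r
library(BB)
# two polynomial equations: x^2 + y^2 = 1  and  y = x^2
f <- function(p) c(p[1]^2 + p[2]^2 - 1, p[1]^2 - p[2])

set.seed(1)
roots <- list()
for (i in 1:20) {                         # multiple random starting points
  p0  <- runif(2, -2, 2)
  sol <- try(dfsane(p0, f, quiet = TRUE), silent = TRUE)
  if (!inherits(sol, "try-error") && sol$convergence == 0)
    roots[[length(roots) + 1]] <- round(sol$par, 6)
}
unique(roots)   # the real roots are x = +-0.786151, y = 0.618034
```

As Ravi notes, there is no guarantee that all roots are found this way; the random starts merely make it likely for well-separated real roots.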
Re: [R] MDL - Fayyad Irani Discretization
I asked for this some years ago and did not get a positive hint. To my knowledge this has not changed since then. Brian Ripley proposed using the 'rpart' algorithm for discretization. I think I applied the RELIEF-F method at that time and also wrote a simplified Fayyad-Irani function of my own. For RELIEF and other discretizing functions see the 'dprep' package. On CRAN there is only a version for Linux available; for Windows see the 'dprep' home page http://math.uprm.edu/~edgar/dprep.html. (But: I have not tried it out with R 2.9.0!) Hans Werner Piotr Romański wrote: Hey, I'm looking for a function which provides a supervised Fayyad-Irani discretization (MDL). I've already found RWeka which has such a function. But is there anything else? I'd prefer to use something lighter in my own package. Regards, Peter
Re: [R] Constrined dependent optimization.
Kevin, sorry, I don't understand this sentence: "It has been suggested that I do a cluster analysis. Wouldn't this get me part way there?" Let your products be numbered 1, ..., n, and let p(i) be the location where product i is stored (assigned to) in the warehouse. Then p is a permutation of the set 1, ..., n and the task is an assignment problem. A sale v is simply a subset of 1, ..., n and its contribution to the target function could, for example, be f_v(p) = max( p(v) ) - min( p(v) ) for each sale v, or whatever your evaluation criterion is. The total function to minimize is then Sum( f_v ) over all historical sales v. Because it is based on a distance, the whole task is called a quadratic assignment problem (and is certainly not linear). In the literature you will find algorithms and tools mentioned for solving such a task, and n = 20,000 is not that much nowadays. But the difference between n = 100 and n = 20,000 is huge for tools not specialized to the task. For the TSP, imagine two groups of cities very far apart. Then under some mild restrictions it would not be reasonable to go back and forth between these groups several times, and therefore the problem could be split into two independent subproblems. So the question of clustering belongs to the data preparation part. Whether such really different classes of sales exist in your data, only you can decide. In my experience, most of the time such preprocessing is not worth doing and will not speed up the optimization procedure considerably. Regards, Hans Werner rkevinburton wrote: It has been suggested that I do a cluster analysis. Wouldn't this get me part way there? Thank you for your response. I am looking up the book now. Kevin Hans W.
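The objective described above is easy to write down in R. A minimal sketch, where `sales` is a hypothetical list of product-number vectors and `p` a candidate placement (both names and the toy data are my own, for illustration):

```r
# Total spread of all historical sales under placement p,
# where p[i] is the warehouse location of product i:
#   f_v(p) = max(p[v]) - min(p[v])  summed over all sales v
total_spread <- function(p, sales) {
  sum(sapply(sales, function(v) max(p[v]) - min(p[v])))
}

set.seed(1)
n     <- 20                                        # small toy instance
sales <- replicate(50, sample(n, 3), simplify = FALSE)
p0    <- sample(n)                                 # a random placement
total_spread(p0, sales)
# a local search would now swap pairs of locations and keep improvements
```

For realistic n this objective would be fed to a specialized QAP heuristic, as discussed below; the sketch only shows what is being minimized.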
Borchers hwborch...@googlemail.com wrote: [...]
Re: [R] Constrined dependent optimization.
Just in case you are still interested in theoretical aspects: In combinatorial optimization, the problem you describe is known as the Quadratic (Sum) Assignment Problem (QAP or QSAP) and is well known to arise in facility and warehouse layouts. The task itself is considered hard, comparable to the Traveling Salesman Problem (TSP). I would clearly advise turning to some specialized commercial software for solving it; otherwise you will very likely get stuck in suboptimal solutions miles away from the true optimum. And for a real commercial situation this may be disastrous. Regards, Hans Werner P.S.: See for instance Cela: The Quadratic Assignment Problem. Theory and Algorithms, Springer, 2000. (Partly available at books.google.com) rkevinburton wrote: I have decided to use this SANN approach to my problem, but to keep the run time reasonable, instead of 20,000 variables I will randomly sample this space to get the number of variables under 100. But I want to do this a number of times. Is there someone who could help me set up WINBUGS to repeat this optimization N times, each time randomly picking 100 of the possible 20,000? Comments? Kevin
Re: [R] Constrined dependent optimization.
Imagine you want to minimize the following linear function f <- function(x) sum( c(1:50, 50:1) * x / (50*51) ) on the set of all permutations of the numbers 1, ..., 100. I wonder how you would do that with lpSolve? I would simply order the coefficients and then sort the numbers 1, ..., 100 accordingly. I am also wondering how optim with SANN could be applied here. As this is a problem in the area of discrete optimization and constraint programming, I propose to use an appropriate program here such as the free software B-Prolog. I would be interested to learn what others propose. Of course, if we don't know anything about the function f, then it amounts to an exhaustive search on the 100! permutations -- probably not a feasible job. Regards, Hans Werner Paul Smith wrote: On Sun, Mar 29, 2009 at 9:45 PM, rkevinbur...@charter.net wrote: I have an optimization question that I was hoping to get some suggestions on how best to go about solving it. I would think there is probably a package that addresses this problem. This is an ordering optimization problem. Best to describe it with a simple example. Say I have 100 bins, each with a ball in it, numbered from 1 to 100. Each bin can only hold one ball. The optimization is that I have a function 'f' that takes this array of bins and returns a number. The number returned from f(1,2,3,4) would differ from that of f(2,1,3,4). The optimization is finding the optimum order of these balls so as to produce a minimum value from 'f'. I cannot use the regular 'optim' algorithms because a) the values are discrete, b) the values are dependent, i.e. when the variable representing the bin location is changed (in this example a new ball is put there) the existing ball will need to be moved to another bin (probably swapping positions), and c) each variable is constrained; in the example above the only allowable values are integers from 1-100. So the problem becomes finding the optimum order of the balls. Any suggestions?
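The "order the coefficients and sort accordingly" remark can be made concrete. A minimal sketch (my own illustration; by the rearrangement inequality, a linear function over permutations is minimized by pairing the largest coefficients with the smallest values):

```r
w <- c(1:50, 50:1) / (50*51)     # coefficients of the linear objective
f <- function(x) sum(w * x)

# minimize f over all permutations x of 1..100:
# assign the smallest values to the positions with the largest coefficients
x <- integer(100)
x[order(w, decreasing = TRUE)] <- 1:100

f(x)              # minimal value over all permutations
f(sample(100))    # a random permutation does worse (or at best ties)
```

This is why no general-purpose IP or SANN machinery is needed for the linear case; the structure of the problem hands you the optimum directly.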
If your function f is linear, then you can use lpSolve. Paul
Re: [R] asking advice for Integer Programming packages
Looking into the 'Optimization' task view you will also find the Rsymphony package. A short overview of these packages and a benchmark based on a test suite can be found at http://www.rmetrics.org/Meielisalp2008/Presentations/Theussl2.pdf. Different problems may have quite different running times with the underlying procedures. So the question is not so much about ease of use, but how well they fit the problem. Regards, Hans Werner dl7631 wrote: Dear everyone, I don't know much about Integer Programming but am afraid I am facing a problem that can only be solved via Integer Programming. I was wondering if those of you who have experience with it could recommend an R package. I found the following R packages: Rglpk glpk lpSolve lpSolveAPI Are there any others? Are some of them easier to use than others for a beginner? Any advice would be greatly appreciated! -- Dimitri Liakhovitski MarketTools, Inc. dimitri.liakhovit...@markettools.com
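For a first impression of how these packages are called, here is a minimal integer program solved with lpSolve (toy objective and constraints, chosen only for illustration):

```r
library(lpSolve)
# maximize 3x + 2y  subject to  x + y <= 4,  x <= 3,  x, y >= 0 integer
sol <- lp(direction    = "max",
          objective.in = c(3, 2),
          const.mat    = rbind(c(1, 1),
                               c(1, 0)),
          const.dir    = c("<=", "<="),
          const.rhs    = c(4, 3),
          all.int      = TRUE)     # require an all-integer solution
sol$objval      # optimal objective value
sol$solution    # optimal integer values of x and y
```

Rglpk and Rsymphony accept essentially the same problem data (objective vector, constraint matrix, directions, right-hand sides), just with slightly different argument names, so switching solvers for benchmarking is cheap.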
Re: [R] Rdonlp2 -Query
Leo Guelman leo.guelman at gmail.com writes: Hi, Has anyone used this package? Could you please share your thoughts on it? What exactly do you mean by "share your thoughts on it"? It has its pros and cons, as always. Sure, Rdonlp2 has been used, and it has been requested and discussed several times here on the list. It is mentioned in the 'Optimization' task view, too. An RSiteSearch might help. Thanks! L.
Re: [R] Self-Organizing Map analysis
glaporta glaporta at freeweb.org writes: Dear list, I read the SOM package manual but I don't understand how to perform (for example) 1) the SOM analysis on Iris data 2) with a visualization similar to that of figure 7 in http://www.cis.hut.fi/projects/somtoolbox/package/papers/techrep.pdf In R, I used the following calls to create SOM maps similar to the ones in the literature (I don't find the SOM toolbox figures particularly inspiring): library(class); data(iris); df <- iris[, 1:4]  # the numeric columns; n <- 10; sg <- somgrid(n, n, "rectangular"); df.som <- batchSOM(df, sg, c(4,4,3,3,2,2,1,1)); windows(record=TRUE); for (i in 1:ncol(df)) { xy <- matrix(df.som$codes[,i], n, n); image(xy); contour(xy, add=TRUE); grid(col="gray") } Of course, there are not enough Iris data to generate a reasonable SOM grid. At least n <- 30 is needed for a nice graph, and that means at least 1000 data points. Using a hexagonal grid instead of a rectangular one will result in diagrams similar to those from the SOM toolbox, but then the 'image' call has to be adapted accordingly. Regards, Hans Werner Any suggestion? Thanks in advance, Gianandrea
Re: [R] R - MATLAB apply like function
ARDIA David david.ardia at unifr.ch writes: Dear all, I very often use the R function apply for speedup purposes. I am now also using MATLAB and would like to use the same kind of function. I have already asked MATLAB people, and the answer is: vectorize... but of course, this is not always possible. So, instead of using a FOR loop all the time, I tried using bsxfun. The only other Matlab functions I can come up with are 'arrayfun', 'structfun', and 'cellfun'. For instance, arrayfun(@(x)isequal(x.f1, x.f2), S) applies an anonymous function to a structure array. But come to think of it, loops in Matlab are much faster than in R -- especially since version 2008a -- so the need to avoid them is not as big. Regards, Hans Werner So you R people, who might also use MATLAB, do you have any solution? Are you also nervous about this fact (maybe also because of this R-MATLAB question)? Thanks for your help, Dave
Re: [R] Subset Regression Package
Alex Roy alexroy2008 at gmail.com writes: Dear all, Is there any subset regression (subset selection regression) package in R other than leaps? Lars and Lasso are other 'subset selection' methods; see the corresponding packages 'lars' and 'lasso2' and their description in The Elements of Statistical Learning. Also 'dr' (Methods for dimension reduction for regression) or 'relaimpo' (Relative importance of regressors in linear models) can be considered. Thanks and regards Alex
Re: [R] Outlier Detection for timeseries
Pele drdionc at yahoo.com writes: Hello R users, Can someone tell me if there is a package in R that can do outlier detection that gives outputs similar to what I got from SAS below? Many thanks in advance for any help! I guess you are talking about the OUTLIER procedure in SAS that attempts to detect 'additive outliers' and 'level shifts' in a 'response' series, the latter following de Jong and Penzer's "Diagnosing shocks in time series". I have not come across this method in R, but you might have a look into the 'robfilter' (Robust Time Series Filters) package with functions like 'dw.filter', 'adore.filter', or 'wrm.filter'; see for instance: "dw.filter is suitable for extracting low frequency components (the signal) from a time series which may be contaminated with outliers and can contain level shifts. For this, moving window techniques are applied." If your time series is actually a response, you might prefer to look at the series of residuals instead.

Outlier Details
 Obs   Time ID   Type       Estimate    Approx Chi-Square   Prob > ChiSq
  12   12.00     Additive   2792544.6        186.13             <.0001
  13   13.00     Additive    954302.1         21.23             <.0001
  15   15.00     Shift         63539.3          9.06             0.0026
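Not the SAS procedure, but a crude first step works with base R alone. A minimal sketch using a running median to flag additive outliers (window width, threshold, and the simulated series are arbitrary illustrative choices):

```r
set.seed(42)
y     <- as.numeric(arima.sim(list(ar = 0.5), n = 100))
y[30] <- y[30] + 8               # plant an additive outlier

sig <- runmed(y, k = 7)          # robust running-median signal extraction
res <- y - sig                   # robust residuals
out <- which(abs(res) > 4 * mad(res))   # flag unusually large residuals
out                              # very likely flags observation 30
```

This handles additive outliers only; level shifts need the asymmetric window techniques that robfilter implements, since a symmetric median smooths through a shift.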
Re: [R] general inverse solver?
Gabor Grothendieck ggrothendieck at gmail.com writes: Yacas was completely rewritten in Java (Ryacas interfaces to the C version) since the Ryacas project started, so I would not exactly characterize yacas as dead. The work that is going on in yacas may not have high visibility, but that does not mean there is none. Also, while Maxima is more sophisticated in terms of algorithms, yacas is actually more sophisticated from the viewpoint of its language, which borrows ideas from both imperative and Prolog programming, and its interfaces are more sophisticated (it is one of the few CAS systems that developed an OpenMath interface), and its socket server is used by the Ryacas interface. Yacas can also translate math expressions to TeX and do exact arithmetic. Also, to put this in the correct context, yacas does seem capable of answering the majority of questions posed on r-help that need a CAS in the answer. From a practical viewpoint it does seem to have the facilities that are most often needed. The Ryacas vignette has a survey of some of its algebra capabilities. I do not agree with this statement. Looking back into some of the requests for symbolic computations on this help list, [R]Yacas showed off weaknesses such as:
- extremely slow in high-precision arithmetic
- not solving very high-dimensional polynomials
- does not simplify expressions in several variables
- does not solve systems of equations except linear ones
- does not symbolically solve medium to complex integrals
- has very few special functions, therefore differentiation is unsatisfying
From a statistics point of view this may not be so relevant, but considering the requests it illustrates that Yacas is not providing necessary CAS features. Especially the missing capabilities in symbolic integration can be painful in a statistics environment. Many times when Ryacas was pointed out, it was actually not solving the problem.
The examples in the Ryacas vignette are -- in my opinion -- trivial ones and do not display advanced capabilities. This is not meant to diminish the efforts invested in the Yacas software over the years. I know a bit about how difficult it is to write software for symbolic computations. But in my opinion it also is evidence that a more advanced computer algebra module could be very valuable for extending and improving the R system. Hans W. Borchers That being said, without taking away from yacas, there is work going on to interface R to a second CAS. On Tue, Feb 10, 2009 at 2:33 AM, Hans W. Borchers hwborchers at googlemail.com wrote: [...]
Re: [R] general inverse solver?
Gabor Grothendieck ggrothendieck at gmail.com writes: I am not sure what the point of all this is. The point of all this is: 1. Some users of R here on the list apparently would like to have more powerful CAS functionality than Yacas can provide. 2. Many of the solution hints pointing to Ryacas on this list were simply not solving the respective problem, so your 80/20 statement may be downright wrong. 3. Maxima does incorporate a socket server and can be integrated into other systems, see EULER (do you never listen?). What are you complaining about? Of course, it was great that you made Yacas available for R. But others have the right to discuss their wishes here without being 'slammed' every time with questionable hints to Ryacas. I agree arguments have been exchanged and we should stop here. After this discussion, I wish someone would think about an Rmaxima package (along the lines of EULER, e.g.); I can't do it myself, unfortunately. Hans Werner Borchers This is an R list, not a CAS list. The recommendation to use yacas is based on the fact that there is an interface between R and yacas. There is no interface between R and Maxima, so Maxima is not in the running. Anyone who has used Maxima knows that it is an impressive piece of software, but that's not the point. Regarding why there is no interface to Maxima: it is because it is harder to interface to Maxima than to yacas. There are two problems here: 1. Maxima does not incorporate a socket server as far as I know. You would have to write it, and that may or may not need an in-depth understanding of Maxima, but in any case it represents work. With yacas you don't have to write a server, since yacas itself already contains one. Just run: yacas --server ... and the server side is done. (In the case of yacas it would also be possible to use its C interface for an in-process interface, or presumably the Java interface of the new Java version of yacas. With Maxima it would be more problematic since it is written in Lisp.) 2.
Once you have created some sort of communications channel, then what? If you want more than a crude interface that passes an unprocessed character string to the CAS and then passes one back, then you will want to translate between R and the CAS. With yacas, OpenMath facilitates this greatly. The end result is that it is more work to meaningfully interface with Maxima than with yacas, yet yacas satisfies the majority of needs of a CAS. There is the minority who need more powerful algorithms, but I think it was a reasonable step to handle first the 80% that can be accommodated most easily by using yacas. The remaining 20%, which is what the other respondents to this thread are discussing, will in part be addressed in the future, possibly by the second project I am working on now. Perhaps some would quibble with the 80/20, and if you are in the 20 it probably seems like 100, but we will never settle that question definitively.
Re: [R] general inverse solver?
I know that Ryacas is promoted here whenever requests about symbolic algebra or calculus appear on the R-help list. But to tell the truth, Yacas itself is a very, very limited Computer Algebra System, and looking at its home page it appears that development will stop or has stopped anyway. It would be fair to clearly state that there is no R package to solve somewhat more involved symbolic mathematical problems. One could then point the requestor to one of the open source Computer Algebra Systems (CAS) such as Maxima or Axiom. Interestingly, the free Math Toolbox Euler by Grothmann has integrated Maxima into its numerical environment in a way that is really useful for numerical and symbolic computations. I could imagine that in a similar way Maxima can be integrated into R, bringing the full power of computer algebra to the R community. Hans W. Borchers ABB Corporate Research Postscript: The Euler Mathematical Toolbox is a powerful, versatile, and open source software for numerical and symbolic computations ... Euler supports symbolic mathematics using the open algebra system Maxima. http://mathsrv.ku-eichstaett.de/MGF/homes/grothmann/euler/ Gabor Grothendieck wrote: The forms of equations are limited, but it is not limited to just one: library(Ryacas) Loading required package: XML x <- Sym("x") y <- Sym("y") Solve(List(x+y == 2, x-y == 0), List(x, y)) [1] Starting Yacas! expression(list(list(x == 2 - y, y == 1))) On Mon, Feb 9, 2009 at 7:45 PM, Carl Witthoft c...@witthoft.com wrote: Gabor G wrote: Check out the Ryacas package. There is a vignette with some examples. Which led me to the manuals for yacas itself. I'm guessing there may be a way to use yacas' AND construct to combine a few equations and then hope the Newton Solver can work with that, but it's not clear that will work.
TK!Solver is nice because you aren't limited to linear equations, nor to equations which fit into a matrix structure, and because it's legal to have more than one unknown to be back-solved (assuming the problem is not under- or over-defined, of course).
Re: [R] Event sequence analysis
Jean-Pierre Müller jean-pierre.mueller at unil.ch writes: Maybe TraMineR? http://mephisto.unige.ch/traminer HTH, -- Thanks for pointing out TraMineR; it is a very interesting package on sequences of events and states in social science. Unfortunately, it appears not to provide the kind of windowing or counting of subsequences that I was asking for. Therefore, I am still looking for implementations of procedures for event sequence analysis as provided in some data mining tools and applications; see for example the article cited below. Any help would be very much appreciated. Thanks, Hans Werner Borchers. On 2 Feb 09 at 13:31, Hans W. Borchers wrote: Dear R help, I am analyzing sequences of events described by time and a unique event tag, and I am searching for recurring patterns that have to show up within a certain time window, e.g. 5 or 10 minutes. Of course, in between these events other events may occur. I have applied basket analysis approaches like apriori or 'frequent item set' algorithms with interesting results, but these methods do not take into account the exact succession of events. I also looked into the 'Generalized Sequential Pattern' function of Weka, but the implementation in Weka does not allow for a time window (as far as I understand). Are there other sequence analysis implementations available in R? -- For instance, in the realm of the 1997 paper Discovery of frequent episodes in event sequences by H. Mannila et al. Please no BioConductor hints, as they mean something different by (genetic) sequence analysis. Very best, Hans Werner
[R] Event sequence analysis
Dear R help, I am analyzing sequences of events described by time and a unique event tag, and I am searching for recurring patterns that have to show up within a certain time window, e.g. 5 or 10 minutes. Of course, in between these events other events may occur. I have applied basket analysis approaches like apriori or 'frequent item set' algorithms with interesting results, but these methods do not take into account the exact succession of events. I also looked into the 'Generalized Sequential Pattern' function of Weka, but the implementation in Weka does not allow for a time window (as far as I understand). Are there other sequence analysis implementations available in R? -- For instance, in the realm of the 1997 paper Discovery of frequent episodes in event sequences by H. Mannila et al. Please no BioConductor hints, as they mean something different by (genetic) sequence analysis. Very best, Hans Werner
Re: [R] How to get solution of following polynomial?
RON70 ron_michael70 at yahoo.com writes: Hi Ravi, Thanks for this reply. However, I could not understand the meaning of vectorizing the function. Can you please elaborate a little bit on that? Secondly, the package polynomial is not available on CRAN, it seems. What is the alternate package? Thanks, -- Ravi means 'PolynomF', which is an improved version of the old polynomial package. You do not need to recreate the polynomial from points. Instead, calculate the exact polynomial:

library(PolynomF)
z <- polynom()
p11 <- 1 - A1[1,1]*z - A2[1,1]*z^2 - A3[1,1]*z^3 - A4[1,1]*z^4
# ...
p <- p11*p22 - p12*p21

There is probably a shorter way to generate these four polynomials p11, ..., p22. Anyway, the result is

p
# 1 - 1.18*x + 0.2777*x^2 - 0.2941*x^3 - 0.1004*x^4 + 0.3664*x^5 -
#     0.0636*x^6 + 0.062*x^7 - 0.068*x^8
solve(p)
# [1] -1.365976+0.00i      -0.737852-1.639581i  -0.737852+1.639581i
# [4] -0.012071-1.287727i  -0.012071+1.287727i   1.00+0.00i
# [7]  1.388794-0.281841i   1.388794+0.281841i

and the real solutions are 1.0 and -1.365976! Regards, Hans Werner -- Ravi Varadhan wrote: Hi, You can use the polynomial package to solve your problem. The key step is to find the exact polynomial representation of fn(). Noting that it is an 8th-degree polynomial, we can get its exact form using the poly.calc() function. Once we have that, it is a simple matter of finding the roots using the solve() function.
require(polynomial)
a  <- c(-0.07, 0.17)
b  <- c(1, -4)
cc <- matrix(c(0.24, 0.00, -0.08, -0.31), 2)
d  <- matrix(c(0, 0, -0.13, -0.37), 2)
e  <- matrix(c(0.2, 0, -0.06, -0.34), 2)
A1 <- diag(2) + a %*% t(b) + cc
A2 <- -cc + d
A3 <- -d + e
A4 <- -e
# I am vectorizing your function
fn <- function(z) {
  sapply(z, function(z) {
    y <- diag(2) - A1*z - A2*z^2 - A3*z^3 - A4*z^4
    det(y)
  })
}
x <- seq(-5, 5, length = 9)  # note we need 9 points to exactly determine an 8th-degree polynomial
y <- fn(x)
p <- poly.calc(x, y)  # uses Lagrange interpolation to determine the polynomial form
p
# 1 - 1.18*x + 0.2777*x^2 - 0.2941*x^3 - 0.1004*x^4 + 0.3664*x^5 - 0.0636*x^6 + 0.062*x^7 - 0.068*x^8
# plot showing that p is the exact polynomial representation of fn(z)
pfunc <- as.function(p)
x1 <- seq(-5, 5, length = 100)
plot(x1, fn(x1), type = "l")
lines(x1, pfunc(x1), col = 2, lty = 2)
solve(p)  # gives you the roots (some are, of course, complex)

Hope this helps, Ravi. Ravi Varadhan, Ph.D., Assistant Professor, Division of Geriatric Medicine and Gerontology, School of Medicine, Johns Hopkins University. Ph. (410) 502-2619, email: rvaradhan at jhmi.edu
Re: [R] Execution of a .sce file through R
ykank spicyankit4u at gmail.com writes: Dear R Users, does someone have any idea how to execute a Scilab file (.sce file) from the terminal in R? Any kind of guidance would be highly welcomed and appreciated. -- Have you looked into your Scilab directory? In mine -- under Windows XP and Scilab version 5.0.3 -- there is no executable file 'scilab.exe'. The file to call is 'Scilex.exe' (or 'WScilex.exe' for the GUI), which can also easily be seen by following the path of the Scilab link in the Start menu. The following call works for me:

system("path-to-scilab/bin/Scilex -f filename.sce")

where I have to end the 'sce' file with a quit statement. Hans Werner
Re: [R] getting ISO week
Gabor Grothendieck ggrothendieck at gmail.com writes: format(d, "%U") and format(d, "%W") give week numbers using different conventions. See ?strptime -- Gabor, the results of format(aDate, "%W") appear to be incorrect anyway, see:

format(as.Date("2008-01-01"), "%W")  # => "00"

There is never a week 0; this should be week 1.

format(Sys.Date(), "%W")  # => "49"

but my business calendar says today's (Dec. 11, 2008) week is week 50, which is what Brian Ripley's proposed strftime(x, "%V") returns. There could be a format %E (not used up to now) for returning a correct week number according to the European standard. Yours, Hans Werner. On Thu, Dec 11, 2008 at 7:43 AM, Gustaf Rydevik gustaf.rydevik at gmail.com wrote: Hi all, Is there a simple function already implemented for getting the ISO weeks of a Date object? I couldn't find one, and so wrote my own function to do it, but would appreciate a pointer to the default way. If a function is not yet implemented, could the code below be of interest to submit to CRAN? Best Regards, Gustaf ... [rest deleted]
Re: [R] getting ISO week
Gabor Grothendieck ggrothendieck at gmail.com writes: According to the definition in ?strptime (which is not the same as the ISO definition), format(x, "%W") returns the week of the year as a decimal number (00–53) using Monday as the first day of the week (and typically with the first Monday of the year as day 1 of week 1) -- the UK convention. The first day of 2008 is a Tuesday, which means that 2008 starts in week 0. -- Yes, I read that, but it is still misleading and -- I think -- incorrect. See www.dateandtime.org/calendar to find out that this is week 50 even in the UK. We would have had a lot of misplaced business meetings in our company if the week numbers in Great Britain, Germany, and Sweden were actually different. Hans Werner ... [rest deleted]
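Following Brian Ripley's suggestion in this thread, a small base-R sketch (the helper name iso_week is made up; "%V" is documented in ?strptime but may not be supported by strftime on all platforms, notably some older Windows builds):

```r
# ISO-8601 week number via the "%V" conversion (see ?strptime): week 1 is
# the week containing the first Thursday of the year, weeks start on Monday.
iso_week <- function(d) as.integer(format(as.Date(d), "%V"))

iso_week("2008-01-01")  # 1  (where "%W" would say week 0)
iso_week("2008-12-11")  # 50 (the business-calendar week from the thread)
```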
Re: [R] Pre-model Variable Reduction
Harsh singhalblr at gmail.com writes: Hello All, I am trying to carry out variable reduction. I do not have information about the dependent variable, and have only the X variables, as it were. ... I looked for other R packages that allow me to do variable reduction without considering a dependent variable. I came across the 'dprep' package, but it does not have a Windows implementation. -- I doubt that you will find what you are longing for, but: there is a Windows version available at the homepage of the dprep package at http://math.uprm.edu/~edgar/dprep.html. This version 2.0 can be loaded without errors into R 2.8.0, though it appears not to be fully compliant with the tests on CRAN. -- Moreover, I have a dataset that contains continuous and categorical variables, some categorical variables having 3 levels, 10 levels and so on, up to a maximum of 50 levels (e.g. states in the USA). Any suggestions in this regard will be much appreciated. Thank you, Harsh Singhal, Decision Systems, Mu Sigma, Inc.
Re: [R] How to solve following equation?
I need to solve an equation like this: a = b/(1+x) + c/(1+x)^2 + d/(1+x)^3, where a, b, c, d are known constants. Is there any R way to do that? -- Multiplying this expression by (1+x)^3 leads to a polynomial equation. I would certainly recommend the 'PolynomF' package here:

# install.packages("PolynomF")
library(PolynomF)
options(digits = 16)
x <- polynom()
a <- b <- c <- d <- 1
p <- a*(1+x)^3 - b*(1+x)^2 - c*(1+x) - d
p
# -2 + 2*x^2 + x^3
solve(p)
# [1] -1.419643377607080-0.6062907292072i
#     -1.419643377607080+0.6062907292072i
# [3]  0.839286755214161+0.0i

The solution x0 = 0.839286755214161 is correct up to the last digit, as can be verified by using a computer algebra system. This also shows that 'PolynomF' is quite exact in this task. Hans Werner -- Gabor Grothendieck ggrothendieck at gmail.com writes: Assume a = 1. If not, set b = b/a, etc. Now use (1) uniroot:

f <- function(x) b + c/(1+x) + d/(1+x)^2 - 1 - x
uniroot(f, 0:1)
# $root
# [1] 0.8392679
# $f.root
# [1] 3.049818e-05
# $iter
# [1] 3
# $estim.prec
# [1] 6.103516e-05

or (2) multiply through by 1+x and subtract 1 from both sides, giving x = b + c/(1+x) + d/(1+x)^2 - 1, and iterate that:

a <- b <- c <- d <- 1
x <- 0
for (i in 1:25) {
  x <- b + c/(1+x) + d/(1+x)^2 - 1
  print(x)
}
# [1] 2
# [1] 0.444
# [1] 1.171598
# [1] 0.6725419
# [1] 0.9553676
# [1] 0.7729558
# [1] 0.8821595
# [1] 0.8135892
# [1] 0.8554268
# [1] 0.829437
# [1] 0.8454056
# [1] 0.835527
# [1] 0.8416126
# [1] 0.837854
# [1] 0.8401717
# [1] 0.838741
# [1] 0.8396236
# [1] 0.839079
# [1] 0.839415
# [1] 0.8392076
# [1] 0.8393356
# [1] 0.8392566
# [1] 0.8393053
# [1] 0.8392753
# [1] 0.8392938

On Mon, Dec 1, 2008 at 9:47 PM, RON70 ron_michael70 at yahoo.com wrote: I need to solve an equation like this: a = b/(1+x) + c/(1+x)^2 + d/(1+x)^3, where a, b, c, d are known constants. Is there any R way to do that? Thanks in advance
Re: [R] optimization problem
Why not use one of the global optimizers in R, for instance 'DEoptim', and then apply optim() to find the last six decimals? I am relatively sure that the differential evolution operator has a better chance of getting near a global optimum than a loop over optim(), though 'DEoptim' may be a bit slow (but only for quite large numbers of parameters). Regards, Hans Werner -- Mike Prager wrote: tedzzx [EMAIL PROTECTED] wrote: If I want to find the global minimum, how should I change my code? -- I sometimes use optim() within a loop, with random starting values for each iteration of the loop. You can save the objective function value each time and pick the best solution. Last time I did that, I ran it 100 times. That procedure does not guarantee finding the global minimum. However, it does make it *more likely* to find the global minimum *within the range of your starting values*. Often, I make a boxplot of the various results. If they don't show a strong mode, there is a data or model problem that needs to be addressed. For example, the solution may be poorly defined by the data, or the model may be specified with confounded parameters. -- Mike Prager, NOAA, Beaufort, NC * Opinions expressed are personal and not represented otherwise. * Any use of tradenames does not constitute a NOAA endorsement.
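The loop-over-optim() idea can be sketched in base R (the test function below is illustrative, not the original poster's problem):

```r
# Multistart local optimization: run optim() from random starting values
# and keep the best result. Himmelblau's function has four global minima
# with value 0, so a single local search easily settles on "a" minimum.
f <- function(x) (x[1]^2 + x[2] - 11)^2 + (x[1] + x[2]^2 - 7)^2

set.seed(1)
runs <- replicate(100, {
  res <- optim(runif(2, -5, 5), f)   # Nelder-Mead from a random start
  c(value = res$value, res$par)
})
best <- runs[, which.min(runs[1, ])]
best["value"]   # very close to 0
```

Collecting all 100 objective values (the first row of `runs`) is exactly what Mike Prager suggests boxplotting to judge how well-defined the solution is.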
Re: [R] Examples of advanced data visualization
Tom Backer Johnsen backer at psych.uib.no writes: [...] The question is interesting, but what I have a somewhat negative reaction to is the next passage: Please answer to my e-mail address. In case enough interesting material comes up, I will enter a summary here. -- It is nice that you are willing to summarize whatever appears, but somewhat arrogant in my eyes. There might be things appearing that you do not regard as of first interest that might be of interest to others. Therefore, all parts of the discussion or responses should be public as well. The response of David Winsemius pointing (among other things) at the presentation of Rosling at TED is in my eyes a very good start. In other words, I suggest that the list ignore the last paragraph in the question from you. Hans Werner Borchers, ABB Corporate Research. Tom -- Tom is right here. What I should have said / intended to say was something like: You can _also_ answer to my e-mail address. I feel sorry that this misstatement caused a distraction from my request. Coming back to the original question: We all know the many pages about R graphics and its numerous features and skills. Therefore, I am more interested in data visualizations not yet implemented or made available in R. I am also wondering if the R Wiki would be a better place to publish summaries on topics discussed here. On the mailing list, summaries are forgotten within one or two months' time, only to be retrieved in specific searches. Comments on any of the issues mentioned are welcome. Hans Werner
[R] Examples of advanced data visualization
Dear R-help, I am looking for ideas and presentations of new and advanced data visualization methods. As an example of what I am searching for, the 'Many Eyes' pages at http://manyeyes.alphaworks.ibm.com/manyeyes/ may provide a good paradigm. I would be interested even if it will not be easy to implement such examples in R, e.g. because of the interactive nature of these graphical displays. Please answer to my e-mail address. In case enough interesting material comes up, I will enter a summary here. Hans Werner Borchers, ABB Corporate Research
Re: [R] Which data structure in R can mimic hash in PERL best?
Dear all, Which data structure in R can mimic a hash in Perl best? I'd like to set up a lookup table, which would be accomplished by a hash if using Perl. Which of the data structures in R is the most efficient for a lookup table? Thanks for your help. Best regards, Leon -- The regular answer to this is named arrays/vectors or environments, and it has been given several times here on R-help. Unfortunately, everybody applies a different terminology, such as 'maps', 'hashes', 'tables' or 'dictionaries', etc., so it's difficult to search for those entries. Below I enclose a solution that I wrote some months ago. It is given in pythonic language, but you can easily translate it into the corresponding Perl terms. One drawback is that one cannot use numbers as keys; only keys following the naming conventions for variable names are allowed. Hans Werner Borchers, ABB Corporate Research

#-- Define functions on Hash Tuples (Python alike) -------------------
def.h   <- function() new.env(hash = TRUE)
len.h   <- function(dict) length(ls(envir = dict))
set.h   <- function(key, val, dict) assign(key, val, envir = dict)
get.h   <- function(key, dict, default = NULL) {
    if (exists(key, envir = dict)) get(key, envir = dict) else default
}
has_key <- function(key, dict) exists(key, envir = dict)
keys.h  <- function(dict) ls(envir = dict)
items.h <- function(dict) as.list(dict)
values.h <- function(dict, mode = 'character') {
    l <- as.list(dict)
    n <- length(l)
    if (n == 0) return(invisible(NULL))
    v <- vector('character', n)
    for (i in 1:n) v[i] <- l[[i]]
    if (mode == 'numeric') v <- as.numeric(v)
    return(v)
}
del.h <- function(key, dict) {
    if (exists(key, envir = dict)) {
        val <- get.h(key, dict)
        rm(list = c(key), envir = dict)
    } else {
        val <- NULL
    }
    invisible(val)
}
clear.h <- function(dict) rm(list = keys.h(dict), envir = dict)
#---------------------------------------------------------------------
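For readers who just want the mechanism, here is a self-contained sketch of the underlying idea that the helper functions above wrap -- an R environment used directly as a hash table:

```r
# Environments created with new.env(hash = TRUE) are hash tables under the
# hood, giving fast insert and lookup by character key.
d <- new.env(hash = TRUE)
assign("apple", 3, envir = d)
assign("pear",  7, envir = d)

get("apple", envir = d)       # 3
exists("plum", envir = d)     # FALSE
ls(envir = d)                 # "apple" "pear"
rm("pear", envir = d)
length(ls(envir = d))         # 1
```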
Re: [R] strsplit (regex)
stephen sefick ssefick at gmail.com writes: # how do I break these up into the first two letters (RM), the number, and then the last part? # is there an easily accessible regex tutorial on the internet? -- For regular expressions, the Perl man pages at http://perldoc.perl.org/perlre.html are quite good and present the essentials in condensed form, but still very useful. (Some constructs may not work outside of Perl.) A more elaborate tutorial is to be found at Regular-Expressions.info, i.e. http://www.regular-expressions.info/. Of course, there are many, many more; see the Open Directory for one long list. And each programming language has its own page on regular expressions. At the Regular Expression Library http://regexlib.com you can search for and copy regular expressions; for example, 20 patterns are returned when searching for regular expressions on 'floats'. Hans Werner Borchers -- v = (structure(1:122, ... thanks
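For the original question itself (splitting strings into the leading letters, the number, and a trailing part), a base-R sketch with regexec()/regmatches(); the sample strings and the exact pattern are assumptions, since the original vector was truncated:

```r
# Three capture groups: leading letters, a number (digits and dots),
# and whatever remains at the end.
x <- c("RM195.6", "RM200.2T")
m <- regmatches(x, regexec("^([A-Za-z]+)([0-9.]+)(.*)$", x))
m[[1]]  # "RM195.6"  "RM" "195.6" ""
m[[2]]  # "RM200.2T" "RM" "200.2" "T"
```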
Re: [R] ttda and text-mining
LE PAPE Gilles lepape.gilles at neuf.fr writes: The ttda package was devoted to text mining. It seems to be no longer available. Did the name change? Are other packages devoted to text mining? -- The 'ttda' package is deprecated, as is noted in the Task View on Natural Language Processing. The old code is still available if you follow the link. Still, I would propose converting to 'tm' instead. Another possibility is the 'ReadMe' package at http://gking.harvard.edu/readme/ for automated content analysis. Hans Werner Borchers -- Many thanks, Gilles
Re: [R] How to do knn regression?
This is a summary of discussions between Shengqiao Li and me, entered here as a reference for future requests on knn regression or missing value imputation based on a nearest neighbor approach. There are several functions that can be used for 'nearest neighbor' classification, such as knn, knn1 (in package class), knn3 (caret), kknn (kknn), ipredknn (ipred), sknn (klaR), or gknn (cba). To utilize these functions for 'nearest neighbor' regression would be difficult. There is actually just one knn-like function that can be applied to continuous variables: kknn (kknn) uses a formula and looks at the type of the target variable; if the target variable is continuous, it will return a regression result for each row in the learning set. And there are two implementations of functions that simply return the indices and distances of the k nearest neighbors for further processing: ann (yaImpute) constructs kd- or bd-trees to find the k nearest neighbors and returns indices and distances of those neighbors (it may kill the whole R process when matrices are too big) [Remark: watch out, the default distance is the sum of squares]; knnFinder (knnFinder) constructs a kd-tree to find the k nearest neighbors, but has too many bugs and quirks, making it almost unusable, and is not maintained anymore (perhaps it should be removed from CRAN). The other approach is to use a distance function and sort 'manually' to find the nearest neighbors and their values for the target variable. 'dist' itself is not really appropriate, as it can only be applied to _one_ matrix, whereas here we need something like dist(A, B). Combining A and B into one matrix is often prohibitive, as it needs too much memory. dists (cba) computes a distance matrix between the rows of two matrices, but can be a bit slow for very big matrices (slower than 'dist') [Rem: the default distance is the square root of the sum of squares]. I would appreciate hearing from you if I missed something.
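The 'manual' distance-based approach from the last paragraph can be sketched in base R (the function name knn_reg and the toy data are made up; this is a naive O(n)-per-query search, not a kd-tree):

```r
# k-nearest-neighbour regression by brute force: for each query row of B,
# average the target values of its k nearest rows of A (Euclidean distance).
knn_reg <- function(A, y, B, k = 5) {
  apply(B, 1, function(b) {
    d <- sqrt(colSums((t(A) - b)^2))  # distances from b to all rows of A
    mean(y[order(d)[1:k]])            # average target over the k nearest
  })
}

set.seed(1)
A <- matrix(runif(200), ncol = 2)     # 100 training points in the unit square
y <- rowSums(A)                       # a smooth target
B <- matrix(runif(10), ncol = 2)      # 5 query points
pred <- knn_reg(A, y, B, k = 3)
cbind(pred, true = rowSums(B))        # predictions track the true values
```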
Hans Werner Borchers, ABB Corporate Research
Re: [R] How to do knn regression?
Shengqiao Li shli at stat.wvu.edu writes: Hello, I want to do regression or missing value imputation by knn. I searched the r-help mailing list. This question was asked in 2005; ksmooth and loess were recommended. But my case is different. I have many predictors (p > 20) and I really want to try knn with a given k. ksmooth and loess use a bandwidth to define the neighborhood size. This contrasts with knn's variable bandwidth via fixing a k. Are there any such functions I can use in R packages? -- The R package 'knnFinder' provides a nearest neighbor search based on the approach through kd-tree data structures. Therefore, it is extremely fast even for very large data sets. It returns as many neighbors as you need and can also be used, e.g., for determining distance-based outliers. Hans Werner Borchers, ABB Corporate Research -- Your help is highly appreciated. Shengqiao Li
Re: [R] How to find a shift between two curves or data sets
Dear Sébastien, identifying similarity in curves or time series is one of the main tasks in the quite recent field of 'Functional Data Analysis' (FDA). See the 2005 book by Ramsay and Silverman from Springer Verlag, or the corresponding web page at http://www.psych.mcgill.ca/misc/fda/. The 'fda' package on CRAN contains functions developed by Ramsay and Silverman and described in that book. You may also be interested in the 'dtw' package that provides functions for comparing time series under time 'warping', i.e., distortions along the time axis. Hans Werner Borchers, ABB Corporate Research -- Sébastien Durand wrote: Hello, I have an issue here! I need to find the offset or shift between two data sets. The data sets do not start nor end at the same time and don't even have the same sampling interval (which, by the way, isn't constant in either data set). It must be known that the data in both data sets are coming from the same sensor, so they should be the same! What I am looking for is a way to minimize the differences between the two curves by applying an x offset. Would anybody have a clue how to perform such an action? I simply have no idea where to start. Thank you very very much. S.
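Before reaching for 'fda' or 'dtw', a base-R starting point for exactly this situation (irregular, non-matching sampling) is to interpolate both series onto a common grid and minimize the squared difference over a candidate offset. All names and the test signal below are illustrative:

```r
# Estimate the x-offset s such that series 2 shifted by s matches series 1.
estimate_shift <- function(t1, y1, t2, y2, range = c(-2, 2)) {
  sse <- function(s) {
    lo <- max(min(t1), min(t2) + s)   # overlap of the two time ranges
    hi <- min(max(t1), max(t2) + s)
    tt <- seq(lo, hi, length.out = 200)
    f1 <- approx(t1, y1, xout = tt)$y       # linear interpolation
    f2 <- approx(t2 + s, y2, xout = tt)$y
    sum((f1 - f2)^2)
  }
  optimize(sse, range)$minimum
}

set.seed(42)
t1 <- sort(runif(100, 0, 10)); y1 <- sin(t1)
t2 <- sort(runif(100, 0, 10)); y2 <- sin(t2 + 0.7)  # same signal, shifted
estimate_shift(t1, y1, t2, y2)  # close to 0.7
```

This assumes a single global offset and roughly linear behavior between samples; if the distortion varies along the time axis, that is exactly where dynamic time warping becomes appropriate.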
Re: [R] Convex optimization in R?
Hesen Peng hesen.peng at gmail.com writes: Hi my R buddies, I'm trying to solve a specific group of convex optimization problems in R. The admissible region is the inside and surface of a multi-dimensional ellipse, and the goal function is the sum of the absolute values of the variables. Could anyone please tell me whether there's a package in R to do this? Thank you very much, -- To my knowledge there does not exist a designated R package for convex optimization. Also, in the Optimization task view the AMS nomenclature 90C25 for Convex programming (CP) is not mentioned. On the other hand, this task view may give you some ideas on how to solve your problem with one of the available optimization packages. For instance, a sum of absolute values in the objective can often be handled by introducing auxiliary variables that bound each absolute value, turning the problem into a linear program (or, in more general settings, a mixed-integer linear program, MILP). There is a free module for 'disciplined' convex optimization, CVX, that can be integrated with Matlab or Python. Hopefully, there will be a CVX R package in the future (as has been announced/promised). Hans Werner Borchers, ABB Corporate Research -- Best wishes, Hesen Peng http://hesen.peng.googlepages.com/
Re: [R] Exponential smoothing?
Öhagen Patrik Patrik.Ohagen at mpa.se writes: Dear List, I have used all my resources (i.e. help.search) and I still haven't been able to figure out if there is an exponential smoothing command in R. -- A few weeks ago the book Forecasting with Exponential Smoothing by Hyndman et al. appeared; see http://www.exponentialsmoothing.net/. In connection with this book, the 'forecasting' bundle is available from the CRAN package repository, containing functions and data sets described in the book. // Hans Werner Borchers -- Thank you in advance!
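For simple cases, base R itself already ships an exponential smoothing command; a quick sketch on a built-in dataset (the choice of series is illustrative):

```r
# Simple exponential smoothing with base R's HoltWinters():
# beta = FALSE and gamma = FALSE switch off the trend and seasonal terms.
fit <- HoltWinters(ldeaths, beta = FALSE, gamma = FALSE)
fit$alpha                   # the estimated smoothing parameter
predict(fit, n.ahead = 3)   # flat forecasts, as expected for SES
```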
Re: [R] nonlinear constrained optimization
Paul Smith phhs80 at gmail.com writes: To the best of my knowledge, R cannot deal with optimization problems with nonlinear constraints, unless one uses the penalty method. Outside R, Ipopt and Algencan can solve problems like yours, but one needs to program in AMPL and/or C/Fortran. Paul -- Please have a look at the 'Rdonlp2' package at http://arumat.net/Rdonlp2/. It provides nonlinear optimization with nonlinear constraints. For more information on advanced optimization procedures available in R, take a short trip to the Optimization Task View. // Hans Werner Borchers
Re: [R] exporting adaBoost model
One way to port these kinds of models between applications is the Predictive Model Markup Language (PMML). The R package 'pmml' supports linear regression, rpart, SVM, and others, but not adaBoost. On the other side, not even the Python machine learning library Orange has an import function for PMML. Perhaps a more attractive possibility is to call R functions from Python through 'RPy' (http://rpy.sourceforge.net/), a Python interface to R. You could send data from Python to R to be handled by adaBoost and get back the results. Of course, R and Python need to be installed on your machine, and you cannot generate a single executable. // Hans Werner Borchers -- Bob Flagg wrote: Dear all, I'm using adaBoost from the ada package to build a classification model. After training the model in R, I'd like to use it in a Python application. Is it possible to export the model in some way to make translating into Python easier? Any help would be greatly appreciated. Thanks. Bob
Re: [R] FastICA
Maura E Monville maura.monville at gmail.com writes: Is the FastICA R implementation as good as the MatLab implementation? I would appreciate talking to someone who has used FastICA for R.

The fastICA packages for Matlab and R (and there is even a version for Python) have a common origin at the Helsinki University of Technology. I regularly use Matlab and R, and do not see much of a difference between packages like these. // Hans Werner Borchers

Thank you very much.
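Getting started with the R 'fastICA' package takes only a few lines; a minimal sketch on a toy two-source mixture (the signals and mixing matrix here are invented for illustration):

```r
## Sketch: separate two artificially mixed sources with fastICA.
library(fastICA)

set.seed(1)
S <- cbind(sin((1:1000) / 20), runif(1000))   # two independent sources
A <- matrix(c(0.29, 0.62, 0.84, 0.31), 2, 2)  # mixing matrix
X <- S %*% A                                  # observed mixed signals

ica <- fastICA(X, n.comp = 2)
str(ica$S)  # estimated independent components (up to sign and order)
```

Note that ICA recovers the sources only up to permutation and sign, so the columns of ica$S may appear swapped or flipped relative to S.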
Re: [R] FastICA
Prof Brian Ripley ripley at stats.ox.ac.uk writes:

On Tue, 12 Aug 2008, someone with no signature wrote: Maura E Monville maura.monville at gmail.com writes: Is the FastICA R implementation as good as the MatLab implementation? I would appreciate talking to someone who has used FastICA for R. The fastICA packages for Matlab and R (and there is even a version for Python) have a common origin at the Helsinki University of Technology.

Have you actually looked at the R one? The code has no connection with Helsinki University of Technology. 'Credit where credit is due' and all that.

Oh yes, you are right and I am sorry. I started working with fastICA for Matlab, and referring to the page http://www.cis.hut.fi/projects/ica/fastica/ and its uncommented link to fastICA in R, I always assumed it used the same code base (never looking at the 'credits'). Well, then you are the right person to answer this question.

I regularly use Matlab and R, not seeing much of a difference in packages like these. // Hans Werner Borchers

Or quite possibly someone else using Mr Borchers' name.

Not really.
Re: [R] FastICA
I may not have been as wrong as Prof. Ripley suggested when I wrote: "The fastICA packages for Matlab and R (...) have a common origin at the Helsinki University of Technology."

Please consider the following lines from the 'fastICA' help page (?fastICA): FastICA algorithm. Description: This is an R and C code implementation of the FastICA algorithm of Aapo Hyvarinen et al. (URL: http://www.cis.hut.fi/aapo/) to perform Independent Component Analysis (ICA) and Projection Pursuit.

So the C code base is not the same, I guess, but "The code has no connection with Helsinki University of Technology" does not seem to be totally correct. // Hans Werner Borchers

Prof Brian Ripley wrote: On Tue, 12 Aug 2008, someone with no signature wrote: Maura E Monville maura.monville at gmail.com writes: Is the FastICA R implementation as good as the MatLab implementation? I would appreciate talking to someone who has used FastICA for R. The fastICA packages for Matlab and R (and there is even a version for Python) have a common origin at the Helsinki University of Technology. I regularly use Matlab and R, not seeing much of a difference in packages like these. // Hans Werner Borchers

Have you actually looked at the R one? The code has no connection with Helsinki University of Technology. 'Credit where credit is due' and all that. Or quite possibly someone else using Mr Borchers' name.

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK. Tel: +44 1865 272861 (self), +44 1865 272866 (PA), Fax: +44 1865 272595
Re: [R] recursive root finding
As your curve is defined by its points, I don't see any reason to artificially apply functions such as 'uniroot' or 'optim' (a real overkill in this situation). First smooth the curve with splines, Savitzky-Golay, or Whittaker smoothing, etc., then loop through the sequence of points and compute the gradient by hand, one-sided, two-sided, or both. At the same time, mark those indices where the gradient is zero or changes its sign; these will be the solutions you are looking for. With your example, I immediately got as maxima or minima:

x1 = 1.626126
x2 = 4.743243
x3 = 7.851351

// Hans Werner Borchers

Any comments? Maybe the problem was not clear or looked too specific. I'll add a more bare-bones example, if only to stimulate discussion:

x <- seq(1, 10, length = 1000)
set.seed(11)  # so that you get the same randomness
y <- jitter(sin(x), a = 0.2)
values <- data.frame(x = x, y = y)

findZero <- function(guess, neighbors, values) {
  smooth <- smooth.spline(values$x, values$y)
  obj <- function(x) (predict(smooth, x)$y)^2
  minimum <- which.min(abs(values$x - guess))
  optimize(obj, interval = c(values$x[minimum - neighbors],
                             values$x[minimum + neighbors]))
  # uniroot could be used instead, I suppose
}

test <- findZero(guess = 6, neighbors = 50, values = values)
plot(x, y)
abline(h = 0)
abline(v = test$minimum, col = "red")

Now, I would like to find all (N =) 3 roots, without a priori knowledge of their location in the interval. I considered several approaches:

1) Find all the numerical values of the initial data that are close to zero within a given threshold. Group these values in N sets, using cut() and hist() maybe? I could never get this to work; the factors given by cut() confuse me (I couldn't extract their value). Then apply the function given above with the guess given by the center of mass of the different groups of zeros.

2) Apply findZero once, store the result, then add something big (1e10) to the neighboring points, look for a zero again, and repeat the procedure until N are found.
This did not work, I assume because it does not perturb the minimization problem in the way I want.

3) Once a zero is found, look for zeros on both sides, etc. This quickly makes a complicated decision tree when the number of zeros grows, and I could not find a clean way to implement it.

Any thoughts welcome! I feel like I've overlooked an obvious trick. Many thanks, baptiste

On 7 Aug 2008, at 11:49, baptiste auguie wrote: Dear list, I've had this problem for a while and I'm looking for a more general and robust technique than I've been able to imagine myself. I need to find N (typically N = 3 to 5) zeros of a function that is not a polynomial in a specified interval. The code below illustrates this, by creating a noisy curve with three peaks of different position, magnitude, width and asymmetry:

x <- seq(1, 10, length = 500)

exampleFunction <- function(x) {
  # create some data with peaks of different scaling and widths, plus noise
  fano <- function(p, x) {
    y0  <- p[1]
    A1  <- abs(p[2])
    w1  <- abs(p[3])
    x01 <- p[4]
    q   <- p[5]
    B1  <- (2 * A1 / pi) * ((q * w1 + x - x01)^2 / (4 * (x - x01)^2 + w1^2))
    y0 + B1
  }
  p1 <- c(0.1, 1, 1, 5, 1)
  p2 <- c(0.5, 0.7, 0.2, 4, 1)
  p3 <- c(0, 0.5, 3, 1.2, 1)
  y <- fano(p1, x) + fano(p2, x) + fano(p3, x)
  jitter(y, a = 0.005 * max(y))
}

y <- exampleFunction(x)
sample.df <- data.frame(x = x, y = y)
with(sample.df, plot(x, y, t = "l"))  # three peaks, located around x = 2, 4, 5
y.spl <- smooth.spline(x, y)          # smooth the noise
lines(y.spl, col = "red")

I wish to obtain the zeros of the first and second derivatives of the smoothed spline y.spl. I can use uniroot() or optim() to find one root, but I cannot find a reliable way to iterate and find the desired number of solutions (3 peaks and 2 inflexion points on each side of them). I've used locator() or a guesstimate of the disjoint intervals to look for solutions, but I would like to get rid of this unnecessary user input and have a robust way of finding a predefined number of solutions in the total interval.
Something along the lines of:

findZeros <- function(f, numberOfZeros = 3, exclusionInterval = c(0.1, 0.2, 0.1)) {
  # while (number of solutions found is less than numberOfZeros):
  #   search for a root of f in the union of intervals excluding a
  #   neighborhood of the solutions already found (exclusionInterval)
}

I could then apply this procedure to the first and second derivatives of y.spl, but it could also be useful for any other function. Many thanks for any remark or suggestion! baptiste
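The hand-rolled gradient scan suggested at the top of the thread can be written in a few lines; here is an illustrative sketch (the name findAllZeros is invented for this example) that evaluates the smoothing spline on a fine grid and reports every sign change of the chosen derivative, with no initial guesses needed:

```r
## Sketch: locate all sign changes of a smoothed curve (deriv = 0)
## or of its derivatives (deriv = 1, 2) on a fine grid.
findAllZeros <- function(x, y, deriv = 0, n = 2000) {
  fit  <- smooth.spline(x, y)
  grid <- seq(min(x), max(x), length.out = n)
  val  <- predict(fit, grid, deriv = deriv)$y
  i    <- which(val[-1] * val[-n] < 0)  # sign change between grid[i], grid[i+1]
  # linear interpolation for the crossing point inside each bracket
  grid[i] - val[i] * (grid[i + 1] - grid[i]) / (val[i + 1] - val[i])
}

x <- seq(1, 10, length = 1000)
set.seed(11)
y <- jitter(sin(x), a = 0.2)
findAllZeros(x, y)  # roots of the smoothed curve, near multiples of pi
```

Applying the same function with deriv = 1 gives the maxima and minima, and deriv = 2 the inflexion points; predict.smooth.spline accepts a deriv argument directly, so no finite differencing is needed beyond the sign-change test.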