Re: [R] non-linear integer optimization?

2010-09-23 Thread Hans W Borchers
darckeen darckeen at hotmail.com writes:

 This is an example of the type of problem, and how I'm currently using
 optim() to solve it.


I would classify this as an integer programming (IP) problem -- not even a
mixed-integer one, as there are no continuous variables. Your approach with
optim() will not work, especially not with method 'L-BFGS-B' for numerical
optimization. Maybe it is worth trying method 'SANN'.

Integer programming needs special techniques. If you don't know more about
your function, it may boil down to an exhaustive discrete search that will
be slow anyway. 

Hans Werner

Example: With

set.seed(8232); mydata <- runif(500,-1,1)   # fix these parameters

the 'optim()' approach finds a minimum of 0.9887148 with L-BFGS-B and
0.9139314 with method SANN, while the global minimum in your constraint
region is 0.7360688. DEoptim (in package DEoptim) returns 0.750277. The
minimum is more difficult to find in this example because it lies on the boundary.
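
For reference, here is a minimal sketch of how DEoptim could be called for
this problem (the control settings are arbitrary choices of mine, not the ones
behind the number quoted above; 'myfunc' and 'mydata' are taken from your code
quoted below):

library(DEoptim)
set.seed(8232); mydata <- runif(500,-1,1)
res <- DEoptim(myfunc, lower=c(1,51,101), upper=c(50,100,200),
               control=DEoptim.control(itermax=200, trace=FALSE),
               d=mydata)
floor(res$optim$bestmem)   # integer parameters found
res$optim$bestval          # objective value at that point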


 mydata <- runif(500,-1,1)
 
 myfunc <- function(p,d)
 {
   print(p <- floor(p))
   ws <- function(i,n,x) sum(x[i-n+1]:x[i])
   ws1 <- c(rep(NA,p[1]-1),sapply(p[1]:NROW(d),ws,p[1],d))
   ws2 <- c(rep(NA,p[2]-1),sapply(p[2]:NROW(d),ws,p[2],d))
   ws3 <- c(rep(NA,p[3]-1),sapply(p[3]:NROW(d),ws,p[3],d))
   var(ws1+ws2+ws3,na.rm=TRUE)
 }
 
 opt <- optim(c(25,50,150), myfunc, method="L-BFGS-B",
   control=list(fnscale=-1, parscale=c(1,1,1), factr=1, ndeps=c(5,5,5)),
   lower=t(c(1,51,101)), upper=t(c(50,100,200)), d=mydata)
 
 print(floor(opt$par))
 print(myfunc(opt$par,mydata))
 
 So the parameters to the function to be optimized are parameters to
 functions that only accept integer values.  All of the parameters to be
 optimized are integers that are subject to upper/lower bound constraints.
 This was the solution I came up with after checking CRAN, searching Nabble,
 etc.
 
 It runs, but not very efficiently, and it does not seem to really explore the
 sample space very well before focusing on a local minimum.  I also looked at
 the constrOptim function, but I couldn't figure out how to implement
 two-sided constraints with it.



Re: [R] bivariate vector numerical integration with infinite range

2010-09-21 Thread Hans W Borchers
baptiste auguie baptiste.auguie at googlemail.com writes:

 
 Dear list,
 
 I'm seeking some advice regarding a particular numerical integration I
 wish to perform.
 
 The integrand f takes two real arguments x and y and returns a vector
 of constant length N. The range of integration is [0, infty) for x and
 [a,b] (finite) for y. Since the integrand has values in R^N I did not
 find a built-in function to perform numerical quadrature, so I wrote
 my own after some inspiration from a post in R-help,

The function adaptIntegrate() in the 'cubature' package integrates
multi-valued functions over n-dimensional finite hypercubes, as do the
functions in 'R2Cuba'.
If the hypercube is (partly) infinite, a transformation such as x --> 1/x
per infinite axis (as in Numerical Recipes) has to be applied.

For two dimensions, another approach could be to apply the integrate()
function twice, because this 1-dimensional integration function can handle
infinite intervals.
Hint: First integrate over the finite dimension(s).

Example: Integrate sin(x)/exp(y) for 0 <= x <= pi, 0 <= y <= Inf (value: 2)

f1 <- function(y) 1/exp(y)
f2 <- function(x) sin(x) * integrate(f1, 0, Inf)$value
integrate(f2, 0, pi)
# 2 with absolute error < 2.2e-14

Note that the absolute error is not correct, as the first integration has
its own error term. You have to do your own error estimation.

Hans Werner

 
 [...]
 
 So it seems to work. I wonder though if I may have missed an easier
 (and more reliable) way to perform such integration using base
 functions or an add-on package that I may have overlooked.
 
 Best regards,
 
 baptiste
 




Re: [R] bivariate vector numerical integration with infinite range

2010-09-21 Thread Hans W Borchers
baptiste auguie baptiste.auguie at googlemail.com writes:

 
 Thanks, adaptIntegrate() seems perfectly suited, I'll just need to
 figure out a transformation rule for the infinite limit. The suggestion of
 x --> 1/x does not seem to work here because it also transforms 0 into
 -infinity. I think exp(pi*sinh(x)) could be a better choice,
 according to Numerical Recipes.

Yes, that's one way.
But you can also split the integral into two parts, one from 0 to 1 and one
from 1 to Inf. The first part is a finite hypercube, and the second can be
transformed with x --> 1/x into [0, 1].

I usually prefer the second approach for higher-dimensional applications,
as the Jacobian appears to be simpler. In the literature you will find
discussions on how far out the finite hypercube should reach to lower
the absolute error.
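
A minimal sketch of this splitting in two dimensions (my own choice of
integrand: f(x) = exp(-x[1]) * sin(x[2]) on [0, Inf) x [0, pi], exact value 2):

library(cubature)
f  <- function(x) exp(-x[1]) * sin(x[2])
## part 1: first coordinate in [0, 1]
I1 <- adaptIntegrate(f, c(0, 0), c(1, pi))$integral
## part 2: first coordinate in [1, Inf), transformed with x --> 1/t, dx = dt/t^2
g  <- function(x) if (x[1] == 0) 0 else f(c(1/x[1], x[2])) / x[1]^2
I2 <- adaptIntegrate(g, c(0, 0), c(1, pi))$integral
I1 + I2    # approximately 2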

Hans Werner

 Thanks,
 
 baptiste




Re: [R] Prime Factorization

2010-09-21 Thread Hans W Borchers
David Winsemius dwinsemius at comcast.net writes:

 
 On Sep 21, 2010, at 11:33 AM, Cliff Clive wrote:
 
 
  Hi everyone, I have a very quick question:
 
  Is there a ready-made function in R or any R packages to find the  
  prime
  factorization of an integer?
 
 Yes. At least two. The obvious search strategy with your favorite
 search tool should work well.


I don't know which prime factorization functions David means --- sometimes
these search tips are quite a mystery to me.

You didn't tell us an important parameter, namely the size of the integers you
want to factorize. For larger numbers there is just one such function that
can be taken into consideration, namely factorize() in the multiple-precision
package 'gmp'.

library(gmp)
factorize(2^32+1)
# [1] 641 6700417

And better do not use the 'schoolmath' package; it is of very low
quality and probably should be removed from CRAN.

Hans Werner



Re: [R] bivariate vector numerical integration with infinite range

2010-09-21 Thread Hans W Borchers
baptiste auguie baptiste.auguie at googlemail.com writes:

 
 Thanks. I am having trouble getting adaptIntegrate to work with a
 multivalued integrand though, and cannot find a working example.
 Anyone had better luck with it?

The function to be integrated needs a vector as input:

f <- function(x) {
    res <- 1 / (sqrt(x[1])*(1+x[1]))
    c(res, res/2, 2*res)
}

adaptIntegrate(f, lowerLimit=c(0.1, 0), upperLimit=c(10, 1), fDim = 3)
$integral
[1] 1.9164832 0.9582416 3.8329665

$error
[1] 1.265252e-05 6.326261e-06 2.530504e-05

$functionEvaluations
[1] 323

$returnCode
[1] 0

Hans Werner

 library(cubature)
 
  f <- function(x, y) {
 +   res <- 1 / (sqrt(x)*(1+x))
 +   c(res, res/2, 2*res)
 + }
 
  adaptIntegrate(f, lowerLimit=c(0.1, 0), upperLimit=c(10, 1), fDim = 3)
 [1] adaptIntegrate: Error in evaluation function f(x) for x=
res
 [1,] 0.07355275 0.03677638 0.1471055
 [2,] 0.94280904 0.47140452 1.8856181
 Error in adaptIntegrate(f, lowerLimit = c(0.1, 0), upperLimit = c(10,  :
  adaptIntegrate: Result f(x) is not numeric or has wrong dimension
 
 Best,
 
 baptiste




Re: [R] Prime Factorization

2010-09-21 Thread Hans W Borchers
David Winsemius dwinsemius at comcast.net writes:

 
 A further citation that answers the question I raised (and  
 inaccurately predicted no value) regarding prime.sieve :
 
 http://finzi.psych.upenn.edu/R/Rhelp02/archive/49773.html
 
 This was found with Baron's search facility set for rhelp postings:
 
 [...]

Just being curious:

And did you also find what is probably the most important and professional of
all factorization routines in R, based on the Pollard rho algorithm,
factorize() in package 'gmp'?

Hans Werner



Re: [R] optimized value worse than starting Value

2010-09-09 Thread Hans W Borchers
Barry Rowlingson b.rowlingson at lancaster.ac.uk writes:

 
 On Wed, Sep 8, 2010 at 1:35 PM, Michael Bernsteiner
 dethlef1 at hotmail.com wrote:
 
  Dear all,
 
  I'm optimizing a relatively simple function. Using optimize, the optimized
  parameter value is worse than the starting value. Why?


I would like to stress here that finding a global minimum is not as much
sorcery as this thread seems to suggest. A widely accepted procedure to 
provably identify a global minimum goes roughly as follows
(see Chapt. 4 in [1]):

  - Make sure the global minimum does not lie 'infinitely' far out.
  - Provide estimations for the derivatives/gradients.
  - Define a grid fine enough to capture or exclude minima.
  - Search grid cells coming into consideration and compare.

This can be applied to two- and higher-dimensional problems, but of course
may require an enormous effort. In science and engineering applications it is
at times necessary to really carry out this approach.
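
As a minimal 1-D illustration of the grid-plus-local-refinement idea (the
objective below is an arbitrary multimodal example of mine, not the function
from the original post):

f  <- function(x) sin(5*x) + 0.5*x^2                       # several local minima on [-4, 4]
xs <- seq(-4, 4, by = 0.01)                                 # grid fine enough for this f
x0 <- xs[which.min(f(xs))]                                  # best grid point
opt <- optimize(f, lower = x0 - 0.01, upper = x0 + 0.01)    # local refinement
opt$minimum; opt$objective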

Hans Werner

[1] F. Bornemann et al., The SIAM 100-Digit Challenge, 2004, pp. 79.

"In fact, a slightly finer grid search will succeed in locating the
 proper minimum; several teams used such a search together with estimates
 based on the partial derivatives of f to show that the search was fine
 enough to guarantee capture of the answer."


  This looks familiar. Is this some 1-d version of the Rosenbrock
 Banana Function?
 
  http://en.wikipedia.org/wiki/Rosenbrock_function
 
  It's designed to be hard to find the minimum. In the real world one
 would hope that things would not have such a pathological behaviour.
 
  Numerical optimisations are best done using as many methods as
 possible - see optimise, nlm, optim, nlminb and the whole shelf of
 library books devoted to it.
 
 Barry




Re: [R] calculating area between plot lines

2010-09-07 Thread Hans W Borchers
A. Marcia BARBOSA anamarciabarbosa at gmail.com writes:
 
 Hi everyone. I have these data:
 
 probClass <- seq(0,0.9,0.1)
 prob1 <- c(0.0070,0.0911,0.1973,0.2949,0.3936,0.5030,0.5985,0.6869,0.7820,0.8822)
 prob2 <- c(0.0066,0.0791,0.2358,0.3478,0.3714,0.3860,0.6667,0.6400,0.7000,1.0000)
 
 # which I'm plotting as follows:
 
 plot(probClass,prob1,xlim=c(0,1),ylim=c(0,1),xaxs='i',yaxs='i',type='n')
 lines(probClass,prob1)
 lines(probClass,prob2)
 polygon(c(probClass,rev(probClass)),c(prob2,rev(prob1)),col="red",border=NA)
 
 Given that the total area of the plot is 1, how can I calculate the
 area between the plotted lines (red polygon)? I have only found the
 areapl function in the splancs package, but it doesn't work for
 self-intersecting polygons..
 
 Any help will be gratefully received. Cheers,
 Márcia
 

Remember that the area between two curves is the same as the integral of the
absolute value of the difference between these two curves; thus it is
easy to compute the area directly:


f1 <- approxfun(probClass, prob1-prob2) # piecewise linear function
f2 <- function(x) abs(f1(x)) # take the positive value

integrate(f2, 0, 0.9)
0.03548023 with absolute error < 1.6e-06


where the true value is 0.03547927 (using the trapezoid/triangle area formula).
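
For reference, a small sketch of that exact computation (assuming both curves
are piecewise linear on the same grid 'probClass'):

exactArea <- function(x, d) {               # d = difference of the two curves
    a <- 0
    for (i in seq_along(d)[-1]) {
        h  <- x[i] - x[i-1]
        d1 <- d[i-1]; d2 <- d[i]
        if (d1 * d2 >= 0) {                 # no sign change: trapezoid
            a <- a + abs(d1 + d2) / 2 * h
        } else {                            # sign change: two triangles
            a <- a + (d1^2 + d2^2) / (2 * (abs(d1) + abs(d2))) * h
        }
    }
    a
}
exactArea(probClass, prob1 - prob2)         # 0.03547927, the exact value above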

Hans Werner



Re: [R] Regarding naive baysian classifier in R

2010-08-31 Thread Hans W Borchers
Vishnampettai akron_aadhithya at yahoo.com writes:
 Hi,
   I have a small doubt regarding naive Bayes. I am able to classify the
 data properly, but I am stuck on how to get the probability
 values for naive Bayes. In the case of SVM we have the attr function that
 helps in displaying the probability values. Is there any function similar to
 attr in naive Bayes that can be used for displaying the attribute values?

Some time ago I wrote my own naive Bayes classifier as I was unsatisfied with
the speed of other implementations for larger data sets. My predict function
returns an 'accuracy' with every input item based on the standard calculation.

If you like I can send it to you as I am also interested in more test cases,
though I did some tests comparing it with results from 'e1071::naiveBayes'.

Hans Werner

P. S.:  What are you doing with decision.values=TRUE, probability=TRUE?
I can't find those in the documentation for naiveBayes. I assume you don't
get an error because the '...' argument catches all these unused parameters.
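
For reference, a minimal sketch of how class probabilities can be obtained
from e1071's naiveBayes with its documented type = "raw" argument (iris is
only stand-in data here):

library(e1071)
model <- naiveBayes(Species ~ ., data = iris)
predict(model, iris[1:5, -5], type = "class")  # predicted classes
predict(model, iris[1:5, -5], type = "raw")    # class probabilities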

 my code is given below:
 library(e1071)
 train <- read.table(...,header=T);
 test <- read.table(...,header=T);
 length(test);
 cl <- c(c(rep("ALL",10), rep("AML",10)));
 cl <- factor(cl)
 model <- naiveBayes(t(train),cl);
 pred <- predict(model, t(test),decision.values=TRUE,probability=TRUE)
 attr(pred,"probabilities");
 table(pred,t(cl));



Re: [R] LSAP and the Hungarian algorithm [was: R-help]

2010-08-27 Thread Hans W Borchers
venkatesh bandaru venkatesh.bandaru at gmail.com writes:
 

If you had searched for it, you would have easily found the 'clue' package:

In package clue solve_LSAP() enables the user to solve the linear
sum assignment problem (LSAP) using an efficient C implementation
of the Hungarian algorithm.
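
A minimal usage sketch (the cost matrix is an arbitrary choice of mine):

library(clue)
set.seed(1)
cost <- matrix(runif(25), nrow = 5)                 # cost[i, j]: worker i on job j
sol  <- solve_LSAP(cost)                            # Hungarian algorithm, minimizes total cost
sol                                                 # job assigned to each worker
sum(cost[cbind(seq_along(sol), as.vector(sol))])    # total cost of the assignment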

Hans Werner

 Respected R Help Team Members,
 
 I am Venkatesh B., doing an M.Tech at the University of Hyderabad. I want to
 know whether there is any package that contains the Hungarian algorithm,
 which solves the linear assignment problem.
 Thanking you.




Re: [R] LSAP and the Hungarian algorithm [was: R-help]

2010-08-27 Thread Hans W Borchers
Ravi Varadhan rvaradhan at jhmi.edu writes:
 
 However, The clue package has the solve_LSAP() function (as pointed out by
 Hans Werner)  that solves the linear sum assignment problem, but it uses the
 Hungarian algorithm that was requested by you.
 
 The lp.assign() function in lpSolve package (as pointed out by Girish)
 also solves the linear sum assignment problem, but I do not know which
 algorithm is used.  This is not documented in the help page.  

lp.assign() solves the linear assignment problem as a standard linear program
with binary variables; see the "Assignment Problem" article on Wikipedia.
With 'lpSolve', this will be slower than the Hungarian algorithm in higher
dimensions.
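
For comparison, a minimal lp.assign() sketch (again with an arbitrary cost
matrix of mine):

library(lpSolve)
cost <- matrix(runif(25), nrow = 5)
res  <- lp.assign(cost)                      # LSAP as a binary linear program
res$objval                                   # minimal total cost
which(res$solution > 0.5, arr.ind = TRUE)    # selected worker-job pairs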

But remember our thread "A combinatorial optimization problem: finding the
best permutation of a complex vector" from Nov. 2009. Highly efficient software
for integer programming -- such as ILOG's commercial CPLEX program -- will be
much faster than the specialized Hungarian algorithm when the problem size
grows considerably (as was shown in Erwin Kalvelagen's blog).

Hans Werner

 Best,
 Ravi.




Re: [R] Puzzle

2010-08-26 Thread Hans W Borchers
Ben Holt BHolt at bio.ku.dk writes:
 I have data similar to this:
 
 Location  Surveyor  Result
 A         1         83
 A         2         76
 A         3         45
 B         1         71
 B         4         67
 C         2         23
 C         5         12
 D         3         34
 E         4         75
 F         4         46
 G         5         90
 etc. (5 million records in total)
 
 I need to divide the data to many subsets then randomly select 5 different
 locations and 5 different surveyors (one at each of the 5 randomly selected
 locations) for each subset.
 
 The function I have written basically picks five locations and then 1
 surveyor in each location, checks that there are five different surveyors
 and if there isn't tries again. The problem is that for some subsets this
 doesn't work.
 
 Some subsets don't have enough locations/surveyors or both, but this can be
 checked for easily. The problem subsets do have enoughs locations and
 surveyors but still cannot produce 5 locations each with a different
 surveyor. The matrix below demonstrates such a subset:
  
                locations
                A B C D E
   Surveyors 1  1 0 0 0 0
             2  1 0 0 0 0
             3  1 0 0 0 0
             4  1 0 0 0 0
             5  1 1 1 1 1
 
 I cannot think of a way to check for such a situation and therefore I have
 simply programmed the function to give up after 100 attempts if it can't
 find a solution. This is not very satisfactory, however, as the analysis takes
 a very long time to run, and it would also be very useful for me to
 know how many suitable solutions there are.

If I understand you correctly:  The task of finding a maximal number of
'independent' pairs (of surveyors and locations) is called the generalized
(or: maximum) linear assignment problem, see

http://en.wikipedia.org/wiki/Generalized_assignment_problem

and there appears to exist no efficient algorithm. One can think of
procedures to find at least 5 such pairs (or to report infeasibility), but
I think with millions of records in one of those subsets any approach (using
loops) will be intolerably slow.
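
For a single subset, such a check could be done with an assignment solver; a
small sketch of mine, assuming the 'clue' package and the 5 x 5 incidence
matrix shown above (rows: surveyors, columns: locations):

library(clue)
m <- rbind(c(1,0,0,0,0),
           c(1,0,0,0,0),
           c(1,0,0,0,0),
           c(1,0,0,0,0),
           c(1,1,1,1,1))
asg <- solve_LSAP(m, maximum = TRUE)         # maximize the number of matched pairs
sum(m[cbind(1:5, as.vector(asg))])           # 2 here; a value of 5 would mean feasible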

Perhaps you can rethink your original intention or try to provide additional
information during the process of generating the data.

Of course, I would be glad to hear more positive news as this kind of problem
does show up in many discrete optimization tasks.

Hans Werner

 I reckon some of you clever folk out there must be able to think of a better
 solution.
 
 Any advice appreciated,
 
 Ben




Re: [R] Problem to compute a function with very large numbers

2010-08-20 Thread Hans W Borchers
Nan Zhao nzhao at student.ethz.ch writes:

 
 Thank you Dennis for your explanations!
 
 The results you found are the same as mine: first an infinity result,
 followed by NaN. It seems that, when the number becomes too small, R must
 round it to 0. Hence I was wondering if there might be a way to increase the
 number of decimals for extremal computations. I have tried to use double
 variables, but this didn't have much success.
 
 There is no doubt about the function; it originates from a Kendall's plot
 used in measuring dependencies.
 
 Thank you in advance,
 
 Best Regards,
 
 Nan
 

See the thread "Problem with the precision of numbers" from January this year.
For example in

http://finzi.psych.upenn.edu/Rhelp10/2010-January/225640.html

you will also find a hint on how to compute binomial coefficients with 'Rmpfr'.

Integration will be a problem here; you may have to replace it with your own
integration method, such as an adaptive Simpson formula.
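
For illustration, a minimal sketch of such an adaptive Simpson rule in plain R
(my own illustration, not code from the cited thread):

simpson <- function(f, a, b) (b - a)/6 * (f(a) + 4*f((a + b)/2) + f(b))
adaptSimpson <- function(f, a, b, tol = 1e-10) {
    m  <- (a + b)/2
    s1 <- simpson(f, a, b)
    s2 <- simpson(f, a, m) + simpson(f, m, b)
    if (abs(s2 - s1) < 15 * tol) return(s2 + (s2 - s1)/15)
    adaptSimpson(f, a, m, tol/2) + adaptSimpson(f, m, b, tol/2)
}
adaptSimpson(sin, 0, pi)    # approximately 2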

Regards,  Hans Werner



Re: [R] Games

2010-08-13 Thread Hans W Borchers
Silvano silvano at uel.br writes:
 
 Hi,
 
 I want to build the table of a football league with 11
 teams. All play each other, so there will be 55 games.
 Since there is an odd number of teams, in each round one team
 will not play.

The easy solution is rotating the teams around a table, with one team pausing in each round.

# Playing schedule for an odd number of teams

n <- 5
noTeams <- 2*n+1
noGames <- n*noTeams
teams <- paste("T", 1:noTeams, sep="")

rounds <- numeric(noGames)
team1 <- team2 <- character(noGames)

for (i in 1:noTeams) {
    for (j in 1:n) {
        k <- n*(i-1)+j
        rounds[k] <- i
        team1[k] <- teams[j+1]
        team2[k] <- teams[noTeams-j+1]
    }
    teams <- c(teams[2:noTeams], teams[1])
}

schedule <- data.frame(rounds=rounds, team1=team1, team2=team2)

Hans Werner

 The games will be:
 
 games = urnsamples(1:11, x = 
 c('A','B','C','D','E','F','G','H','I','J','K'), size=2, 
 replace=F,
 ordered=FALSE)
 games
 
 So there will be five games per round. How can I build a table with
 all the championship rounds automatically?
 I thought about something like:
 
 game1 = c(
 sample(11,2)
 sample(11,2)
 sample(11,2)
 sample(11,2)
 sample(11,2)
 )
 
 but it doesn't work very well.
 
 Some suggestion?
 
 --
 Silvano Cesar da Costa
 Departamento de Estatística
 Universidade Estadual de Londrina
 Fone: 3371-4346
 




Re: [R] Why use numFold in evaluate_Weka_classifier of RWeka

2010-08-10 Thread Hans W Borchers
s0300851 s0300851 at tp.edu.tw writes:

 
 Hi everyone,
 
 I have a question about using the RWeka package.
 We know that make_Weka_classifier can help
 us build a model, and evaluate_Weka_classifier
 can help us evaluate the performance of the model on new data.
 But I have a question about how to use the parameter numFold in
 evaluate_Weka_classifier. Cross-validation means using some parts of the
 data for training and some parts for testing, but that belongs to
 the step of building the model, not to evaluation.
 My guess about numFold=n in evaluate_Weka_classifier is
 this:
 randomly (but in proportion) select data from the dataset, redo this n times,
 and then evaluate the performance. Is this correct?

No. It's preferable to learn about Weka right from the Weka manual.
About the number of folds ('numFold') it says:

A more elaborate method is cross-validation. Here, a number of
folds n is specified. The dataset is randomly reordered and then
split into n folds of equal size. In each iteration, one fold is
used for testing and the other n-1 folds are used for training the
classifier. The test results are collected and averaged over all
folds. This gives the cross-validation estimate of the accuracy.
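
For illustration, a minimal sketch of a 10-fold cross-validated evaluation
with RWeka (J48 and the built-in iris data are only stand-ins; RWeka needs a
working Java installation):

library(RWeka)
m <- J48(Species ~ ., data = iris)
evaluate_Weka_classifier(m, numFolds = 10)   # 10-fold CV estimate of accuracy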

 Thanks.
 Best regards ,
 
 Hsiao



Re: [R] Optimization problem with nonlinear constraint

2010-07-28 Thread Hans W Borchers
Uli Kleinwechter u.kleinwechter at uni-hohenheim.de writes:

 
 Dear Ravi,
 
 As I've already written to you, the problem indeed is to find a solution
 to the transcendental equation y = x * T^(x-1), given y and T; the
 optimization problem below is only a workaround.


I don't think optimization is the right approach for simply inverting
a simple function.

The inverse of the function  x -> x * e^x  is the Lambert W function.
Multiplying  y = x * T^(x-1)  by  T*log(T)  gives
y*T*log(T) = (x*log(T)) * e^(x*log(T)), so the solution in your case is:

x = W(log(T)*y*T) / log(T)

Now, how to compute Lambert's W? Well, look into the 'gsl' package
and, lo and behold, there is the function lambert_W0.

Your example:

y <- 3
T <- 123

library(gsl)
lambert_W0(log(T)*y*T)/log(T)
# [1] 1.191830


Regards,  Hans Werner


 
 John C. Nash has been so kind to help me on here. In case anyone faces a 
 similar problem in the future, the solution looks as follows:
 
 *
 
 func1 <- function(y,x,T){
   out <- x*T^(x-1)-y
   return(out)
 }
 
 # Assign the known values to y and T:
 y <- 3
 T <- 123
 
 root <- uniroot(func1,c(-10,100),y=y,T=T)
 print(root)
 
 
 
 Somewhat simpler than I thought
 
 Thanks again!
 
 Uli




Re: [R] Hurst Exponent Estimation

2010-07-22 Thread Hans W Borchers
Lorenzo Isella lorenzo.isella at gmail.com writes:
 
 Dear All,
 I am a novice when it comes to time-series analysis and at the moment I
 am actually interested in calculating the Hurst exponent of a time
 series. 


Some time ago I tested some of the classical chaotic time series
(such as the logistic map and others, no financial time series) with
available functions in R and Matlab. In my experience, Peng's method
(realized in R as fArma::pengFit) works reasonably reliably and is
more accurate than most others on these series.

Unfortunately, the available R and Matlab implementations of the same
method -- and referring back to the same literature article -- can give
quite different results, with varying success on both sides.

AFAIK, in TISEAN there is no function estimating the Hurst exponent.

Regards
Hans Werner


 This question has already been asked quite some time ago
 
 http://bit.ly/98dZsi
 
 and I trust some progress has been made ever since.
 I was able to find some functions in the packages 
 
 http://cran.r-project.org/web/packages/Rwave/index.html and 
 http://cran.r-project.org/web/packages/fArma/index.html
 
 Allegedly, there should be functions for this in the Rtisean package
 
 http://cran.r-project.org/web/packages/RTisean/index.html
 
 but I have not been able to find them.
 Bottom line: if you have a time series (list of empirical data of
 varying length and not necessarily sampled on a uniform time grid) what
 R tool would you use to estimate its Hurst exponent?
 Any suggestion is appreciated.
 Cheers
 
 Lorenzo
 




Re: [R] Hurst Exponent Estimation

2010-07-22 Thread Hans W Borchers
Spencer Graves spencer.graves at structuremonitoring.com writes:
 
    Have you tried something like the following:
 
    library(sos)
    H <- ???Hurst
    summary(H)
    H
 
 This identified 50 links in 15 packages, and displayed the
 results in a table in a web browser, with links to the best match in the
 best package first.  The installPackages and writeFindFn2xls
 functions are designed to produce an Excel file with a summary page that
 can help you identify the package that seems to be most actively
 maintained among the relevant packages, as explained in a vignette.

Hope this helps.
Spencer Graves


Dear Spencer,

sometimes I am a bit worried about answers that simply redirect
the requestor to searching the Internet or R sites. It fuels the
impression that experience is not so much asked for nowadays.

Anyway, I looked into the 'fractal' package that comes up in your
search. What I said before applies here as well: for none of the
test series was the result of computing H nearly as accurate as
what 'pengFit' returned.

I really would be interested to hear from others about experiences
with financial time series.

Best, Hans Werner





Re: [R] Double Integration

2010-07-03 Thread Hans W Borchers
Bogaso Christofer bogaso.christofer at gmail.com writes:
 
 Hi Ravi, your suggestion helped me a lot as well. If I look into
 that function, I see that it is calling another function:

 .Call("doCubature", as.integer(fDim), body(f.check),
       as.double(lowerLimit), as.double(upperLimit),
       as.integer(maxEval),
       as.double(absError), as.double(tol), new.env(),
       PACKAGE = "cubature")

 How can I see the interior of this doCubature?


Find the original code for the 'cubature' package at

http://ab-initio.mit.edu/wiki/index.php/Cubature

plus information on why the 'adapt' package had to be abandoned and that
'cubature' is based on the same original algorithm of Genz and Malik,
but uses free and GPLed software.

We should not bemoan the loss of the 'adapt' package; 'cubature' and
'R2Cuba' are worthy successors for adaptive quadrature.
Hans Werner



Re: [R] Double Integration

2010-07-01 Thread Hans W Borchers
Sarah Sanchez sarah_sanchez09 at yahoo.com writes:
 
 Dear R helpers
 
 I am working on the Bi-variate Normal distribution probabilities.
 I need to double integrate the following function
 (actually simplified form of bivariate normal distribution)
 
 f(x, y) = exp [ - 0.549451 * (x^2 + y^2 - 0.6 * x * y) ]
 
 where 2.696 < x < 3.54 and -1.51 < y < 1.98
 
 I need to solve something like
 
 INTEGRATE (2.696 to 3.54) dx INTEGRATE (-1.51 to 1.98) f(x, y) dy
 
 I have referred to stats::integrate but it deals with only one variable. 
 
 This example appears in Internal Credit Risk Model by Michael Ong
 (page no. 160).

There has been the same request last month, the answer is still valid:

library(cubature)
fun <- function(x) exp(-0.549451*(x[1]^2+x[2]^2-0.6*x[1]*x[2]))
adaptIntegrate(fun, c(-1.51,1.98), c(2.696,3.54), tol=1e-8)

# $integral
# [1] 0.1376102
# $error
# [1] 1.372980e-09
# ...

Hans Werner


 Kindly guide.
 
 Regards
 
 Sarah




Re: [R] double integral

2010-06-18 Thread Hans W Borchers
suman dhara suman.dhara89 at gmail.com writes:

 
 Sir,
 I want to calculate double integral in R. Is there any function to do this?


If your domain of integration is a hypercube, try packages 'cubature'
or 'R2Cuba'.
Otherwise, you will have to provide more information about your specific
problem and/or function.

Hans Werner


 Regards,
 Suman Dhara
 
   [[alternative HTML version deleted]]
 




Re: [R] logarithmic integrals in R?

2010-05-30 Thread Hans W. Borchers
Oliver Kullmann O.Kullmann at swansea.ac.uk writes:
 
 Thanks for the information.

What I meant were formulas like

\int 1/\log(t)^2 dt = -t/\log(t) + li(t)

\int 1/\log(t)^3 dt = 1/2 * ( -t/\log(t)^2 - t/\log(t) + li(t) )

and higher forms that can be expressed through the Gamma function.
I am certain I've seen them in Abramowitz and Stegun's handbook (where else?),
but I just cannot remember in which chapter or on which page.
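
A quick numerical check of the first formula on [2, 10], assuming the 'gsl'
package for Ei and li(x) = Ei(log(x)):

library(gsl)
li <- function(x) expint_Ei(log(x))
F  <- function(t) -t/log(t) + li(t)               # antiderivative of 1/log(t)^2
F(10) - F(2)                                      # 3.66288...
integrate(function(t) 1/log(t)^2, 2, 10)$value    # same value by quadrature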

Which logarithmic integrals do you really need, and on what range?

Regards,
Hans Werner

 On Sat, May 29, 2010 at 01:15:29PM +, Hans W. Borchers wrote:
  Oliver Kullmann O.Kullmann at swansea.ac.uk writes:
  
   
   Hello,
   
   I couldn't find information on whether the logarithmic integrals
   
   Li_m(x) = integral_0^x log(t)^(-m) dt
   
    for x >= 0 are available in R?
  
[...]
 
 I found gsl at http://cran.r-project.org/web/packages/gsl/index.html.
 
   and exponential integrals are part of the 'gsl' package, so
  
   library('gsl')
   x <- seq(2, 10, by=0.5)
   y <- expint_Ei(log(x))
   y
  
  See e.g. the Handbook of Mathematical Functions for how to reduce higher
  logarithmic integrals.
 
 However here I wasn't succesful: Going through the chapter
 
 http://www.math.ucla.edu/~cbm/aands/page_228.htm
 
 I didn't find any mentioning of the higher logarithmic integrals.
 
[...]
 
 Also a Google search on "higher logarithmic integrals", "logarithmic integrals"
 or "li_n(x)" doesn't reveal anything, so I would be thankful for a hint.
 
 Thanks again!
 
 Oliver




Re: [R] logarithmic integrals in R?

2010-05-29 Thread Hans W. Borchers
Oliver Kullmann O.Kullmann at swansea.ac.uk writes:

 
 Hello,
 
 I couldn't find information on whether the logarithmic integrals
 
 Li_m(x) = integral_0^x log(t)^(-m) dt
 
 for x >= 0 are available in R?

I saw your request only this weekend.
The first logarithmic integral can be computed using the exponential
integral Ei(x) per

li(x) = Ei(log(x))

and exponential integrals are part of the 'gsl' package, so

library('gsl')
x <- seq(2, 10, by=0.5)
y <- expint_Ei(log(x))
y

See e.g. the Handbook of Mathematical Functions for how to reduce higher
logarithmic integrals.
Another possibility is to use the Web API of 'keisan', the calculation
library of Casio.

Regards
Hans Werner

 Best wishes
 
 Oliver
 




Re: [R] best polynomial approximation

2010-05-18 Thread Hans W Borchers

I guess you may be looking for the Remez algorithm. AFAIK there is no
implementation in one of the R packages. You can find FORTRAN code in the
Collected Algorithms of the ACM (no. 604) which probably could be called
from R.

There appears to exist a discrete, equi-distant(?) version as function
'remez' in the signal package, if that is of any help to you. I have never
used it.

Regards,  Hans Werner

P.S.: The Chebyshev polynomials do not compute the best polynomial
approximation, but they provide a nice way to estimate the maximal distance
to this best approximating polynomial.
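
As a small sketch of that remark (my own example: a near-best approximation of
exp on [-1, 1] by interpolation at Chebyshev nodes; its maximal error is close
to the minimax error of the true best polynomial):

f <- exp; n <- 5
xk <- cos((2*(0:n) + 1) / (2*n + 2) * pi)            # Chebyshev nodes
cf <- solve(outer(xk, 0:n, "^"), f(xk))              # interpolation coefficients
p  <- function(x) drop(outer(x, 0:n, "^") %*% cf)    # evaluate the polynomial
xs <- seq(-1, 1, length.out = 1001)
max(abs(f(xs) - p(xs)))                              # maximal deviation, near-minimax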



Patrizio Frederic wrote:
 
 Dear R-users,
 I learned today that there exists an interesting topic in numerical
 analysis named best polynomial approximation (BSA). Given a function
 f, the BSA of degree k, say pk, is the polynomial such that
 
 pk = arginf_p sup(|f - p|)
 
 Although, given some regularity conditions on f, pk is unique, pk IS NOT
 calculated with least squares. A quick google tour shows a rich field of
 research and many algorithms proposed for computing such a task.
 
 I was wondering if some of you know about some R implementations
 (packages) for computing the BSA.
 
 Many thanks in advance,
 
 Patrizio
 
 as usual I apologize for my fragmented English
 
 -- 
 +-
 | Patrizio Frederic, PhD
 | Assistant Professor,
 | Department of Economics,
 | University of Modena and Reggio Emilia,
 | Via Berengario 51,
 | 41100 Modena, Italy
 |
 | tel:  +39 059 205 6727
 | fax:  +39 059 205 6947
 | mail: patrizio.frede...@unimore.it
 +-
 
 
 



Re: [R] Find a rectangle of maximal area

2010-03-22 Thread Hans W Borchers
Hans W Borchers hwborchers at googlemail.com writes:
 
 For an application in image processing -- using R for statistical purposes --
 I need to solve the following task:
 
 Given n (e.g. n = 100 or 200) points in the unit square, more or less
 randomly distributed. Find a rectangle of maximal area within the square
 that does not contain any of these points in its interior.
 
 If a, b are height and width of the rectangle, other constraints may have to
 be imposed such as  a, b <= 0.5  and/or  0.5 <= a/b <= 2.0 . The rectangle
 is allowed to touch the border of the square.

And yes, the sides of the rectangle shall be parallel to the sides of the
enclosing unit square (which could be a rectangle of some size, too).

 snip
 
 Thanks in advance for any suggestions,
 Hans Werner

Erwin Kalvelagen erwin.kalvelagen at gmail.com writes:

 I solved this with a simple minded MINLP formulation using BARON
 (a global solver). 
 This seems to produce solutions relatively quickly
 (somewhat slower for n=200). 
 Actually this solved easier than I expected. See:

Dear Erwin,

yes, it is possible to emulate an exhaustive search by applying binary
variables and utilizing an MI(N)LP solver. What did you need the 'non-
linearity' for? (I am asking as you did not disclose your model.)

The examples on your blog do not take into account that the ratio of longer
to shorter side length of the rectangle shall be smaller than 2. Would it be
difficult to add this restriction to your model?

Unfortunately, there is no free MINLP solver available. In the past I have called
a Python program to utilize solvers at NEOS. Probably it would be possible to
write a similar R function to do this.

Still I believe that a clever approach might be possible avoiding the need to
call a commercial solver. I am getting this hope from one of Jon Bentley's
articles in the series Programming Pearls.

Regards,
Hans Werner

P.S.: If you copy my request into your blog, wouldn't it be nice to add a
  pointer back to the R-help entry where this question has been asked?

 http://yetanothermathprogrammingconsultant.blogspot.com/2010/03/looks-difficult-to-me-2.html

 
 Erwin Kalvelagen
 Amsterdam Optimization Modeling Group
 erwin at amsterdamoptimization.com
 http://amsterdamoptimization.com




Re: [R] Find a rectangle of maximal area

2010-03-22 Thread Hans W Borchers
Barry Rowlingson b.rowlingson at lancaster.ac.uk writes:

 
 On Mon, Mar 22, 2010 at 4:28 PM, Hans W Borchers
 hwborchers at googlemail.com wrote:
 
  Still I believe that a clever approach might be possible avoiding the need 
  to
  call a commercial solver. I am getting this hope from one of Jon Bentley's
  articles in the series Programming Pearls.
 
 
 Is this the 'Largest Empty Rectangle' problem?
 
 http://en.wikipedia.org/wiki/Largest_empty_rectangle

Dear Barry,

thanks for this pointer. I never suspected this problem could have a name of its
own. Rethinking the many possible applications makes it clear: I should have
searched for it before.

I looked into some of the references from the late '80s and found two algorithms
that appear to be appropriate for implementation in R. The goal is to solve the
problem for n = 200 points in less than 10-15 secs.

Thanks again, Hans Werner


 I had a look at some of the references from Wikipedia, but they all
 follow a similar pattern, one I have noticed in many computer science
 journal articles:
 
  1. State a problem that looks tricky.
  2. Say We have an efficient algorithm for the problem stated in #1
  3. Proceed to derive, using much algebra and theory, the efficient algorithm.
  4. Stop.
 
 The idea of actually producing some dirty, filthy, actual code to
 implement their shiny algorithms never seems to cross their minds.
 
  I also found a similar question from 2008 asked on the R-sig-geo
 mailing list. That didn't get much help either!
 
 Sorry.
 
 Barry
 




[R] Find a rectangle of maximal area

2010-03-21 Thread Hans W Borchers
For an application in image processing -- using R for statistical purposes -- I
need to solve the following task:

Given n (e.g. n = 100 or 200) points in the unit square, more or less randomly
distributed. Find a rectangle of maximal area within the square that does not
contain any of these points in its interior.

If a, b are height and width of the rectangle, other constraints may have to be
imposed such as  a, b <= 0.5  and/or  0.5 <= a/b <= 2.0 . The rectangle is
allowed to touch the border of the square.

For each new image the points will be identified by the application, like all
stars of a certain brightness on an astronomical picture. So the task will have
to be performed several times.

I assume this problem is computationally hard. I would like to find a solution
that is reasonably fast for  n = 100..200  points. Exhaustive search along the
x, y coordinates of the points will not be fast enough.

I know this request is not about R syntax and does not contain 'repro-code'. But
perhaps a somewhat similar problem has been solved before.

Thanks in advance for any suggestions,
Hans Werner

P.S.: Example -- Is this rectangle of maximal area?

n <- 100; set.seed(832)
x <- runif(n); y <- runif(n)
plot(c(0,1), c(0,1), type="n", axes=FALSE, asp=1,
     xlab="", ylab="", main="Rectangle Problem", sub="")
rect(0,0,1,1, col="darkblue")
xl <- 100; yu <- 43; xr <- 40; yo <- 3
area <- (x[xr]-x[xl])*(y[yo]-y[yu])
rect(x[xl], y[yu], x[xr], y[yo],
     lty=2, col="darkblue", border="red")
text((x[xl]+x[xr])/2, (y[yu]+y[yo])/2,
     paste("area =", round(area, 4)), cex=0.75, col="red")
points(x, y, pch=20, col="white")




Re: [R] Sorting

2010-02-06 Thread Hans W Borchers
David Neu david at davidneu.com writes:

 
 Hi,
 
 I have a list of vectors (of varying lengths).  I'd like to sort this
 list by applying a function to each pair of vectors in the list and
 returning information to sorting routine that let's it know which one
 is larger.
 
 To solve problems like this in Common Lisp, the sort function accepts
 a function as an argument.  The arguments to this function are two
 elements of the list which is being sorted.  The writer of the
 function returns t (TRUE in R) when the first argument to the function
 is larger than the second and nil (FALSE in R) otherwise.
 
 I'm wondering if there is some way to accomplish this in R.

Would the following function do what you want?

sortList <- function(L, fun) L[order(sapply(L, fun))]

Here is my test and my understanding of your request:

L <- list()  # define a list of vectors of varying length
for (i in 1:10) { n <- sample(1:10, 1); L[[i]] <- runif(n) }

Ls <- sortList(L, mean)
sapply(Ls, mean)  # increasing mean values

Hans Werner

 Many thanks for any help!
 
 Cheers,
 David
 




Re: [R] Sorting

2010-02-06 Thread Hans W Borchers
David Neu david at davidneu.com writes:

 David Neu david at davidneu.com writes:

 Hi,

 I have a list of vectors (of varying lengths).  I'd like to sort this
 list by applying a function to each pair of vectors in the list and
 returning information to sorting routine that let's it know which one
 is larger.

 To solve problems like this in Common Lisp, the sort function accepts
 a function as an argument.  The arguments to this function are two
 elements of the list which is being sorted.  The writer of the
 function returns t (TRUE in R) when the first argument to the function
 is larger than the second and nil (FALSE in R) otherwise.

 I'm wondering if there is some way to accomplish this in R.

I don't know whether there is a way to do it with the 'base::sort' function
-- and I too would very much like to know, for an application of my own --
but you can always define your own sorting, for example a simple bubble sort:

bubbleSort.list <- function(L, comp) {
    stopifnot(is.list(L))
    n <- length(L)
    if (n <= 1) return(L)
    for (i in 1:n) {
        for (j in 2:n) {
            b <- L[[j]]
            if (comp(L[[j]], L[[j-1]])) {
                L[[j]] <- L[[j-1]]
                L[[j-1]] <- b
            }
        }
    }
    return(L)
}

If your comparing function, for example, compares first length and then mean:

comp <- function(L1, L2) {
    if (length(L1) < length(L2) ||
        (length(L1) == length(L2) && mean(L1) < mean(L2)))
        return(TRUE)
    else
        return(FALSE)
}

then the following test example will turn out to be correct:

L <- list()
for (i in 1:100) L[[i]] <- runif(sample(1:20, 1))

Ls <- bubbleSort.list(L, comp)
is.unsorted(sapply(Ls, length))  # incl. mean for equal length

If bubblesort is too slow, implement your own heapsort or quicksort.

 Many thanks for any help!

 Cheers,
 David




Re: [R] Solving an optimization problem: selecting an quot; optimalquot; subset

2010-02-02 Thread Hans W Borchers


Erwin Kalvelagen-2 wrote:
 
 Hans W Borchers hwborchers at googlemail.com writes:
 # Prepare inputs for MILP solver
 obj <- c(rep(0, n), 0, 1, 1, 0)
 typ <- c(rep("B", n), "B", "C", "C", "B")
 mat <- matrix(c(s, -z, -1, 1, 0,        # a = a_p + a_m
                 rep(0, n), 1, 0, 0, 0,  # constant term
                 rep(0, n), 0, 1, 0, -M, # a_p <= M * d0
                 rep(0, n), 0, 0, 1, M,  # a_m <= M * (1-d0)
                 rep(1, n), 0, 0, 0, 0), # subset size <= k
               nrow=5, byrow=TRUE)
 dir <- c("==", "==", "<=", "<=", "<=")
 rhs <- c(0, 1, 0, M, k)
 max <- FALSE
 
 You can drop the binary variable d0.
 The condition that one of a_p, a_m is zero holds
 automatically as we are minimizing a_p + a_m.
 
 
 Erwin Kalvelagen
 
 

Right; for me adding M and d0 is kind of a reflex, but that is only
necessary when  a_p + a_m  is used as an intermediate result.

One more remark: I saw some spurious behavior with both solvers (Rsymphony,
Rglpk) -- that is, slightly different results in different runs.  It could
relate to the tolerance with which these solvers compare and identify
solutions.  At the moment I don't know how to change the tolerance as a
parameter.

I guess this will not happen when using a more powerful solver such as
CPLEX.

Hans Werner




Re: [R] Solving an optimization problem: selecting an quot;optimalquot; subset

2010-02-01 Thread Hans W Borchers
Dimitri Shvorob dimitri.shvorob at gmail.com writes:

 Given a vector of numbers x, I wish to select an n-subset with sum closest
 to a fixed value s. Can anyone advise me how to approach this in R?
 
 I have considered the Rcplex package, which handles integer/binary
 linear/quadratic optimization problems, but have difficulty setting up
 the quadratic form for [sum(x) - s]^2.
 
 (Dynamic programming over [0, sum(x)]? A genetic algorithm? Can anyone
 contribute a binary GA optimization sample?)


Here is a solution completely done with R and the MILP solver Symphony
provided in the 'Rsymphony' package (but 'Rglpk' will work the same). It
finds subsets of size <= k in a set s of size n.

The absolute deviation from the target value z is simulated by two
additional continuous variables (a_p and a_m) and one binary variable d0.
To force positivity, the common Big-M trick is employed.

Hans Werner


library(Rsymphony)

n <- 100; k <- 20   # Find k-subset in set of size n
z <- 2.0            # whose elements sum up to z

set.seed(8232); s <- runif(n)   # reference example

M <- sum(s) + z     # for the Big-M trick

# Prepare inputs for MILP solver
obj <- c(rep(0, n), 0, 1, 1, 0)
typ <- c(rep("B", n), "B", "C", "C", "B")
mat <- matrix(c(s, -z, -1, 1, 0,        # a = a_p + a_m
                rep(0, n), 1, 0, 0, 0,  # constant term
                rep(0, n), 0, 1, 0, -M, # a_p <= M * d0
                rep(0, n), 0, 0, 1, M,  # a_m <= M * (1-d0)
                rep(1, n), 0, 0, 0, 0), # subset size <= k
              nrow=5, byrow=TRUE)
dir <- c("==", "==", "<=", "<=", "<=")
rhs <- c(0, 1, 0, M, k)
max <- FALSE

# Apply Symphony as our MILP solver
sol <- Rsymphony_solve_LP(obj, mat, dir, rhs, types=typ, max=max)


x <- sol$solution[1:n]      # Solution vector describing the k-subset
( i <- (1:n)[x == 1] )      #=> [1] 18 25 44 81 95
print(sum(s[i]), digits=10) #=> [1] 2.00205




Re: [R] Solving an optimization problem: selecting an optimal subset

2010-01-31 Thread Hans W Borchers


Dimitri Shvorob wrote:
 
 Same request to Hans:
 I am afraid I need a little more spoon-feeding following 
 
 I sent a GAMS script modeling this problem to the NEOS solvers
 
 Thanks a lot!
 

If you have access to CPLEX (I mean the commercial program, not Rcplex,
which is just an interface to it), then I would suggest following Erwin's
proposal.
My GAMS file looks similar to the one Erwin posted. For instructions on how to
send it to the NEOS solvers, see http://neos.mcs.anl.gov/neos/solvers/.

Replacing an absolute value by two additional variables is an old trick in
optimization modeling. Here it will transform a subset sum problem into an
MILP task, solvable in R with lpSolve, Rglpk or Rsymphony.
But with 100 binary variables it may be too much for these packages.
I will give it a try today.

Hans Werner




Re: [R] Solving an optimization problem: selecting an quot;optimalquot; subset

2010-01-30 Thread Hans W Borchers
Dimitri Shvorob dimitri.shvorob at gmail.com writes:

 
  Is it a subset of a vector containing 100 elements, or 1ths? 
 
 I need to pick 2-40 elements out of a 50-200-element-long vector.
 
  A random number of elements that should be chosen, or the best 10 values
  which sums up to a defined value? 
 
 The best 10 values. 
 
 I still think that Rcplex is the way to go; what's missing is some
 linear-algebra expertise on my part to set up the problem as quadratic. 
 

This is a subset sum problem and has been discussed here in December for
integers. For floats it is even more difficult as there is no reliable
stopping criterion.

Can you settle for an approximate solution? If you insist on the true
minimum, you will have to live with the combinatorial explosion searching
through all (appropriate) subsets.

Rcplex: This is a combinatorial problem and cannot be formulated as a
quadratic optimization problem.
It is NP-hard and cannot be solved via Dynamic Programming.
As a combinatorial problem, it also will not yield to GA approaches, at
least not to find the minimum reliably.

Hans Werner

P.S.:  Example (this solution is maybe the best possible)

set.seed(8232)
x <- runif(100)

# Find a subset that sums up close to 2.0 !
i <- c(84,54,11,53,88,12,26,45,25,62,96,23,78,77,66,1)
sum(x[i])
#=> 2.000451



Re: [R] Solving an optimization problem: selecting an quot;optimalquot; subset

2010-01-30 Thread Hans W Borchers
Dimitri Shvorob dimitri.shvorob at gmail.com writes:


  This is a subset sum problem and has been discussed here in December 
 
 Thanks a lot! Will investigate.
 
  Can you settle for an approximate solution? 
 
 Absolutely.

You can use the script from the thread "subset sum problem" to find
approximate solutions. The trick is to multiply the numbers in the set
by some power of 10, round, and then solve it as an integer problem. I did
that for the example and found a solution 2.02 for target 2.0 with
a subset of length 15.

  Rcplex: This is a combinatorial problem and cannot be formulated as a
  quadratic optimization problem. 
 
 If the objective function can fit the pattern, we need to find the set of n
 coefficients, taking values 0 or 1, summing to m, for the m-out-of-n
 problem. 'Binary' version of Rcplex apparently would be able to handle that.

Right, you can force CPLEX to perform the exhaustive search itself; I would
not call this quadratic programming, but it might be quite fast.

Taking my example set.seed(8232); x <- runif(100) with target value 2.0 and
subsets of fixed length 10, I sent a GAMS script modeling this problem to the
NEOS solvers and it returned the solution

i <- c(7,10,25,32,45,76,77,78,81,96)
sum(x[i])
# [1] 2.06

within 0.004 seconds of execution time for the BARON MINLP solver.

  It is NP-hard and cannot be solved via Dynamic Programming. 
 
 Why not? Discretize the [0, sum(x)] range and solve an m-step DP problem.
 The value function would minimize the distance from s, and penalize
  too-short (m* < m) subsets.

No, this will not work.

Hans Werner

 Thanks again! 
 




Re: [R] problem with the precision of numbers

2010-01-25 Thread Hans W Borchers
kayj kjaja27 at yahoo.com writes:
 
 
 Hi All,
 
 Thank you all for your help. I have tried Bill's script and it works! So I
 am trying to think what the problem was, and it looks like it is the
 precision. You defined a function of the precision and evaluated it at
 precision=500. Bill, I was wondering how you chose 500; is it
 arbitrary?
 

Your question was "I was wondering if it is possible to increase the precision
using R", but you never told us which accuracy you really need. As a rule of
thumb, the number of correct decimal digits is about 0.3 * precision, so for 50
digits you could set the precision to 256 to be on the safe side.

And if you are wondering how to compute the binomial coefficients with 'Rmpfr',
here's one possibility to do that:

require(Rmpfr)

mpfrChoose <- function(a, b, prec=128) {
    m1 <- mpfr(1, prec=prec)
    # r <- gamma(m1+a) / (gamma(m1+b) * gamma(m1+a-b))
    # if (is.whole(r)) return(r) else return(round(r))
    gamma(m1+a) / (gamma(m1+b) * gamma(m1+a-b))
}

An advantage of using 'Rmpfr' is that the power of R can be applied, for
example vectorization, so avoid 'for' loops if possible:

pr  <- 256
m_1 <- mpfr(-1, pr)
m1  <- mpfr(1, pr)

i <- mpfr(0:80, pr)
s <- sum((m_1^i) * mpfrChoose(80, i, prec=pr) * (m1-(i+1)*1/200)^200)

print(s, digits=50)
# 1 'mpfr' number of precision  256   bits
# [1] 6.6568524789662037699673275771182135789699510194000e-20

Hans Werner

 
 thanks again for your help


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem with the precision of numbers

2010-01-19 Thread Hans W Borchers
 Ted.Harding at manchester.ac.uk writes:

 [...]
 
 I suspect this is an invented computation -- the 3456 strikes
 me as unlikely (it reminds me of my habitual illustrative use
 of set.seed(54321)).
 
 There is a definite problem with the development given by kayj.
 When k=2000 and i=k, the formula requires evaluation of
 
   3456*(2^3000)
 
 on a log10 scale this is
 
   log10(3456) + 3000*log10(2) = 906.6286
 
 Since R gives up at 10^308.25471 = 1.79767e+308
 (10^308.25472 = Inf), this algorithm is going to be tricky to
 evaluate!

Just to finish this off: 'Rmpfr' works fine with this example case and 
computes the total sum in less than 2 seconds:

library(Rmpfr)

j <- mpfr(-1, 120)
i <- mpfr(0:2000, 120)
s <- sum(j^i * 3456 * (1+i*1/2000)^3000)
s
# 1 'mpfr' number of precision  120   bits 
# [1] 2.8876826480125324342665158348085465188e906

which is the same result that Maple returns. But Maple (I admit, mine is
quite old) and 'bc' (couldn't wait for the answer) are much slower at it.

It is said that 'gmp' (on which Rmpfr is based) is faster than any other
multi-precision library on earth --- seems to be true.

Hans Werner


 I don't know how well Rmpfr copes with very large numbers (the
 available documentation seems cryptic). However, I can go along
 with the recommendation in the URL the Ben gives, to use 'bc'
 (Berkeley Calculator), available on unix[oid] systems since
 a long time ago. That is an old friend of mine, and works well
 (it can cope with exponents up to X^2147483647 in the version
 I have). It can eat for breakfast the task of checking whether
 Kate Bush can accurately sing pi to 117 significant figures:
 
   http://www.absolutelyrics.com/lyrics/view/kate_bush/pi
 
 (Try it in R).
 
 Ted.
 
 
 E-Mail: (Ted Harding) Ted.Harding at manchester.ac.uk
 Fax-to-email: +44 (0)870 094 0861
 Date: 19-Jan-10   Time: 18:41:27
 -- XFMail --
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] optimization problem

2010-01-17 Thread Hans W. Borchers
Ravi Varadhan rvaradhan at jhmi.edu writes:

 
 Interesting! 
 
 Now, if I change the cost matrix, D,  in the LSAP formulation slightly
 such that it is quadratic, it finds the best solution to your example:

Dear Ravi,

I thought your solution is ingenious, but after the discussion with 
Erwin Kalvelagen I found two things quite irritating:

(1) Why is solve_LSAP(D) different from solve_LSAP(D^2) in Erwin's
example? I believed just squaring the distance matrix would make
no difference to solving the LSAP - except for some numerical
instability which does not seem to be the case here.

(2) If you change rows to columns in the definition of D, that is

D[j, i] <- sqrt(sum((B[, j] - A[, i])^2))

then the solution to Erwin's example comes out right even with
keeping the square root.

Do you have explanations for these 'phenomena'? Otherwise, I think,
there will remain some doubts about this approach.

Very best
Hans Werner


 pMatrix.min <- function(A, B) {
 # finds the permutation P of A such that ||PA - B|| is minimum
 # in Frobenius norm
 # Uses the linear-sum assignment problem (LSAP) solver
 # in the clue package
 # Returns P%*%A and the permutation vector `pvec' such that
 # A[pvec, ] is the permutation of A closest to B
   n <- nrow(A)
   D <- matrix(NA, n, n)
   for (i in 1:n) {
   for (j in 1:n) {
 # D[j, i] <- sqrt(sum((B[j, ] - A[i, ])^2))
   D[j, i] <- (sum((B[j, ] - A[i, ])^2))  # this is better
   } }
 vec <- c(solve_LSAP(D))
 list(A=A[vec,], pvec=vec)
 }
 
   X <- pMatrix.min(A,B)
  X$pvec
 [1] 6 1 3 2 4 5
  dist(X$A, B)
 [1] 10.50172
 
 
 This should be fine.  Any counter-examples to this?!
 
 Best,
 Ravi.
 
 
 Ravi Varadhan, Ph.D.
 Assistant Professor,
 Division of Geriatric Medicine and Gerontology
 School of Medicine
 Johns Hopkins University
 
 Ph. (410) 502-2619
 email: rvaradhan at jhmi.edu


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] optimization problem

2010-01-17 Thread Hans W. Borchers
Ravi Varadhan rvaradhan at jhmi.edu writes:

 
 Dear Hans,
 
 I agree with your comments.  My intuition was that the quadratic
 form would be better behaved  than the radical form (less
 nonlinear!?).  So, I was hoping to see a change in behavior when
 the cost function was altered from a radical (i.e. sqrt) form to
 quadratic, but I was still surprised to actually see it.  I don't
 have a good explanation for this.  
 
 I came up with the idea of solving Klaus' problem as an LSAP
 problem.  I did not know that the LSAP approach to this and
 similar problems has already been considered by Nick Higham.
 I asked Nick last night about this problem thinking that he might
 know of more direct solutions to this problem (e.g. some kind of
 SVD or related factorizations). He said that I should take a look
 at the PhD thesis of one of his students.


Thanks for pointing out the thesis. As I understand, the (one-sided)
Procrustes problem is finding an orthogonal matrix minimizing

min! || A R - B || , t(R) R = I , ||.|| the Frobenius norm

and that there is a substantial theory behind it in numerical linear
algebra. The basic literature appears to be

Gower, J.C; Dijksterhuis, G.B., Procrustes Problems, Oxford
Statistical Science Series, Oxford University Press, 2004

The thesis itself will lead us astray as it treats the two-sided
Procrustes Problem, which is not our main concern.
The request on R-help only asked for permutation matrices P (from
left or right) minimizing

min! || P A - B ||

so I still think that a direct approach as you have suggested is
possible given this apparently much simpler problem.

Take your definition of D with quadratic terms:
The LSAP approach will find a permutation of the rows of B, say Bp,
such that the (linear!) sum over D_{i p(i)} is minimal, that is

sum over rows i { sum_j (Bp - A)_ij^2 }  =  sum over all entries { (Bp - A)^2 }

which is exactly the square of the Frobenius norm.
If you instead apply your first definition with square roots, it is

sum over rows i { sqrt( sum_j (Bp - A)_ij^2 ) }

and this cannot be expanded to the square of the Frobenius norm.
Therefore, the quadratic approach is indeed correct and will lead
to a true optimum, while the first variant will not.
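
To make this concrete, a small sketch (it assumes the 'clue' package; the
data are invented):

library(clue)

set.seed(1)
B <- matrix(rnorm(24), 6, 4)
A <- B[sample(6), ] + 0.01 * matrix(rnorm(24), 6, 4)  # scrambled, noisy copy of B

D <- matrix(NA, 6, 6)
for (j in 1:6)
  for (i in 1:6)
    D[j, i] <- sum((B[j, ] - A[i, ])^2)               # quadratic costs, no sqrt

pvec <- c(solve_LSAP(D))     # row j of B is matched by row pvec[j] of A
norm(A[pvec, ] - B, "F")     # small: the permuted A is close to B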

Hans Werner


 Take a look at Section 3.5 of the PhD thesis
 
 Parallel Solution of SVD-Related Problems, With Applications
 http://www.maths.manchester.ac.uk/~higham/misc/past-students.php
 
 This thesis proposed algorithms for the current problem and
 different versions of it under the heading of Procrustes-type
 problems.  I have to read this and get a better handle on it.  
 I would not be able to get to this for another two weeks. If you
 have any insights from this, in the meanwhile, do share with us.
 
 Best regards,
 Ravi.
 
 
 
 Ravi Varadhan, Ph.D.
 Assistant Professor,
 Division of Geriatric Medicine and Gerontology
 School of Medicine
 Johns Hopkins University
 
 Ph. (410) 502-2619
 email: rvaradhan at jhmi.edu


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to count the total number of (INCLUDING overlapping) occurrences of a substring within a string?

2009-12-20 Thread Hans W Borchers
Gabor Grothendieck ggrothendieck at gmail.com writes:
 
 Use a zero lookaround expression.  It will not consume its match.  See ?regexp
 
  gregexpr("a(?=a)", "aaa", perl = TRUE)
 [[1]]
 [1] 1 2
 attr(,"match.length")
 [1] 1 1

I wonder how you would count the number of occurrences of, for example,
'aba' or 'a.a' (*) in the string "ababacababab" using simple lookahead?

In Perl, there is a modifier '/g' to do that, and in Python one could
apply the function 'findall'.

When I had this task, I wrote a small function findall(), see below, but
I would be glad to see a solution with lookahead only.

Regards
Hans Werner

(*) or anything more complex


findall <- function(apat, atxt) {
  stopifnot(length(apat) == 1, length(atxt) == 1)
  pos <- c()  # positions of matches
  i <- 1; n <- nchar(atxt)
  found <- regexpr(apat, substr(atxt, i, n), perl=TRUE)
  while (found > 0) {
    pos <- c(pos, i + found - 1)
    i <- i + found
    found <- regexpr(apat, substr(atxt, i, n), perl=TRUE)
  }
  return(pos)
}


 On Sun, Dec 20, 2009 at 1:43 AM, Jonathan jonsleepy at gmail.com wrote:
  Last one for you guys:
 
  The command:
 
  length(gregexpr('cus','hocus pocus')[[1]])
  [1] 2
 
  returns the number of times the substring 'cus' appears in 'hocus pocus'
  (which is two)
 
  It's returning the number of **disjoint** matches.  So:
 
  length(gregexpr('aa','aaa')[[1]])
   [1] 1
 
  returns 1.
 
  **What I want to do:**
  I'm looking for a way to count all occurrences of the substring, including
  overlapping sets (so 'aa' would be found in 'aaa' two times, because the
  middle 'a' gets counted twice).
 
  Any ideas would be much appreciated!!
 
  Signing off and thanks for all the great assistance,
  Jonathan
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to count the total number of (INCLUDING overlapping) occurrences of a substring within a string?

2009-12-20 Thread Hans W Borchers
Gabor Grothendieck ggrothendieck at gmail.com writes:
 
 Try this:
 
  findall("aba", "ababacababab")
 [1] 1 3 7 9
  gregexpr("a(?=ba)", "ababacababab", perl = TRUE)
 [[1]]
 [1] 1 3 7 9
 attr(,"match.length")
 [1] 1 1 1 1
 
  findall("a.a", "ababacababab")
 [1] 1 3 5 7 9
  gregexpr("a(?=.a)", "ababacababab", perl = TRUE)
 [[1]]
 [1] 1 3 5 7 9
 attr(,"match.length")
 [1] 1 1 1 1 1


Thanks --- somehow I did not realize that the expression in  (?=...)
can also be regular.

My original problem was to find all three character matches where the
first and the last one are the same.  With  findall()  it works like:

findall("(.).\\1", "ababacababab")
# [1]  1  2  3  5  7  8  9 10

I am still not able to reproduce this with lookahead. Attempts with

gregexpr("(.)?=.\\1", "ababacababab", perl = TRUE)

do not work as the lookahead expression apparently does not know about
the captured group from before.

Regards
Hans Werner

Correction: I meant the '\G' metacharacter in Perl, not a modifier.


 On Sun, Dec 20, 2009 at 7:22 AM, Hans W Borchers
 hwborchers at googlemail.com wrote:
  Gabor Grothendieck ggrothendieck at gmail.com writes:
 
  [Sorry; Gmane forces me to delete more quoted text.]
 
  
     findall <- function(apat, atxt) {
       stopifnot(length(apat) == 1, length(atxt) == 1)
       pos <- c()  # positions of matches
       i <- 1; n <- nchar(atxt)
       found <- regexpr(apat, substr(atxt, i, n), perl=TRUE)
       while (found > 0) {
         pos <- c(pos, i + found - 1)
         i <- i + found
         found <- regexpr(apat, substr(atxt, i, n), perl=TRUE)
       }
       return(pos)
     }
  
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subset sum problem.

2009-12-09 Thread Hans W Borchers
Geert Janssens janssens-geert at telenet.be writes:

 
 Hi,
 
 I'm quite new to the R-project. I was suggested to look into it because I am 
 trying to solve the Subset sum problem, which basically is:
 Given a set of integers and an integer s, does any non-empty subset sum to s?
 (See http://en.wikipedia.org/wiki/Subset_sum_problem)
 
 I have been searching the web for quite some time now (which is how I 
 eventually discovered that my problem is called subset sum), but I can't seem 
 to find an easily applicable implementation. I did search the list archive, 
 the R website and used the help.search and apropos function. I'm afraid 
 nothing obvious showed up for me.
 
 Has anybody tackled this issue before in R ? If so, I would be very grateful 
 if you could share your solution with me.


Is it really true that you only want to see a Yes or No answer to this
question whether a subset sums up to s --- without learning which numbers
this subset is composed of (the pure SUBSET SUM problem)?
Then the following procedure does that in a reasonable amount of time
(returning 'TRUE' or 'FALSE' instead of Y-or-N):

# Exact algorithm for the SUBSET SUM problem
exactSubsetSum <- function(S, t) {
  S <- S[S <= t]
  if (sum(S) < t) return(FALSE)
  S <- sort(S, decreasing=TRUE)
  n <- length(S)
  L <- c(0)
  for (i in 1:n) {
    L <- unique(sort(c(L, L + S[i])))
    L <- L[L <= t]
    if (max(L) == t) return(TRUE)
  }
  return(FALSE)
}

# Example with a set of cardinality 64
amount - 4748652
products <-
c(30500,30500,30500,30500,42000,42000,42000,42000,
  42000,42000,42000,42000,42000,42000,71040,90900,
  76950,35100,71190,53730,456000,70740,70740,533600,
  83800,59500,27465,28000,28000,28000,28000,28000,
  26140,49600,77000,123289,27000,27000,27000,27000,
  27000,27000,8,33000,33000,55000,77382,48048,
  51186,4,35000,21716,63051,15025,15025,15025,
  15025,80,111,59700,25908,829350,1198000,1031655)

# Timing is not that bad
system.time( sol <- exactSubsetSum(products, amount) )
#  user  system elapsed 
# 0.516   0.096   0.673 
sol
# [1] TRUE

 Thank you very much.
 
 Geert


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subset sum problem.

2009-12-09 Thread Hans W Borchers
Geert Janssens janssens-geert at telenet.be writes:
 
 On Wednesday 9 December 2009, Hans W Borchers wrote:
  Geert Janssens janssens-geert at telenet.be writes:
   [ ... ]
   Has anybody tackled this issue before in R ? If so, I would be very
   grateful if you could share your solution with me.
 
  Is it really true that you only want to see a Yes or No answer to this
  question whether a subset sums up to s --- without learning which numbers
  this subset is composed of (the pure SUBSET SUM problem)?
  Then the following procedure does that in a reasonable amount of time
  (returning 'TRUE' or 'FALSE' instead of Y-or-N):
 
 Unfortunately no. I do need the numbers in the subset. But thank you for 
 presenting this code.
 
 Geert
 

Okay then, here we go. But don't tell later that your requirement was to
generate _all_ subsets that add up to a certain amount.  I will generate
only one (with largest elements).

For simplicity I assume that the set is prepared s.t. it is decreasingly
ordered, has no elements larger than the amount given, and has a total sum
larger than this amount.

# Assume S decreasing, no elements > t, total sum >= t
solveSubsetSum <- function(S, t) {
  L <- c(0)
  inds <- NULL
  for (i in 1:length(S)) {
    # L <- unique(sort(c(L, L + S[i])))
    L <- c(L, L+S[i])
    L <- L[L <= t]
    if (max(L) == t) {
      inds <- c(i)
      t <- t - S[i]
      while (t > 0) {
        K <- c(0)
        for (j in 1:(i-1)) {
          K <- c(K, K+S[j])
          if (t %in% K) break
        }
        inds <- c(inds, j)
        t <- t - S[j]
      }
      break
    }
  }
  return(inds)
}

# former example
amount - 4748652
products <-
c(30500,30500,30500,30500,42000,42000,42000,42000,
  42000,42000,42000,42000,42000,42000,71040,90900,
  76950,35100,71190,53730,456000,70740,70740,533600,
  83800,59500,27465,28000,28000,28000,28000,28000,
  26140,49600,77000,123289,27000,27000,27000,27000,
  27000,27000,8,33000,33000,55000,77382,48048,
  51186,4,35000,21716,63051,15025,15025,15025,
  15025,80,111,59700,25908,829350,1198000,1031655)

# prepare set
prods <- products[products <= amount]  # no elements > amount
prods <- sort(prods, decreasing=TRUE)  # decreasing order

# now find one solution
system.time(is <- solveSubsetSum(prods, amount))
#  user  system elapsed 
# 0.320   0.032   0.359 

prods[is]
#  [1]   70740   70740   71190   76950   77382   8   83800
#  [8]   90900  456000  533600  829350 111 1198000

sum(prods[is]) == amount
# [1] TRUE

Note that running times and memory needs will be much higher when more
summands are needed.  To mention that too: I have not tested the code
extensively.

Regards
Hans Werner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Modula Generators

2009-12-08 Thread Hans W Borchers
Sam K upperhalfplane at yahoo.co.uk writes:
 
 Hi all,
 
 Is there function on R for calculating Modula generators? For example for 
 primes above 100, e.g 157, i want
 to know which number generates the group under multiplication mod 157. i.e  i 
 want to find an element whose
 order is 156. The problem I occur is that modular arithmetic becomes 
 inaccurate when dealing with large numbers.

In other words, you are looking for a 'primitive root' in the field F_p
for a prime p. By the way, there are phi(p-1) of them, where phi is Euler's
totient function.

There is no known efficient algorithm for this, unless the prime
factorization of p-1 is given. And even that algorithm is probabilistic,
i.e., in some cases it may not terminate.

In case your primes do not get too big, say between 100 and 1000, an
exhaustive search may be possible, somewhat simplified by the Euler
criterion.
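
For illustration, a rough sketch of such an exhaustive search in base R
(function names invented; plain double arithmetic, so only for moderately
small primes):

primitive.root <- function(p) {
  powmod <- function(a, k, m) {        # modular exponentiation by squaring
    r <- 1; a <- a %% m
    while (k > 0) {
      if (k %% 2 == 1) r <- (r * a) %% m
      a <- (a * a) %% m
      k <- k %/% 2
    }
    r
  }
  n <- p - 1; facs <- c(); d <- 2      # prime factors of p-1 by trial division
  while (d * d <= n) {
    while (n %% d == 0) { facs <- c(facs, d); n <- n %/% d }
    d <- d + 1
  }
  if (n > 1) facs <- c(facs, n)
  facs <- unique(facs)
  for (g in 2:(p - 1)) {               # g generates F_p* iff no g^((p-1)/q) is 1
    if (all(sapply(facs, function(q) powmod(g, (p - 1) %/% q, p)) != 1))
      return(g)
  }
  NA
}

primitive.root(157)   # smallest generator of the multiplicative group mod 157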

Or you locate and read in some tables listing primitive roots. You can
find more information in Wikipedia or MathWorld.

For exact modular arithmetic you may turn to the 'gmp' R package.

Regards
Hans Werner

 Thanks for any help given.
 
 Sam


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Solve linear program without objective function

2009-12-04 Thread Hans W Borchers
Andreas Wittmann andreas_wittmann at gmx.de writes:

 
 Dear R-users,
 
 i try to solve to following linear programm in R
 
 0 * x_1 + 2/3 * x_2 + 1/3 * x_3 + 1/3 * x_4 = 0.3
 x_1 + x_2 + x_3 + x_4 = 1
 x_1, x_2, x_3, x_4 > 0,
 x_1, x_2, x_3, x_4 < 1
 
 as you can see i have no objective function here besides that i use the 
 following code.
 
 library(lpSolve)
 
 f.obj <- c(1,1,1,1)
 f.con <- matrix(c(0,2/3,1/3,1/3,
 1,1,1,1,
 1,0,0,0,
 0,1,0,0,
 0,0,1,0,
 0,0,0,1),nrow=6,byrow=TRUE)
 f.dir <- c("=", "=", ">", ">", ">", ">")
 f.rhs <- c(0.3, 1, 0, 0, 0, 0)
 
 lp ("max", f.obj, f.con, f.dir, f.rhs)$solution
 
 the problem is, the condition x_1, x_2, x_3, x_4 > 0 is not fulfilled.

With strict inequalities x_i > 0 your problem will not have a solution. That is
why in linear programming strict inequalities are replaced with non-strict
inequalities, or as the lp_solve manual says:

"The inequalities can be <=, >= or =.
 Because all numbers are real values,
 <= is the same as < and >= is the same as >."

Try x_i >= eps > 0 with some eps appropriate for your problem.
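
For illustration, a small sketch of this workaround with the 'lpSolve'
package (the choice eps = 1e-6 is arbitrary):

library(lpSolve)

eps   <- 1e-6
f.obj <- c(1, 1, 1, 1)                 # dummy objective; any feasible point will do
f.con <- rbind(c(0, 2/3, 1/3, 1/3),    # weighted sum  = 0.3
               c(1, 1, 1, 1),          # total sum     = 1
               diag(4),                # x_i >= eps      (instead of x_i > 0)
               diag(4))                # x_i <= 1 - eps  (instead of x_i < 1)
f.dir <- c("=", "=", rep(">=", 4), rep("<=", 4))
f.rhs <- c(0.3, 1, rep(eps, 4), rep(1 - eps, 4))

lp("max", f.obj, f.con, f.dir, f.rhs)$solution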

Hans Werner

 Any advice would be very helpful.
 
 best regards
 
 Andreas
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Modular inverses

2009-12-01 Thread Hans W Borchers
SJ Robson-Davis sr6827 at bristol.ac.uk writes:

 
 I want to find the inverse of an integer k mod p (prime.) Is there a
 function that can do this for me? I know i could simply write (k^(p-2)) %%
 p, but i need to do this for large primes (above 100) and this gives the
 warning message:
 probable complete loss of accuracy in modulus
 so this method does not work. Any ideas?

Computing the inverse mod p applying brute force may not be a very elegant
approach.  With a little knowledge from number theory it is clear that the
(so-called) extended Euclidean algorithm will provide the modular inverse
efficiently, even for very big numbers.

In R the extended Euclidean algorithm is available in the gmp package, see
the 'gcdex' function.  The modular inverse can also directly be calculated
with the 'inv' function, e.g. the inverse of 101 modulo 11 is:

library(gmp)
as.numeric(inv(as.bigz(101), as.bigz(11)))
# [1] 499499501

It's not even necessary that p is prime, only that k and p are relatively
prime.  The integers can be bigger than those handled by double-precision
arithmetic.
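
For illustration, a bare-bones version of the extended Euclidean algorithm
in plain R (a sketch; only safe while the intermediate numbers stay below 2^53):

modinv <- function(k, p) {
  # find x with (k * x) %% p == 1, or NA if gcd(k, p) > 1
  r0 <- p; r1 <- k %% p
  x0 <- 0; x1 <- 1
  while (r1 != 0) {
    q   <- r0 %/% r1
    tmp <- r0 - q * r1; r0 <- r1; r1 <- tmp
    tmp <- x0 - q * x1; x0 <- x1; x1 <- tmp
  }
  if (r0 != 1) return(NA)
  x0 %% p
}

modinv(2, 11)   # 6, since 2 * 6 = 12 is 1 modulo 11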

Regards
Hans Werner

 Thanks,
 
 Samuel
 
 --


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] optim or nlminb for minimization, which to believe?

2009-11-29 Thread Hans W Borchers

Your function named 'gradient' is not the correct gradient. Take as an
example the following point x0, very near to the true minimum,

x0 <- c(-0.2517964, 0.4898680, -0.2517962, 0.4898681, 0.7500995)
 
then you get

 gradient(x0)
[1] -0.0372110470  0.0001816991 -0.0372102284  0.0001820976  0.0144292657

but the numerical gradient is different:

 library(numDeriv)
 grad(fn, x0)
[1] -6.151645e-07 -5.507219e-07  1.969143e-07 -1.563892e-07 -4.955502e-08

that is the derivative is close to 0 in any direction -- as it should be for
an optimum.

No wonder, optim et al. get confused when applying this 'gradient'.

Regards
Hans Werner


Doran, Harold wrote:
 
 I have constructed the function mml2 (below) based on the likelihood
 function described in the minimal latex I have pasted below for anyone who
 wants to look at it. This function finds parameter estimates for a basic
 Rasch (IRT) model. Using the function without the gradient, using either
 nlminb or optim returns the correct parameter estimates and, in the case
 of optim, the correct standard errors.
 
 By correct, I mean they match another software program as well as the
 rasch() function in the ltm package.
 
 Now, when I pass the gradient to optim, I get a message of successful
 convergence, but the parameter estimates are not correct, but they are
 *very* close to being correct. But, when I use nlminb with the gradient, I
 get a message of false convergence and again, the estimates are off, but
 again very close to being correct.
 
 This is best illustrated via the examples:
 
 ### Sample data set
 set.seed(1234)
 tmp <- data.frame(item1 = sample(c(0,1), 20, replace=TRUE), item2 =
 sample(c(0,1), 20, replace=TRUE), item3 = sample(c(0,1), 20,
 replace=TRUE),item4 = sample(c(0,1), 20, replace=TRUE),item5 =
 sample(c(0,1), 20, replace=TRUE))
 
 ## Use function mml2 (below) with optim  with use of gradient
 mml2(tmp,Q=10)
 $par
 [1] -0.2438733  0.4889333 -0.2438733  0.4889333  0.7464162
 
 $value
 [1] 63.86376
 
 $counts
 function gradient
   456
 
 $convergence
 [1] 0
 
 $message
 NULL
 
 $hessian
  [,1] [,2] [,3] [,4] [,5]
 [1,] 4.095479 0.00 0.00 0.00 0.00
 [2,] 0.00 3.986293 0.00 0.00 0.00
 [3,] 0.00 0.00 4.095479 0.00 0.00
 [4,] 0.00 0.00 0.00 3.986293 0.00
 [5,] 0.00 0.00 0.00 0.00 3.800898
 
 ## Use same function but use nlminb with use of gradient
 mml2(tmp,Q=10)
 $par
 [1] -0.2456398  0.4948889 -0.2456398  0.4948889  0.7516308
 
 $objective
 [1] 63.86364
 
 $convergence
 [1] 1
 
 $message
 [1] false convergence (8)
 
 $iterations
 [1] 4
 
 $evaluations
 function gradient
   414
 
 
 ### use nlminb but turn off use of gradient
 mml2(tmp,Q=10)
 $par
 [1] -0.2517961  0.4898682 -0.2517961  0.4898682  0.7500994
 
 $objective
 [1] 63.8635
 
 $convergence
 [1] 0
 
 $message
 [1] relative convergence (4)
 
 $iterations
 [1] 8
 
 $evaluations
 function gradient
   11   64
 
 ### Use optim and turn off gradient
 
 mml2(tmp,Q=10)
 $par
 [1] -0.2517990  0.4898676 -0.2517990  0.4898676  0.7500906
 
 $value
 [1] 63.8635
 
 $counts
 function gradient
   227
 
 $convergence
 [1] 0
 
 $message
 NULL
 
 $hessian
[,1]   [,2]   [,3]   [,4]   [,5]
 [1,]  3.6311153 -0.3992959 -0.4224747 -0.3992959 -0.3764526
 [2,] -0.3992959  3.5338195 -0.3992959 -0.3960956 -0.3798141
 [3,] -0.4224747 -0.3992959  3.6311153 -0.3992959 -0.3764526
 [4,] -0.3992959 -0.3960956 -0.3992959  3.5338195 -0.3798141
 [5,] -0.3764526 -0.3798141 -0.3764526 -0.3798141  3.3784816
 
 The parameter estimates with and without the use of the gradient are so
 close that I am inclined to believe that the gradient is correct and maybe
 the problem is elsewhere.
 
 It seems odd that optim seems to converge but nlminb does not with the use
 of the gradient. But, with the use of the gradient in either case, the
 parameter estimates differ from what I think are the correct values. So,
 at this point I am unclear if the problem is somewhere in the way the
 functions are used or how I am passing the gradient or if the problem lies
 in the way I have constructed the gradient itself.
 
 Below is the function and also some latex for those interested in looking
 at the likelihood function.
 
 Thanks for any reactions
 Harold
 
 sessionInfo()
 R version 2.10.0 (2009-10-26)
 i386-pc-mingw32
 
 locale:
 [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
 States.1252
 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
 [5] LC_TIME=English_United States.1252
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base
 
 other attached packages:
 [1] ltm_0.9-2  polycor_0.7-7  sfsmisc_1.0-9  mvtnorm_0.9-8  msm_0.9.4 
 MASS_7.3-3 MiscPsycho_1.5
 [8] statmod_1.4.1
 
 loaded via a namespace (and not attached):
 [1] splines_2.10.0  survival_2.35-7 tools_2.10.0
 
 

Re: [R] Implementation of the Shuffled Complex Evolution (SCE-UA) Algorithm

2009-11-11 Thread Hans W Borchers
Simon Seibert simon.seibert at mytum.de writes:

 
 Good evening list,
 I'm looking for an R implementation of the "Shuffled Complex
 Evolution" (SCE-UA) algorithm after Duan et al. (1993). Does anybody
 know if there is an extension/ package existing that contains it?
 Thanks very much for your help! Cheers, Simon

I am looking into stochastic global optimization routines, such as variants of
genetic algorithms (GA), particle swarm optimization (PSO), and shuffled complex
(SCE) or differential evolution (DE).

At the moment I am testing a free Matlab implementation of SCE-UA. If that turns
out to be interesting and effective, I will translate it into an R function,
maybe within the next few weeks -- unless, of course, someone else is going to
do an implementation of a similar procedure.

Right now I am not totally convinced. Maybe you try your problem with other
approaches available in R (see e.g. the Optimization task view).

Regards
Hans Werner

 Duan QY, Gupta KV, Sorooshian S (1993) Shuffled Complex Evolution
 Approach for Effective and Efficient Global Minimization. In Jour. of
 optimization theory and applications. Vol 76, 3, 501-521


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help using R's linprog for LP

2009-10-24 Thread Hans W Borchers
Medha Atre medha.atre at gmail.com writes:

 
 Hi,
 
 I found the reason. By default it puts a condition for x >= 0. Is
 there a way to get rid of this condition?


The constraints x >= 0 are used in most linear programming realizations.
Some bounds from below are needed. The trick to circumvent the restriction
is as follows:

Assume you know  x >= d  where some or all of the d_i can be negative.
Replace x with  y = x - d >= 0  and minimize c'y with  A y <= b - A d !
Your solution is then  x = y + d , that is

solveLP(cvec, bvec - Amat %*% dvec, Amat)$solution + dvec

Of course this works with all linear programming packages in R.
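
For illustration, a toy example of this shift with 'linprog' (numbers
invented; the assumed problem is min c'x subject to A x <= b and x >= d):

library(linprog)

cvec <- c(2, 1)
Amat <- rbind(c(1, 1), c(1, -1))
bvec <- c(4, 2)
dvec <- c(-3, -1)                                    # lower bounds, partly negative

res <- solveLP(cvec, c(bvec - Amat %*% dvec), Amat)  # solve in y = x - d >= 0
res$solution + dvec                                  # shift back: solution in x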


[Good theorists] always make an even number of sign errors,
and the bad ones always make an odd number of sign errors.
  --  Anthony Zee in: QFT in a Nutshell


 Medha
 
 On Fri, Oct 23, 2009 at 2:34 PM, Medha Atre medha.atre at gmail.com wrote:
  Hi,
 
  I am using R in one of my courses. I am trying to use R's linprog
  package to solve to formulate 2-class classification problem as Linear
  programming problem.
 
  For my formulation, I need to set to cvec to all 0s.
  I know the points are linearly separable so an optimal solution x
  does exist, which satisfies all the constraints.
 
  But given the constraints and setting cvec to all 0s it simply gives
  me an x of all 0s.
 
  How can I fix this?
 
  Medha
 
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Ubuntu, Revolutions, R

2009-10-05 Thread Hans W. Borchers

I updated to Ubuntu 9.10 Beta yesterday, and yes I do see the same message
and I am a bit irritated.  I don't want to read these 'marketing' lines any
time I start up R.
I simply deleted the lines from /etc/R/Rprofile.site for now, but I am 
still wondering who put that in. Is there any deeper reason I didn't get?

Hans Werner


gunksta wrote:
 
 For those who don't follow Ubuntu development carefully, the first Beta
 for the 
 next Ubuntu was recently released, so I took my home system and upgraded
 to 
 help out with filing bugs, etc. 
 
 Just to be clear, I am not looking for help with the upgrade process. I've
 had 
 R, and a few miscellaneous CRAN packages installed on this computer for
 years. 
 Today, when I loaded an R session I had developed before the upgrade, I
 saw 
 something new in my R welcome message.

R version 2.9.2 (2009-08-24)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.


This is REvolution R version 3.0.0:
the optimized distribution of R from REvolution Computing.
REvolution R enhancements Copyright (C) REvolution Computing, Inc.

Checking for REvolution MKL:
  - REvolution R enhancements not installed.
For improved performance and other extensions: apt-get install
revolution-r
 
 The last part, about this being the enhanced version of R was . . . 
 unexpected.  I have heard of this company before and now I've spent some
 time 
 on their website. Looking at my installation, Ubuntu did not install any
 of 
 the REvolution Computing components, although R now basically thows a
 warning 
 every time I start it.
 
 My question(s) for the community is this (pick any question(s) you like to 
 answer: 
 Should I install the REvolution Computing packages? 
 Do these packages really make R faster? 
 Are these packages stable? 
 What are your experiences with REvolution Computing software?
 
 I am interested in hearing from members of the community, REvolution
 Computing 
 employees/supporters (although please ID yourself as such) and most anyone 
 else. I can see what they say on their website, but I'm interested in
 getting 
 other opinions too.
 
 Thanks!
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Semi continous variable- define bounds using lpsolve

2009-09-22 Thread Hans W. Borchers

I played around a bit with the original 'lp-solve' program --- i.e., not the
R package but the program to be downloaded from Sourceforge ---, at least
version 5.5.0.15 through its IDE.  I was not even able to reproduce the
example on semi-continuous variables in the reference documentation at
http://lpsolve.sourceforge.net/5.5/.

It appears as if a bug has been introduced in newer versions of 'lp-solve'
that prevent these semi-continuous variables from being handled correctly.
So maybe it's not the fault of the R package 'lpsolve' and can only be
amended by the 'lp-solve' team itself.

Hans Werner


pragathichi wrote:
 
 How to define bounds for a semi continous variable in lp_solve.
 Min 5x1 +9x2 +7.15x3 +0.1x4
 subject to 
 x1+x2+x3+x4=6.7
 x1+x4 = 6.5
 And x3 can be 0 or greater than 3.6
 hence x3 is a semi-continuous variable 
 how to define bounds as well as the semi-continuous property because using 
 set.semicont and set.bound simultaneously doesn't seem to work. Thanks in
 advance for the help
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Semi continous variable- define bounds using lpsolve

2009-09-22 Thread Hans W. Borchers

But of course, it is always possible to emulate a semi-continuous variable
by introducing a binary variable and using some big-M trick. That is, with
a new binary variable b we add the following two conditions:

x3 - 3.6 * b >= 0  and
x3 - 10 * b  <= 0   # Big-M trick, here M = 10

(If b = 0, then x3 = 0, and if b = 1, then x3 >= 3.6 !)

As I do not trust 'lpSolve' too much anymore I used package 'Rglpk' with
the following code:

#-- snip ---
library(Rglpk)

obj   <- c(5, 9, 7.15, 0.1, 0)
mat   <- matrix(c(1,1,1,1,0, 1,0,0,1,0, 0,0,1,0,-3.6, 0,0,1,0,-10, 0,0,0,0,1),
                byrow=TRUE, ncol=5)
dir   <- c("==", "<=", ">=", "<=", "<=")
rhs   <- c(9, 6.55, 0, 0, 1)
types <- c("C", "C", "C", "C", "I")
max   <- FALSE

Rglpk_solve_LP(obj, mat, dir, rhs, types = types, max = max)
# $optimum
# [1] 22.705
# 
# $solution
# [1] 0.00 2.45 0.00 6.55 0.00
# 
# $status
# [1] 0
#-- snap ---

Semi-continuous variables are sometimes preferred as with a good
implementation the solution is reached much faster (that's why I suggested
them), but they can always be modelled with binary variables.

Hans Werner


pragathichi wrote:
 
 How to define bounds for a semi continous variable in lp_solve.
 Min 5x1 +9x2 +7.15x3 +0.1x4
 subject to 
 x1+x2+x3+x4=6.7
 x1+x4 = 6.5
 And x3 can be 0 or greater than 3.6
 hence x3 is a semi-continuous variable 
 how to define bounds as well as the semi-continuous property because using 
 set.semicont and set.bound simultaneously doesn't seem to work. Thanks in
 advance for the help
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Quadratic Constraints

2009-09-20 Thread Hans W. Borchers

The package lpSolve (that I have recommended before) supports so-called
'semi-continuous variables', that is

Semi-continuous variables are variables that must take a value between
their minimum and maximum or zero. So these variables are treated
the same as regular variables, except that a value of zero is also
accepted, even if a minimum bigger than zero is set on the variable.

which is exactly how you want to handle your variable x3.
For an example, see the documentation at lpsolve.sourceforge.net/5.5/.

By the way, the minimum of your problem is 44.64 (manual calculation).



vikrant S wrote:
 
 HI All,
 I am unable to solve a optimization Problem Please Help Me out of this to
 solve. The Optimization problem is as follows :- 
 My objective function is linear and one of the constraint is quadratic. 
 
 Min z = 5 * X1 + 9* X2  + 7.15 *X3 + 2 * X4
 subject to
  X1 + X2 + X3 +X4  = 9
  X1  + X4  = 6.55
  X3(X3 - 3.5) >= 0
  X1,X2,X3,X4 >= 0
  Now the problem is how to solve this kind of problem. Which package
 should be used to handle such problems. Please explain with an example. 
 Another problem is that I have to cases to  be solve in this problem.
 case 1:-) If X3 = 0
 case 2 :-) If X3 > 0 then X3 > 3.6
 I want to handle both this case in one problem so the quadratic
 constraints is written
 The thing is that I want to evaluate my objective function for both cases
 and which ever is optimum that solution i need,
 Here I don't want to use the If Else condition and repeat the program. IS
 there any other better way in which i could solve this problem?
 If not please try to provide me the solution for my original problem
 having a quadratic constraint.
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] computing the radius of an arc

2009-07-28 Thread Hans W Borchers
Nair, Murlidharan T mnair at iusb.edu writes:

 
 Alex Brenning, the developer of the RSAGA package told me that and I quote 
 the RSAGA package (which uses functions from the free geographical 
 information system [GIS] SAGA GIS) has a curvature function that is designed 
 to calculate the curvature of surfaces, in particular raster (i.e. gridded) 
 digital elevation models. I am not aware of a function in SAGA GIS or other 
 GIS that would calculate curvatures along a line, especially not in 3D


I have difficulties seeing how you would go about doing this. A surface locally
is characterized by two main curvatures --- this is Gauss' theorem for 2-dim
geometry. The sphere has the same curvature, 1/r, in both directions. So maybe
you can approximate the surface by an ellipsoid, but a strange one as the main
directions may not be orthogonal.

I am not sure this is what you wanted to achieve. As a side remark: The circle
approximating a curve in three dimensions is not uniquely determined.

Regards
Hans Werner


 I shall try to develop it and if I am successful I shall make it available.
 
 Cheers../Murli


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] re ading jpeg images?

2009-07-28 Thread Hans W. Borchers

I found the 'biOps' package for Image and data analysis quite helpful.
(I did some astronomical investigations with it --- counting galaxies in
 a Hubble picture---and I do recommend this package.)

Under Windows you have to unpack the 'libjpeg' and 'libtiff' libraries
beforehand somewhere in your path.
See finzi.psych.upenn.edu/Rhelp08/2009-April/194630.html for a link
to download all necessary libraries as a zip-file.

If this link to RapidShare does not work anymore, I think I still have
that file somewhere and could send it to you by e-mail.

Regards
Hans Werner



Robert Biddle wrote:
 
 
Can someone advise me what the most sensible way is to read jpeg
 images?
 
For our work with image analysis and eye-tracking we have been using
 the
rimage package, on both Macs and Windows PCs. But while setting up a
 new
Windows machine yesterday, I see that rimage is regarded as orphaned,
 and no
Windows binary is available. I eventually found an old zip file for the
package,  so  I am not stuck, but I wonder what the right way is to go
forward. I did find the readimages package, but it also seemed
 problematic
to install on both Windows on Mac, requiring extra software that it was
itself unclear how to install.
 
Is there some simpler solution I should be looking at?
 
Are jpeg files so probematic I should be converting them to some other
format and using a different package to read that?
 
 
Thanks
 
Robert Biddle
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] computing the radius of an arc

2009-07-24 Thread Hans W Borchers
Nair, Murlidharan T mnair at iusb.edu writes:

 
 Hi!!
 
 I am interesting in computing the radius of an arc that best approximates 
 a curve. Is there an R function that I can use to draw an arc? 
 Nothing useful came up when I searched help.search.  Does anyone have any 
 suggestion to do this?
 Thanks ../Murli
 

If you are looking for some formulas to calculate the center and radius
of such a circle/arc, search for 'curvature' in Wikipedia or MathWorld.
Formulas are a bit involved, and you will need to be able to compute
first and second derivatives of the curve.

In case the curve is given as a set of discrete points, you could apply
some polynomial, spline, etc., interpolation.
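
For illustration, a small sketch of the discrete case (the helper name is
invented), estimating the radius from a smoothing spline via the formula
R = (1 + y'^2)^(3/2) / |y''| :

curve.radius <- function(x, y, x0) {
  sp <- smooth.spline(x, y)
  d1 <- predict(sp, x0, deriv = 1)$y   # first derivative at x0
  d2 <- predict(sp, x0, deriv = 2)$y   # second derivative at x0
  (1 + d1^2)^(3/2) / abs(d2)
}

x <- seq(0, pi, length.out = 101)
curve.radius(x, sin(x), pi/2)          # close to 1 for the sine curve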

If you _then_ apply 'draw.arc', watch out to set the aspect ratio to 1,
otherwise your circle may appear to be incorrect.

library(plotrix)
curve(sin(x), 0, pi, col="blue", asp=1)
draw.arc(pi/2, 0, 1, deg1=45, deg2=135, col="red")

The physical meaning of the curvature (and thus the approximation) is that
an observer moving on the curve will feel the same centrifugal forces
as if moving on the circle, at that moment.

Regards
Hans Werner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R interface to PSwarm

2009-07-23 Thread Hans W Borchers
Dear list:

Being a bit unsatisfied with global optimization approaches in R such as SANN
(in 'optim') or DEoptim, I looked for alternatives on the Web, such as PSwarm
http://www.norg.uminho.pt/aivaz/pswarm/ or PIKAIA.

There is an R interface to PSwarm (version 1.4) on its home page which I was not
able to make run under Windows, getting the error message

Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'character'

when running the example.

Therefore my questions:

  - Has someone successfully tried out PSwarm with R ? 

  - Are there any experiences with PSwarm in general, that is could you find
solutions with PSwarm that you did not find with another GA or with Rdonlp2, for
example ?

Regards
Hans Werner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Automatic differentiation in R

2009-07-23 Thread Hans W. Borchers

Having given a lecture on Numerical Derivatives just a short time ago, I
would like to mention the following:

Many functions, especially in engineering, are not available as formulas
built simply from arithmetical operators and elementary functions.  They are
provided as intricate procedures, as results from simulation runs, or as
return values from optimization routines.

In all these cases, symbolic differentiation is not a possibility at all. 
Still, for further processing there will be a need to differentiate these
functions.

As I understand the literature, Automated Differentiation (AD) is mostly
meant for these kinds of applications(*).  Another push for AD came from
increased interest in the complex-step derivative approximation where
first an analytic continuation -- often nontrivial -- has to be established,
before the derivative can be computed.
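
For illustration, the complex-step idea in a few lines of R (the function
below is a common test case from the complex-step literature):

f  <- function(z) exp(z) / sqrt(sin(z)^3 + cos(z)^3)
x0 <- 1.5; h <- 1e-20                          # h can be tiny: no cancellation
Im(f(complex(real = x0, imaginary = h))) / h   # complex-step derivative
(f(x0 + 1e-6) - f(x0 - 1e-6)) / 2e-6           # central difference, for comparison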

It may be different in statistical tasks.  But a general automated
differentiation in R may have to take these kinds of applications into
account.

Regards
Hans Werner

(*) See the Community Portal for AD www.autodiff.org and the definition
of automated differentiation there as an example.


nashjc wrote:
 
 [...]
 
 The clear issue in my mind is that users who need gradients/Jacobians 
 for R want to be able to send a function X to some process that will 
 return another function gradX or JacX that computes analytic 
 derivatives. This has to be easy, which implies a very simple command 
 or GUI interface. I am pretty certain the users have almost no interest 
 in the mechanism, as long as it works. Currently, most use numerical 
 derivatives, not realizing the very large time penalty and quite large 
 loss in accuracy that can compromise some optimization and differential 
 equation codes. I'll try to prepare a few examples to illustrate this 
 and post them somewhere in the next few weeks. Time, as always, ...
 
 However, the topic does appear to be on the table.
 
 JN
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] r-project.org address blacklisted by anti-spam software

2009-07-07 Thread Hans W Borchers
Dear List:

An e-mail mentioning the r-project.org address and sent to a friend at a German
university was considered spam by the local spam filter.

Its reasoning: the URL r-project.org is blacklisted at uribl.swinog.ch resp.
at antispam.imp.ch. I checked the list

http://antispam.imp.ch/swinog-uri-rbl.txt  [caution: long list]

and indeed, there it was. Can anybody explain how or why the R project made its
way onto the list? Is it reasonable to file a protest?

--Hans Werner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R- NLP on R but ....

2009-07-06 Thread Hans W Borchers
Rahul Varshney itsrahulvarshney at gmail.com writes:

 
 I'll appreciate the help on the following problem:
 
 I solved many Nonlinear programming problems with nonlinear
 constraintsRdonlp is working well but i am unable to get INTEGER data
 with nonlinear constraints in Rdonlp. Is it possible to get Integer Values
 of parameters in any package of R with nonlinear constraints.


If I understand you correctly, you have mixed types of variables, continuous and
integer (or binary?) ones, a non-linear (how much?) objective function and
non-linear constraints (what kind?). 

Then your problem appears to be of MINLP-NL type and the short answer to your
question is: No, there is no package providing such a solver.

In the Optimization task view, under 'MIP' for mixed integer programs you will
find lpsolve, Rglpk, or Rsymphony which are all linear solvers. (R)CPLEX can
solve more general problems, but is commercial (and expensive).

There is small hope that in the future free MIP solvers will be integrated as R
packages, for instance from the COIN-OR initiative.

--Hans Werner


 Rahul
 
   [[alternative HTML version deleted]]


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Equivalent to Matlab's Ans

2009-06-30 Thread Hans W. Borchers

There is a discussion on this topic under the heading "A shorter version of
.Last.value" on July 7, 2008, see for example

http://www.nabble.com/A-shorter-version-of-%22.Last.value%22--to18322831.html#a18322831
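
In short, the built-in equivalent is .Last.value:

sqrt(2)
.Last.value^2   # reuses the previous top-level result: 2 (up to rounding)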

--Hans Werner


Stephane-18 wrote:
 
 Hi everyone,
 I was just wondering if there is an equivalent in R to the shortcut Ans
 in
 MatLab whereby I can use the previous result for the current command? This
 could save a lot of time when hacking in R itself, not from an editor.
 
 Thanks,
 Stephane
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Optimization and Linear Programming in R

2009-06-26 Thread Hans W Borchers
 Chris.Wilcox at csiro.au writes:

 Dear List,
 
 [...]
 
 We are looking for a solver that can deal with this nonlinear integer 
 programming problem. We looked at a number of packages on the CRAN Task 
 View: Optimization and Mathematical Programming, however, we have not 
 been able to locate one that will suit our purposes. The essential 
 features are we need to be able to write a nonlinear function for the 
 objective (hopefully a self contained one as we need to include some 
 data in it), we need to be able to use a binary decision (or parameter) 
 vector, and we need to be able to use a constraint. Any suggestions as 
 to packages or other software that will work for our problem would be 
 much appreciated. 

What you describe appears to be a 'Mixed Integer NonLinear Programming' 
(MINLP) problem. R may not be the right place to look for such kind of 
solvers. You can have a look into the NEOS Optimization Software Guide 
or into the Projects page of the COmputational INfrastructure for
Operations Research (COIN-OR).

The 'Bonmin' COIN-OR project could perhaps provide an appropriate solver 
for this problem. The RINO project on 'R-forge' plans to provide Rbonmin 
and Rlago packages, but this may take some time.

Perhaps by reconsidering your problem approach you can (at least partly) 
linearize your target function or see if it can be made convex, etc.; it 
did not sound too complex.

--Hans Werner

P.S.: As far as I know, 'solnp' does not solve mixed integer problems.

 Thanks,
 
 Chris Wilcox and Greg Thonier


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newtons method

2009-05-14 Thread Hans W. Borchers

Dear Ravi:

Thanks for pointing out the homotopy methods. Coming from Mathematics I was
always considering SINGULAR for such a task, which also provides results
when the solution set is not isolated points, but an algebraic variety. 

For single points, homotopy methods appear to be an effective approach. I am
wondering if it will be worth to integrate Jan Verschelde's free PHCpack
algorithm, see http://www.math.uic.edu/~jan/, as a package into R -- if
there would be enough interest.

Best regards,  Hans Werner Borchers



Ravi Varadhan wrote:
 
 Uwe,
 
 John's comment about the difficulties with finding polynomial roots is
 even
 more forceful for a system of polynomials.  There are likely numerous
 roots,
 some possibly real, and some possibly multiple.  Homotopy methods are
 currently the state of the art for finding all the roots, but beware that
 they are very time-consuming.   For locating the real roots, I have found
 that a relatively simple approach like multiple random starts works
 fairly well with a root-finder such as dfsane() in the BB package.
 However, I don't know of any tests to check whether I have found all the
 roots.
 
 Ravi.
 
 
 ---
 
 Ravi Varadhan, Ph.D.
 
 Assistant Professor, The Center on Aging and Health
 
 Division of Geriatric Medicine and Gerontology 
 
 Johns Hopkins University
 
 Ph: (410) 502-2619
 
 Fax: (410) 614-9625
 
 Email: rvarad...@jhmi.edu
 
 Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
 
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] MDL - Fayyad Irani Discretization

2009-04-27 Thread Hans W. Borchers

I asked for this some years ago and did not get a positive hint.
To my knowledge this has not changed since then.

Brian Ripley proposed to use the 'rpart' algorithm for discretization.
I think I applied the RELIEF-F method at that time  and also wrote a
simplified Fayyad-Irani function on my own.

For RELIEF and other discretizing functions see the 'dprep' package.
On CRAN there is only a version for Linux available, for Windows see
the 'dprep' home page http://math.uprm.edu/~edgar/dprep.html.
(But: I have not tried it out with R 2.9.0 !)

Hans Werner


Piotr Romański wrote:
 
 Hey,
 
 I'm looking for a function which provides a supervised Fayyad & Irani 
 discretization (MDL). I've already found RWeka which has such a 
 function. But is there anything else? I'd prefer to use sth lighter in 
 my own package.
 
 Regards,
 Peter
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrined dependent optimization.

2009-04-09 Thread Hans W. Borchers

Kevin,

sorry I don't understand this sentence:
It has been suggested that I do a cluster analysis. Wouldn't this bet
mepart way there?

Let your products be numbered  1, ..., n,  and let p(i) be the location
where product i is stored (assigned to) in the warehouse. Then p is a
permutation of the set  1, ..., n  and the task is an assignment problem.

A sale v is simply a subset of  1, ..., n and the contribution to the target
function could, for example, be

f_v(p) = max( p(v) ) - min( p(v) )  for each sale v,

or whatever your evaluation criterion is. The total function to minimize
will then be  Sum( f_v)  for all historical sales v.
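
To make the construction concrete, a rough sketch with invented data:
evaluate a layout p by this spread, and improve it with simulated annealing
(optim's SANN with a swap proposal, as in the TSP example in ?optim):

set.seed(1)
n     <- 50
sales <- replicate(200, sample(n, sample(2:5, 1)), simplify = FALSE)

spread <- function(p) {                # total location spread over all sales
  sum(vapply(sales, function(v) max(p[v]) - min(p[v]), numeric(1)))
}
swap <- function(p, ...) {             # proposal: exchange two locations
  ij <- sample(length(p), 2)
  p[ij] <- p[rev(ij)]
  p
}

p0  <- 1:n
res <- optim(p0, spread, swap, method = "SANN",
             control = list(maxit = 20000, temp = 10))
c(initial = spread(p0), annealed = res$value)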

Because it is based on a distance, the whole task is called a quadratic
assignment problem (and is certainly not linear). 

In the literature you will find algorithms and tools mentioned to solve such
a task, and n = 20,000 is not that much nowadays. But the difference between 
n = 100  and  n = 20,000  is huge for tools not specialized to the task.

For the TSP, imagine two groups of cities very far apart. Then under some
mild restrictions it would not be reasonable to go back and forth between
these groups several times, and therefore the problem could be split into
two independent subproblems.

So the question of clustering belongs to the data preparation part.

Whether such really different classes of sales exist in your data, only you
can decide. In my experience, most of the time such preprocessing is not
worth doing and will not speed up the optimization procedure considerably.

Regards,  Hans Werner



rkevinburton wrote:
 
 
 It has been suggested that I do a cluster analysis. Wouldn't this bet
 mepart way there?
 
 Thank you for your response I am looking up the book now.
 
 Kevin
 
  Hans W. Borchers hwborch...@googlemail.com wrote:

 Just in case you are still interested in theoretical aspects:

 In combinatorial optimization, the problem you describe is known as the
 Quadratic (Sum) Assignment Problem (QAP or QSAP) and is well known to
 arise in facility and warehouse layouts. The task itself is considered
 hard, comparable to the Traveling Salesman Problem (TSP).

 I would clearly advise to turn to some specialized commercial software
 for solving it, otherwise you will very likely get stuck in suboptimal
 solutions miles away from the true optimum. And for a real commercial
 situation this may be disastrous.

 Regards,  Hans Werner

 P.S.:   See for instance Cela: The Quadratic Assignment Problem. Theory
 and Algorithms, Springer, 2000.  (Partly available at books.google.com)

 
 

-- 
View this message in context: 
http://www.nabble.com/Constrined-dependent-optimization.-tp22772520p22966416.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrined dependent optimization.

2009-04-04 Thread Hans W. Borchers

Just in case you are still interested in theoretical aspects:

In combinatorial optimization, the problem you describe is known as the
Quadratic (Sum) Assignment Problem (QAP or QSAP) and is well known to
arise in facility and warehouse layouts. The task itself is considered
hard, comparable to the Traveling Salesman Problem (TSP).

I would clearly advise to turn to some specialized commercial software
for solving it, otherwise you will very likely get stuck in suboptimal
solutions miles away from the true optimum. And for a real commercial
situation this may be disastrous.

Regards,  Hans Werner

P.S.:   See for instance Cela: The Quadratic Assignment Problem. Theory
and Algorithms, Springer, 2000.  (Partly available at books.google.com)



rkevinburton wrote:
 
 I have decided to use this SANN approach to my problem but to keep the run
 time reasonable instead of 20,000 variables I will randomly sample this
 space to get the number of variables under 100. But I want to do this a
 number of times. Is there someone who could help me set up WINBUGS to
 repeat this optimization N times each time randomly picking 100 of the
 possible 20,000.
 
 Comments?
 
 Kevin
 

-- 
View this message in context: 
http://www.nabble.com/Constrined-dependent-optimization.-tp22772520p22883010.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrined dependent optimization.

2009-03-30 Thread Hans W. Borchers

Imagine you want to minimize the following linear function

f <- function(x) sum( c(1:50, 50:1) * x / (50*51) )

on the set of all permutations of the numbers 1,..., 100.

I wonder how will you do that with lpSolve? I would simply order
the coefficients and then sort the numbers 1,...,100 accordingly.
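To make the 'ordering' remark concrete, a small sketch (valid for this
particular linear f only):

a <- c(1:50, 50:1) / (50*51)              # the (positive) coefficients
x <- integer(100)
x[order(a, decreasing = TRUE)] <- 1:100   # largest coefficient gets smallest number
f <- function(x) sum(a * x)
f(x)                                      # minimal value over all permutations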

I am also wondering how optim with SANN could be applied here.

As this is a problem in the area of discrete optimization resp.
constraint programming, I propose to use an appropriate program
here such as the free software Bprolog. I would be interested to
learn what others propose.

Of course, if we don't know anything about the function f then
it amounts to an exhaustive search on the 100! permutations --
probably not a feasible job.

Regards,  Hans Werner



Paul Smith wrote:
 
 On Sun, Mar 29, 2009 at 9:45 PM,  rkevinbur...@charter.net wrote:
 I have an optimization question that I was hoping to get some suggestions
 on how best to go about sovling it. I would think there is probably a
 package that addresses this problem.

 This is an ordering optimization problem. Best to describe it with a
 simple example. Say I have 100 bins each with a ball in it numbered
 from 1 to 100. Each bin can only hold one ball. This optimization is that
 I have a function 'f' that takes this array of bins and returns a number.
 f(1,2,3,4) would return a different number from
 that of f(2,1,3,4). The optimization is finding the optimum order of
 these balls so as to produce a minimum value from 'f'.I cannot use the
 regular 'optim' algorithms because a) the values are discrete, and b) the
 values are dependent ie. when the variable representing the bin
 location is changed (in this example a new ball is put there) the
 existing ball will need to be moved to another bin (probably swapping
 positions), and c) each variable is constrained, in the example above
 the only allowable values are integers from 1-100. So the problem becomes
 finding the optimum order of the balls.

 Any suggestions?
 
 If your function f is linear, then you can use lpSolve.
 
 Paul
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Constrined-dependent-optimization.-tp22772520p22782922.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] asking advice for Integer Programming packages

2009-03-29 Thread Hans W. Borchers

Looking into the 'Optimization' task view you will also identify the 
Rsymphony package. A short overview of these packages and a benchmark 
based on a test suite can be found in

  http://www.rmetrics.org/Meielisalp2008/Presentations/Theussl2.pdf.

Different problems may have quite different running times applying the
underlying procedures. So the question is not so much about ease of use,
but how well they fit with the problem.

Regards, Hans Werner


dl7631 wrote:
 
 Dear everyone,
 
 I don't know much about Integer Programming but am afraid I am facing
 a problem that can only be solved via Integer Programming. I was
 wondering if those of you who have experience with it could recommend
 an R package.
 I found the following R packages:
 
 Rglpk
 glpk
 lpSolve
 lpSolveAPI
 
 Are there any others?
 Are some of them easier to use than others for a beginner?
 
 Any advice would be greatly appreciated!
 
 
 -- 
 Dimitri Liakhovitski
 MarketTools, Inc.
 dimitri.liakhovit...@markettools.com
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/asking-advice-for-Integer-Programming-packages-tp22750572p22766608.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rdonlp2 -Query

2009-03-07 Thread Hans W. Borchers
Leo Guelman leo.guelman at gmail.com writes:

 
 Hi,
 
 Did anyone used this package? Could you please share your thought on it?


What, exactly, do you mean by "share your thought on it"? It has its pros and
cons, as always.
Sure Rdonlp2 has been used, and it has been requested and discussed several
times here on the list. It is mentioned on the 'Optimization' task view, too. An
RSite search might help.


 Thanks!
 
 L.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Self-Organizing Map analysis

2009-03-04 Thread Hans W. Borchers
glaporta glaporta at freeweb.org writes:

 
 Dear list,
 I read the SOM package manual but I don't understand how to perform (for
 example) 1) the SOM analysis on Iris data 2) with a visualization similar to
 that of figure 7 in
 http://www.cis.hut.fi/projects/somtoolbox/package/papers/techrep.pdf


In R, I used the following calls to create SOM maps similar to the ones in the
literature (I don't find the SOM toolbox figures particularly inspiring):


  library(class)
  data(iris)
  df <- iris[, 1:4]            # numeric columns only (Sepal/Petal measurements)

  n <- 10
  sg <- somgrid(n, n, "rectangular")
  df.som <- batchSOM(df, sg, c(4,4,3,3,2,2,1,1))

  windows(record=TRUE)         # Windows graphics device; use dev.new() elsewhere
  for (i in 1:ncol(df)) {
    xy <- matrix(df.som$codes[,i], n, n)
    image(xy)
    contour(xy, add=TRUE)
    grid(col="gray")
  }


Of course, there are not enough Iris data to generate a reasonable SOM grid. At
least  n <- 30  is needed for a nice graph, and that means at least 1000 data
points.

Using a Binhex grid instead of a rectangular one will result in diagrams similar
to those from the SOM toolbox. But then the 'image' call has to be adapted a bit
for this.

Regards,  Hans Werner


 Any suggestion? Thanks in advance,
 Gianandrea

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R - MATLAB apply like function

2009-03-04 Thread Hans W. Borchers
ARDIA David david.ardia at unifr.ch writes:

 
 Dear all,
 I very often use the R function apply, for speedup purposes. I am now also 
 using MATLAB, and would like to use the same kind of function.
 I have already asked MATLAB people, and the answer is : vectorize... but of 
 course, this is not always possible. So, instead of using a FOR loop all the 
 time, I tried using the bsxfun. 

The only other Matlab functions I can come up with are 'arrayfun',
'structfun', and 'cellfun'. For instance,

  arrayfun(@(x)isequal(x.f1, x.f2), S)

applies an anonymous function to a structure array.

But come to think of it, loops in Matlab are much faster than in R -- especially
since version 2008a -- so the need for avoiding them is not as big.

Regards,  Hans Werner

 So you R people, who might also use MATLAB, do you have any solution? Are you 
 also nervous about this fact (maybe also because of this R-MATLAB question).
 Thanks for your help,
 Dave 
 
   [[alternative HTML version deleted]]
 
 __
 R-help at r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subset Regression Package

2009-02-17 Thread Hans W. Borchers
Alex Roy alexroy2008 at gmail.com writes:

 
 Dear all ,
   Is there any subset regression (subset selection
 regression) package in R other than leaps?


Lars and Lasso are other 'subset selection' methods; see the corresponding
packages 'lars' and 'lasso2' and their description in "The Elements of
Statistical Learning".
Also, 'dr', Methods for dimension reduction for regression, or  'relaimpo',
Relative importance of regressors in linear models, can be considered.
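As a small illustration of the path-based approach -- a sketch only, assuming
the 'lars' package is installed and using a built-in data set:

library(lars)
x <- as.matrix(mtcars[, -1]);  y <- mtcars$mpg
fit <- lars(x, y, type = "lasso")     # whole lasso path; variables enter stepwise
coef(fit, s = 4, mode = "step")       # coefficients after 4 steps, i.e. a sparse subset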


 Thanks and regards
 
 Alex


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Outlier Detection for timeseries

2009-02-14 Thread Hans W. Borchers
Pele drdionc at yahoo.com writes:

 
 
 Hello R users,
 
 Can someone tell if there is a package in R that can do outlier detection
 that give outputs simiilar to what I got from SAS  below.
 
 Many thanks in advance for any help!


I guess you are talking about the OUTLIER procedure in SAS that attempts
to detect 'additive outliers' and 'level shifts' in a 'response' series,
the second following Jong & Penzer's "Diagnosing shocks in time series".

I have not come across this method in R, but you might have a look into the
'robfilter' (Robust Time Series Filters) package with functions like
'dw.filter', 'adore.filter', or 'wrm.filter', see for instance

dw.filter is suitable for extracting low frequency components (the
signal) from a time series which may be contaminated with outliers
and can contain level shifts. For this, moving window techniques are 
applied. 

If your time series is actually a response, you might prefer to look at
the series of residuals instead.
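Not the 'robfilter' interface itself, but a tiny base-R sketch of the
moving-window idea for flagging additive outliers (the AR series and the
threshold are invented for illustration):

set.seed(1)
y <- arima.sim(list(ar = 0.6), n = 200)
y[c(50, 120)] <- y[c(50, 120)] + 8          # plant two additive outliers
med <- runmed(y, k = 11)                    # running median = robust local level
res <- y - med
which(abs(res) > 4 * mad(res))              # indices flagged as additive outliers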


                          Outlier Details
 
    Obs   Time ID   Type        Estimate    Chi-Square   Approx Prob>ChiSq
     12     12.00   Additive   2792544.6        186.13        <.0001
     13     13.00   Additive    954302.1         21.23        <.0001
     15     15.00   Shift         63539.3         9.06         0.0026


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] general inverse solver?

2009-02-10 Thread Hans W. Borchers
Gabor Grothendieck ggrothendieck at gmail.com writes:

 
 Yacas was completely rewritten in java (Ryacas interfaces to the
 C version) since the Ryacas project started so I would not exactly
 characterize yacas as dead.   The work that is going on in yacas
 may not have high visibility but that does not mean there is none.
 
 Also while Maxima is more sophisticated in terms of algorithms,
 yacas is actually more sophisticated from the viewpoint of its
 language  which borrows ideas from both imperative and prolog programming
 and its interfaces are more sophisticated (it is one of the few CAS systems
 that developed an OpenMath interface) and its socket server is
 used by the Ryacas interface.  yacas can also translate math expressions
 to TeX and do exact arithmetic.
 
 Also to put this in the correct context, yacas does seem capable of
 answering the majority of questions that are posed on r-help that need
 a CAS in the answer. From a practical viewpoint it does seem to have
 the facilities that are most often needed.   The Ryacas vignette has
 a survey of some of its algebra capabilities.


I do not agree with this statement. Looking back at some of the requests for
symbolic computations on this help list, [R]Yacas showed weaknesses such as:

 - extremely slow in high-precision arithmetic
 - not solving very high-dimensional polynomials
 - does not simplify expressions in several variables
 - does not solve systems of equations except linear ones
 - does not symbolically solve medium to complex integrals
 - has very few special functions
 - therefore differentiation is unsatisfying

From a statistics point of view or application this may not be so relevant, but
considering the requests it illustrates that Yacas is not providing necessary
CAS features. Especially the missing capabilities in symbolic integration can be
painful in a statistics environment.

Many times Ryacas was pointed out, it was actually not solving the problem.

The examples in the Ryacas vignette are -- in my opinion -- trivial ones and do
not display advanced capabilities.

This is not meant to diminish the efforts invested in the Yacas software over
the years. I know a bit how difficult it is to write software for symbolic
computations. But in my opinion it also is evidence that a more advanced
computer algebra module could be very valuable for extending and improving the R
system.

Hans W. Borchers


 That being said, without taking away from yacas there is work going on to
 interface R to a second CAS.
 
 On Tue, Feb 10, 2009 at 2:33 AM, Hans W. Borchers
 hwborchers at googlemail.com wrote:
 
  I know that Ryacas is promoted here whenever requests about symbolic algebra
  or calculus appear on the R-help list. But to say the truth, Yacas itself is
  a very very limited Computer Algebra System and looking onto its home page
  it appears the development will stop or has stopped anyway.
 
  It would be fair to clearly state that there is no R package to solve
  somewhat more involved symbolic mathematical problems. One could then point
  the requestor to one of the open source Computer Algebra Systems  (CAS) such
  as Maxima or Axiom.
 
  Interestingly, the free Math Toolbox Euler by Grossmann has integrated
  Maxima into its numerical environment in a way that is really useful for
  numerical and symbolic computations. I could imagine that in a similar way
  Maxima can be integrated into R bringing the full power of computer algebra
  to the R community.
 
  Hans W. Borchers
  ABB Corporate Research
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] general inverse solver?

2009-02-10 Thread Hans W. Borchers
Gabor Grothendieck ggrothendieck at gmail.com writes:

 
 I am not sure what the point of all this is.


The point of all this is:

1.  Some users of R here on the list apparently would like to have more powerful
CAS functionalities than Yacas can provide.

2. Many of the solution hints to Ryacas on this list were simply not solving the
resp. problem, so the 80/20 statement of yours may be downright wrong.

3. Maxima does incorporate a socket server and can be integrated into other
systems, see EULER (do you never listen?).

What are you mourning about? Of course, it was great that you made available
Yacas for R. But others have the right to discuss their wishes here w/o being
'slammed' every time with questionable hints to Ryacas.

I agree arguments have been exchanged and we should stop here. After this
discussion, I wish someone would think about an Rmaxima package (along the lines
of EULER, e.g.); I can't do it, unfortunately.

Hans Werner Borchers


 This is an R list, not a CAS list.  The recommendation to use yacas is based 
 on
 the fact  that there is an interface between R and yacas.  There is no
 interface between
 R and Maxima so Maxima is not in the running.  Anyone who has used
 Maxima knows that
 is an impressive piece of software but that's not the point.
 
 Regarding, why there is no interface to Maxima, its because its harder
 to interface to
 Maxima than yacas.   There are two problems here:
 
 1. Maxima does not incorporate a socket server as far as I know.
 You would have to write it and that may or may not need an in depth
 understanding of Maxima to do so but in any case represents work.
 With yacas you don't have to write a server since yacas itself already
 contains a server. Just run:
   yacas --server .,.
 and the server side is done. (In the case of yacas it would also be
 possible to use its C interface for an in-process interface or presumably
 the java interface of the new java version of yacas.  With Maxima
 it would be more problematic since its written in Lisp.)
 
 2. Once you have created some sort of communications channel
 then what?  If you want more than a crude interface that passes
 an unprocessed character string to the CAS and then passes one
 back then you will want to translate between R and the CAS.  With
 yacas, OpenMath facilitates this greatly.
 
 The end result is that its more work to meaningfullly interface with
 Maxima than with yacas yet yacas satisfies the majority of
 needs of a CAS.  There is the minority who need more powerful
 algorithms but I think it was a reasonable step to handle the 80%
 that can be accommodated most easily first by using yacas. The
 remaining 20%, which is what the other respondents to this thread
 are discussing will in part be addressed in the future possibly, in
 part, by the second project I am working on now.  Perhaps some
 would quibble with the 80/20 and if you are in the 20 it probably
 seems like 100 but we will never settle that question definitively.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] general inverse solver?

2009-02-09 Thread Hans W. Borchers

I know that Ryacas is promoted here whenever requests about symbolic algebra
or calculus appear on the R-help list. But to say the truth, Yacas itself is
a very very limited Computer Algebra System and looking onto its home page
it appears the development will stop or has stopped anyway.

It would be fair to clearly state that there is no R package to solve
somewhat more involved symbolic mathematical problems. One could then point
the requestor to one of the open source Computer Algebra Systems  (CAS) such
as Maxima or Axiom.

Interestingly, the free Math Toolbox Euler by Grossmann has integrated
Maxima into its numerical environment in a way that is really useful for
numerical and symbolic computations. I could imagine that in a similar way
Maxima can be integrated into R bringing the full power of computer algebra
to the R community.

Hans W. Borchers
ABB Corporate Research


Postscript

The Euler Mathematical Toolbox is a powerful, versatile, and open source
software for numerical and symbolic computations ... Euler supports symbolic
mathematics using the open algebra system Maxima.

http://mathsrv.ku-eichstaett.de/MGF/homes/grothmann/euler/



Gabor Grothendieck wrote:
 
 The forms of equations are limited but it's not limited to just one:
 
 library(Ryacas)
 Loading required package: XML
 x <- Sym("x")
 y <- Sym("y")
 Solve(List(x+y == 2, x-y == 0), List(x, y))
 [1] "Starting Yacas!"
 expression(list(list(x == 2 - y, y == 1)))
 
 
 On Mon, Feb 9, 2009 at 7:45 PM, Carl Witthoft c...@witthoft.com wrote:
 Gabor G a ecrit:
 Check out the Ryacas package.   There is a vignette with some
 examples.

 
 Which led me to the manuals for yacas itself.  I'm guessing there may be
 a
 way to use yacas'  AND construct to combine a few equations and then
 hope
 the Newton Solver can work with that, but it's not clear that will work.

 TK!Solver is nice because you aren't limited to linear equations, nor to
 equations which fit into a matrix structure, and because it's legal to
 have more than one unknown to be back-solved (assuming the problem is not
 under- or over-defined, of course).

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 

-- 
View this message in context: 
http://www.nabble.com/general-inverse-solver--tp21902788p21928972.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Event sequence analysis

2009-02-03 Thread Hans W. Borchers
Jean-Pierre Müller jean-pierre.mueller at unil.ch writes:

 Maybe TraMineR ? http://mephisto.unige.ch/traminer
 
 HTH,

Thanks for pointing out TraMineR; it is a very interesting package on 
sequences of events and states in social science. Unfortunately, it 
appears not provide the kind of windowing or counting of subsequences 
that I was asking for. 

Therefore, I am still looking for implementations of procedures for 
event sequence analysis as provided in some data mining tools and 
applications, see for example the article cited below. 

Any help would be very much appreciated.

Thanks, Hans Werner Borchers


Le 2 févr. 09 à 13:31, Hans W. Borchers a écrit :
Dear R help,

I am analyzing sequences of events described by time and a unique event 
tag. And I am searching for recurring patterns where patterns have to 
show up in a certain time window, e.g. 5 or 10 minutes. Of course, 
inbetween these events other events may occur. 

I have applied basket analysis approaches like apriori or 'frequent item 
set' algorithms with interesting results but these methods do not take 
into account the exact succession of events. I also looked into the 
'Generalized Sequential Pattern' function of Weka, but the 
implementation in Weka does not allow for a time window (as far as I 
understand). 

Are there other sequence analysis implementations available in R? -- 
For instance in the realm of the 1997 paper Discovery of frequent 
episodes in event sequences by H. Mannila et al. 

Please no BioConductor hints as they are meaning something different 
with (genetic) sequence analysis. 


Very best,  Hans Werner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Event sequence analysis

2009-02-02 Thread Hans W. Borchers
Dear R help,

I am analyzing sequences of events described by time and a unique event tag. And
I am searching for recurring patterns where patterns have to show up in a
certain time window, e.g. 5 or 10 minutes. Of course, inbetween these events
other events may occur.

I have applied basket analysis approaches like apriori or 'frequent item set'
algorithms with interesting results but these methods do not take into account
the exact succession of events. I also looked into the 'Generalized Sequential
Pattern' function of Weka, but the implementation in Weka does not allow for a
time window (as far as I understand).

Are there other sequence analysis implementations available in R? -- For
instance in the realm of the 1997 paper Discovery of frequent episodes in event
sequences by H. Mannila et al.

Please no BioConductor hints as they are meaning something different with
(genetic) sequence analysis.

Very best,  Hans Werner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to get solution of following polynomial?

2009-01-12 Thread Hans W. Borchers
RON70 ron_michael70 at yahoo.com writes:

 
 
 Hi Ravi, Thanks for this reply. However I could not understand meaning of
 vectorizing the function. Can you please be little bit elaborate on that?
 Secondly the package polynomial is not available in CRAN it seems. What is
 the alternate package?
 
 Thanks,


Ravi means 'PolynomF' which is an improved version of the old polynomial
package.

You do not need to recreate the polynomial from points. Instead, calculate 
the exact polynomial:

library(PolynomF)
z <- polynom()

p11 <- 1 - A1[1,1]*z - A2[1,1]*z^2 - A3[1,1]*z^3 - A4[1,1]*z^4
# ...
p <- p11*p22 - p12*p21

There is probably a shorter way to generate these four polynoms p11, ..., p22.
Anyway, the result is

p
# 1 - 1.18*x + 0.2777*x^2 - 0.2941*x^3 - 0.1004*x^4 + 0.3664*x^5 -
# 0.0636*x^6 + 0.062*x^7 - 0.068*x^8

solve(p)
# [1] -1.365976+0.00i -0.737852-1.639581i -0.737852+1.639581i
# [4] -0.012071-1.287727i -0.012071+1.287727i  1.00+0.00i
# [7]  1.388794-0.281841i  1.388794+0.281841i

and the real solutions are 1.0 and -1.365976 !

Regards,  Hans Werner


 Ravi Varadhan wrote:
  
  Hi,
  
  You can use the polynomial package to solve your problem.
  
  The key step is to find the exact polynomial representation of fn(). 
  Noting that it is a 8-th degree polynomial, we can get its exact form
  using the poly.calc() function.  Once we have that, it is a simple matter
  of finding the roots using the solve() function.
  
   require(polynomial)
   
   a  <- c(-0.07, 0.17)
   b  <- c(1, -4)
   cc <- matrix(c(0.24, 0.00, -0.08, -0.31), 2)
   d  <- matrix(c(0, 0, -0.13, -0.37), 2)
   e  <- matrix(c(0.2, 0, -0.06, -0.34), 2)
   A1 <- diag(2) + a %*% t(b) + cc
   A2 <- -cc + d
   A3 <- -d + e
   A4 <- -e
   
   # I am vectorizing your function
   fn <- function(z)
     {
       sapply(z, function(z) {
           y <- diag(2) - A1*z - A2*z^2 - A3*z^3 - A4*z^4
           det(y)
       })
     }
   
   x <- seq(-5, 5, length=9)   # note we need 9 points to exactly determine an
                               # 8-th degree polynomial
   y <- fn(x)
   
   p <- poly.calc(x, y)   # uses Lagrange interpolation to determine polynomial form
   p
   1 - 1.18*x + 0.2777*x^2 - 0.2941*x^3 - 0.1004*x^4 + 0.3664*x^5 -
   0.0636*x^6 + 0.062*x^7 - 0.068*x^8 
   
   # plot showing that p is the exact polynomial representation of fn(z)
   pfunc <- as.function(p)
   x1 <- seq(-5, 5, length=100)
   plot(x1, fn(x1), type="l")
   lines(x1, pfunc(x1), col=2, lty=2)   
   
   solve(p)   # gives you the roots (some are, of course, complex)
  
  
  Hope this helps,
  Ravi.
  
  
  
  Ravi Varadhan, Ph.D.
  Assistant Professor,
  Division of Geriatric Medicine and Gerontology
  School of Medicine
  Johns Hopkins University
  
  Ph. (410) 502-2619
  email: rvaradhan at jhmi.edu
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Execution of a .sce file through R

2008-12-31 Thread Hans W. Borchers
ykank spicyankit4u at gmail.com writes:

 
 
 Dear R Users
 
 Does some has any idea about how to execute a scilab file(.sce file) from
 the Terminal in R.
 Any kind of guidance would be highly welcomed and appreciated.

Have you looked into your Scilab directory? In mine -- under Windows XP and
Scilab version 5.0.3 -- there is no executable file 'scilab.exe'. The file to
call is 'Scilex.exe' (and 'WScilex.exe' for the GUI) which can also easily been
seen by following the path of the Scilab link in the Start menu.

The following call works for me:

system("path-to-scilab/bin/Scilex -f filename.sce")

where I have to end the 'sce' file with a quit statement.

Hans Werner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] getting ISO week

2008-12-11 Thread Hans W. Borchers
Gabor Grothendieck ggrothendieck at gmail.com writes:

 
 format(d, "%U") and format(d, "%W") give week numbers using
 different conventions.  See ?strptime

Gabor,

the results of format(aDate, "%W") appear to be incorrect anyway, see:

format(as.Date("2008-01-01"), "%W")   # -> "00"

There is never a week 0, this should be week 1.

format(Sys.Date(), "%W")              # -> "49"

but my business calendar says today's (Dec. 11, 2008) week is week 50,
which is what Brian Ripley's proposed 'strftime(x, "%V")' returns.

There could be a format %E (not used up to now) for returning a
correct week number according to the European standard.
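For completeness, a small side-by-side of the three conventions (the "%V"
output may depend on the platform's strftime implementation):

    d <- as.Date("2008-12-11")
    format(d, "%U")   # "49"  US convention, weeks start on Sunday
    format(d, "%W")   # "49"  UK convention, weeks start on Monday
    format(d, "%V")   # "50"  ISO 8601 week number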

Yours,  Hans Werner

 On Thu, Dec 11, 2008 at 7:43 AM, Gustaf Rydevik
 gustaf.rydevik at gmail.com wrote:
  Hi all,
 
  Is there a simple function already implemented for getting the ISO
  weeks of a Date object?
  I couldn't find one, and so wrote my own function to do it, but would
  appreciate a pointer to the default way. If a function is not yet
  implemented, could the code below be of interest to submit to CRAN?
 
  Best Regards,
 
  Gustaf
 
  
 
 ... [rest deleted]
 __
 R-help at r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] getting ISO week

2008-12-11 Thread Hans W. Borchers
Gabor Grothendieck ggrothendieck at gmail.com writes:

 
 According to the definition in ?strptime (which is not the same as the
 ISO definition):
 
 format(x, "%W") returns
 
 Week of the year as decimal number (00–53) using Monday as the first
 day of week (and typically with the first Monday of the year as day 1
 of week 1). The UK convention.
 
 The first day of 2008 is a Tuesday which means that 2008 starts in week 0.

Yes I read that but it is still misleading and -- I think -- incorrect. 
See www.dateandtime.org/calendar to find out that this is week 50 even 
in the UK.
We would have had a lot of misplaced business meetings in our company if 
the week numbers in Great Britain, Germany, and Sweden would actually be 
different.

Hans Werner

 
  ... [rest deleted]
 __
 R-help at r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Pre-model Variable Reduction

2008-12-09 Thread Hans W. Borchers
Harsh singhalblr at gmail.com writes:

 
 Hello All,
 I am trying to carry out variable reduction. I do not have information
 about the dependent variable, and have only the X variables as it
 were.
 ...
 I looked for other R packages that allow me to do variable reduction
 without considering a dependent variable. I came across 'dprep'
 package but it does not have a Windows implementation.

I doubt that you will find what you are longing for, but: there is a Windows
version available at the home page of the 'dprep' package at
http://math.uprm.edu/~edgar/dprep.html. 
This version 2.0 can be loaded without errors into R 2.8.0 though it appears 
not to be fully compliant with the tests on CRAN.

 Moreover, I have a dataset that contains continuous and categorical
 variables, some categorical variables having 3 levels, 10 levels and
 so on, till a max 50 levels (E.g. States in the USA).
 
 Any suggestions in this regard will be much appreciated.
 
 Thank you
 
 Harsh Singhal
 Decision Systems,
 Mu Sigma, Inc.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to solve following equation?

2008-12-02 Thread Hans W. Borchers
 I need to solve a equation like this :

 a = b/(1+x) + c/(1+x)^2 + d/(1+x)^3

 where a,b,c,d are known constant. Is there any R-way to do that?

Multiplying this expression with (1+x)^3 leads to a polynomial equation.
I would certainly recommend the 'PolynomF' package here:


# install.packages("PolynomF")
library(PolynomF)

options(digits=16)

x <- polynom()

a <- b <- c <- d <- 1
p <- a*(1+x)^3 - b*(1+x)^2 - c*(1+x) - d
p
# -2 + 2*x^2 + x^3

solve(p)

# [1] -1.419643377607080-0.6062907292072i
# -1.419643377607080+0.6062907292072i
# [3]  0.839286755214161+0.0i


The solution x0 = 0.839286755214161 is correct up to the last digit, as can be
verified by using a computer algebra system. This also shows that Ryacas is
quite exact in this task.

Hans Werner

Gabor Grothendieck ggrothendieck at gmail.com writes:
 
 Assume a = 1.  If not set b = b/a, etc.
 Now use (1) uniroot
 
  f <- function(x) b + c/(1+x) + d/(1+x)^2 - 1 - x
  uniroot(f, 0:1)
 $root
 [1] 0.8392679
 
 $f.root
 [1] 3.049818e-05
 
 $iter
 [1] 3
 
 $estim.prec
 [1] 6.103516e-05
 
 or multiply through by 1+x
 and subtract 1 from both sides giving
 x = b + c/(1+x) + d/(1+x)^2 - 1
 and iterate that.
 
  a <- b <- c <- d <- 1
  x <- 0
  for(i in 1:25) {
 + x <- b + c/(1+x) + d/(1+x)^2 - 1
 + print(x)
 + }
 [1] 2
 [1] 0.444
 [1] 1.171598
 [1] 0.6725419
 [1] 0.9553676
 [1] 0.7729558
 [1] 0.8821595
 [1] 0.8135892
 [1] 0.8554268
 [1] 0.829437
 [1] 0.8454056
 [1] 0.835527
 [1] 0.8416126
 [1] 0.837854
 [1] 0.8401717
 [1] 0.838741
 [1] 0.8396236
 [1] 0.839079
 [1] 0.839415
 [1] 0.8392076
 [1] 0.8393356
 [1] 0.8392566
 [1] 0.8393053
 [1] 0.8392753
 [1] 0.8392938
 
 On Mon, Dec 1, 2008 at 9:47 PM, RON70 ron_michael70 at yahoo.com wrote:
 
  I need to solve a equation like this :
 
  a = b/(1+x) + c/(1+x)^2 + d/(1+x)^3
 
  where a,b,c,d are known constant. Is there any R-way to do that?
 
  Thanks in advance
  --
  View this message in context:
http://www.nabble.com/How-to-solve-following-equation--tp20785063p20785063.html
  Sent from the R help mailing list archive at Nabble.com.
 
  __
  R-help at r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 __
 R-help at r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] optimization problem

2008-12-01 Thread Hans W. Borchers

Why not use one of the global optimizers in R, for instance 'DEoptim', and
then apply optim() to find the last six decimals? I am relatively sure that
the Differential Evolution operator has a better chance to come near a
global optimum than a loop over optim(), though 'DEoptim' may be a bit slow
(only for quite large numbers of parameters).
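A minimal sketch of that two-stage idea (assuming the 'DEoptim' package is
installed; the two-dimensional Rastrigin function serves as a stand-in test
problem):

library(DEoptim)
fr <- function(x) 20 + sum(x^2 - 10*cos(2*pi*x))      # Rastrigin, minimum 0 at (0,0)
de <- DEoptim(fr, lower = c(-5, -5), upper = c(5, 5),
              control = DEoptim.control(trace = FALSE))
loc <- optim(de$optim$bestmem, fr, method = "BFGS")   # polish the last decimals locally
loc$par;  loc$value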

Regards,  Hans Werner



Mike Prager wrote:
 
 tedzzx [EMAIL PROTECTED] wrote:
 
 
 If I want to find out the globle minia, how shoul I change my code?
 
 I sometimes use optim() within a loop, with random starting
 values for each iteration of the loop. You can save the
 objective function value each time and pick the best solution.
 Last time I did that, I ran it 100 times.
 
 That procedure does not guarantee finding the global minimum.
 However, it does make it *more likely* to find the global minmum
 *within the range of your starting values*.
 
 Often, I make a boxplot of the various results. If they don't
 show a strong mode, there is a data or model problem that needs
 to be addressed. For example, the solution may be poorly defined
 by the data, or the model may be specified with confounded
 parameters.
 
 -- 
 Mike Prager, NOAA, Beaufort, NC
 * Opinions expressed are personal and not represented otherwise.
 * Any use of tradenames does not constitute a NOAA endorsement.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/optimization-problem-tp20730032p20784256.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Examples of advanced data visualization

2008-11-29 Thread Hans W. Borchers
Tom Backer Johnsen backer at psych.uib.no writes:

 [...]
 The question is interesting, but what I have a somewhat negative 
 reaction to is the next passage:
  
  Please answer to my e-mail address. In case enough interesting material 
  comes up, I will enter a summary here.
 
 It is nice that you are willing to summarize whatever appears, but 
 somewhat arrogant in my eyes.  There might be things appearing that you 
 do not regard as of first interest that might be of interest to others. 
  Therefore, all parts of the discussion or responses should be 
 public as well.  The response of David Winsemius pointing at (among 
 other things) at the presentation of Rosling at TED is in my eyes a very 
 good start.
 
 In other words, I therefore suggest that the list ignores the last 
 paragraph in the question from you.
  
  Hans Werner Borchers
  ABB Corporate Research
 
 Tom
 

Tom is right here. What I should have said / intended to say was something like
"You can _also_ answer to my e-mail address". I feel sorry that this
misstatement caused a distraction from my request.

Coming back to the original question: We all know the many pages about R
graphics and its numerous features and skills. Therefore, I am more interested
in data visualizations not yet implemented or made available in R.

I am also wondering if the R Wiki would be a better place to publish summaries
on topics discussed here. On the mailing list, summaries are forgotten within
one or two months time, only to be retrieved in specific searches.

Comments on any of the issues mentioned are welcome.

Hans Werner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Examples of advanced data visualization

2008-11-28 Thread Hans W. Borchers
Dear R-help,

I am looking for ideas and presentations of new and advanced data visualization
methods. As an example of what I am searching for, the 'Many Eyes' pages at

http://manyeyes.alphaworks.ibm.com/manyeyes/

may provide a good paradigm. I would be interested even if it will not be easy
to implement such examples in R, e.g. because of the interactive nature of these
graphical displays.

Please answer to my e-mail address. In case enough interesting material comes
up, I will enter a summary here.

Hans Werner Borchers
ABB Corporate Research

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Which data structure in R can mimic hash in PERL best?

2008-11-15 Thread Hans W. Borchers

 Dear all,

Which data structure in R can mimic hash in PERL? I'd like to set
 up a lookup table, which could be accomplished by HASH if using PERL.
 Which of the data structures in R is the most efficient for lookup
 table?
 Thanks for your help.
 
 Best regards,
 Leon

The regular answer to this is named arrays/vectors or environments 
and has been given several times here on R-help. Unfortunately, 
everybody is applying a different terminology such as 'maps', 'hashes', 
'tables' or 'dictionaries', etc., so it's difficult to search for those 
entries.

Below I enclose a solution that I wrote some months ago. It is 
given in pythonic language, but you can easily translate it into the 
corresponding Perl terms. One drawback is that one cannot use numbers
as keys, only keys following the naming conventions for variable names
are allowed.

Hans Werner Borchers
ABB Corporate Research


#-- Define functions on Hash Tuples (Python alike) --------------------

def.h   <- function() new.env(hash=TRUE)
len.h   <- function(dict) length(ls(envir=dict))
set.h   <- function(key, val, dict) assign(key, val, envir=dict)
get.h   <- function(key, dict, default=NULL) {
    if (exists(key, envir=dict)) { get(key, envir=dict)
    } else { default }
}
has_key <- function(key, dict) exists(key, envir=dict)
keys.h  <- function(dict) ls(envir=dict)
items.h <- function(dict) as.list(dict)
values.h <- function(dict, mode='character') {
    l <- as.list(dict)
    n <- length(l)
    if (n == 0) return(invisible(NULL))    # return early when the table is empty
    v <- vector('character', n)
    for (i in 1:n) v[i] <- l[[i]]
    if (mode == 'numeric') v <- as.numeric(v)
    return(v)
}
del.h <- function(key, dict) {
    if (exists(key, envir=dict)) {
        val <- get.h(key, dict)
        rm(list=c(key), envir=dict)
    } else {
        val <- NULL
    }
    invisible(val)
}
clear.h <- function(dict) {
    rm(list=keys.h(dict), envir=dict)
}
#----------------------------------------------------------------------
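A short usage example (keys and values made up) might look like this:

d <- def.h()
set.h("apple", 3, d);  set.h("pear", 7, d)
get.h("apple", d)                 # 3
get.h("plum", d, default = 0)     # 0  (missing key falls back to the default)
keys.h(d)                         # "apple" "pear"
values.h(d, mode = 'numeric')     # 3 7  (order not guaranteed)
del.h("pear", d);  len.h(d)       # 1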

-- 
View this message in context: 
http://www.nabble.com/Which-data-structure-in-R-can-mimic-hash-in-PERL-best--tp20515707p20517058.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] strsplit (regex)

2008-11-11 Thread Hans W. Borchers
stephen sefick ssefick at gmail.com writes:

 
 #how do I break these up into first two letters (RM), number, and then
 the last part
 #is there an easily accessible regex tutorial on the internet?

For regular expressions, the perl man pages at http://perldoc.perl.org/perlre.html
are quite good and present the essentials in condensed form, but 
still very useful. (Some constructs may not work outside of Perl.)

A more elaborate tutorial is to be found at Regular-Expressions.info, i.e. 
http://www.regular-expressions.info/. Of course, there are many, many more, 
see the Open Directory for one long list. And each programming language has its 
own page on regular expressions.

At the Regular Expression Library http://regexlib.com you can search for and 
copy regular expressions, for example there are 20 patterns returned when 
searching for regular expressions on 'floats'.
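For the concrete question, a small base-R sketch (the example strings are
invented, since the original vector was truncated):

x <- c("RM118.2", "RM97", "RM205.1L")                    # hypothetical codes
data.frame(letters = sub("^([A-Za-z]{2}).*$",       "\\1", x),
           number  = sub("^[A-Za-z]{2}([0-9]+).*$", "\\1", x),
           rest    = sub("^[A-Za-z]{2}[0-9]+(.*)$", "\\1", x))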

Hans Werner Borchers

 v = (structure(1:122, ...
 
 thanks


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ttda and text-mining

2008-11-10 Thread Hans W. Borchers
LE PAPE Gilles lepape.gilles at neuf.fr writes:

 
 The ttda package was devoted to text-mining. It seems to be no more
 available. Did the name change ? Are other packages devoted to text-mining?

The 'ttda' package is deprecated as is noted on the Task View Natural Language
Processing. The old code is still available if you follow the link. Still I
would propose to convert to 'tm' instead. Another possibility is the 'ReadMe'
package at http://gking.harvard.edu/readme/ for automated content analysis.
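A very small sketch of getting started with 'tm' (assuming a reasonably recent
version of the package; the texts are invented):

library(tm)
docs <- c("text mining in R", "mining text with the tm package")
corp <- Corpus(VectorSource(docs))        # build a corpus from a character vector
tdm  <- TermDocumentMatrix(corp)          # term-document matrix for further analysis
inspect(tdm)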

Hans Werner Borchers

 Many thanks
 
 Gilles
 
   [[alternative HTML version deleted]]
 
 __
 R-help at r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to do knn regression?

2008-09-28 Thread Hans W. Borchers
 This is a summary of discussions between Shengqiao Li and me,
 entered here as a reference for future requests on knn regression
 or missing value imputation based on a nearest neighbor approach.

There are several functions that can be used for 'nearest neighbor'
classification such as knn, knn1 (in package class), knn3(caret),
kknn(kknn), ipredknn(ipred), sknn(klaR), or gknn(cba).

To utilize these functions for 'nearest neighbor' regression would be
difficult. There is actually just one knn-like functions that can be
applied to continuous variables:

kknn(kknn)
 uses a formula and looks at the type of the target variable:
 if the target variable is continuous, it will return a regression
 result for each row in the learning set

And two implementations of functions that simply return the indices
and distances of k nearest neighbors for further processing:

ann(yaImpute)
constructs kd- or bd-trees to find k nearest neighbors
and returns indices and distances of those neighbors
(it may kill the whole R process when matrices are too big)
[Remark: Watch out, default distance is sum of squares]

knnFinder(knnFinder)
constructs a kd-tree to find the k nearest neighbors;
has so many bugs and quirks that it is almost unusable;
not maintained anymore (perhaps should be removed from CRAN)

The other approach is to use a distance function and sort 'manually'
to find the nearest neighbors and their values for the target variable.
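A minimal base-R sketch of that 'manual' approach (Euclidean distances, simple
averaging; the commented lines show invented usage):

knn.reg <- function(train, y, test, k = 5) {
    apply(test, 1, function(z) {
        d <- sqrt(colSums((t(train) - z)^2))   # distances to all training rows
        mean(y[order(d)[1:k]])                 # average response of the k nearest
    })
}
# X <- matrix(rnorm(1000), 100);  y <- rowSums(X)
# knn.reg(X, y, X[1:5, ], k = 7)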

'dist' itself is not really appropriate as it can only be applied to 
_one_ matrix, whereas here we need something like dist(A, B). Combining
A and B into one matrix is often forbidden as it needs too much memory.

dists(cba)
computes a distance matrix between rows of two matrices
can be a bit slow for very big matrices (slower than 'dist')
[Rem: default distance is square root of sum of squares]

I would appreciate hearing from you if I missed something.

Hans Werner Borchers
ABB Corporate Research

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to do knn regression?

2008-09-19 Thread Hans W. Borchers
Shengqiao Li shli at stat.wvu.edu writes:
 
 Hello,

 I want to do regression or missing value imputation by knn. I searched 
 r-help mailing list. This question was asked in 2005. ksmooth and loess 
 were recommended. But my case is different. I have many predictors 
 (p>20) and I really want to try knn with a given k. ksmooth and loess use 
 band width to define neighborhood size. This contrasts to knn's variable 
 band width via fixing a k. Are there any such functions I can use in R 
 packages? 
 

The R package 'knnFinder' provides a nearest neighbor search based on 
kd-tree data structures. Therefore, it is extremely fast 
even for very large data sets. It returns as many neighbors as you need 
and can also be used, e.g., for determining distance-based outliers.

Hans Werner Borchers
ABB Corporate Research

 
 Your help is highly appreciated.
 
 Shengqiao Li
 
 __
 R-help at r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to find a shift between two curves or data sets

2008-09-19 Thread Hans W. Borchers

Dear Sébastien,

identifying similarity in curves or time series is one of the main tasks 
in the quite recent field of 'Functional Data Analysis' (FDA). See the 
2005 book by Ramsay and Silverman from Springer Verlag or the corresponding 
Web page at http://www.psych.mcgill.ca/misc/fda/.

The 'fda' package on CRAN contains functions developed by Ramsay and 
Silverman and described in that book. You may also be interested in the 'dtw' 
package that provides functions for comparing time series under time 
'warping', i.e., distortions along the time axis.
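Aside from FDA/dtw, a very small base-R sketch of estimating a single x offset
by interpolation and least squares (the variable names are placeholders for the
two time/value series):

shift.sse <- function(s, t1, y1, t2, y2) {
    y2s <- approx(t2 + s, y2, xout = t1)$y   # series 2 shifted by s, on the grid of series 1
    sum((y1 - y2s)^2, na.rm = TRUE)
}
# optimize(shift.sse, interval = c(-10, 10),
#          t1 = t1, y1 = y1, t2 = t2, y2 = y2)$minimum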

Hans Werner Borchers
ABB Corporate Research



Sébastien Durand wrote:
 
 Hello,
 
 I have a issue here!
 
 I need to find the offset or shift between two data sets.
 
 Each data set does not start nor end at the same time and they don't even have 
 the same sampling interval (which by the way isn't constant in either of 
 the data sets).
 
 It must be known that the data expressed in both data sets are 
 coming from the same sensor so they should be the same!
 
 What I am looking for is a way to minimize the differences between the 
 two curves by applying an x offset.
 
 Would any body have a clue how to perform such action, I simply have no 
 idea were to start.
 
 Thank you very very much.
 
 S.
 

-- 
View this message in context: 
http://www.nabble.com/How-to-find-a-shift-between-two-curves-or-data-sets-tp19561272p19567696.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Convex optimization in R?

2008-09-11 Thread Hans W. Borchers
Hesen Peng hesen.peng at gmail.com writes:

 
 Hi my R buddies,
 
 I'm trying to solve a specific group of convex optimization in R. The
 admissible region is the inside and surface of a multi-dimensional
 eclipse area and the goal function is the sum of absolution values of
 the variables. Could any one please tell me whether there's a package
 in R to do this? Thank you very much,


To my knowledge there does not exist a designated R package for convex
optimization. Also, in the Optimization task view the AMS nomenclature
90C25 for Convex programming (CP) is not mentioned.

On the other hand, this task view may give you some ideas on how to
solve your problem with one of the available optimization packages.
For instance, a problem including sums of absolute values can be
modeled as a linear program with mixed integer variables (MILP).

There is a free module for 'disciplined' convex optimization, CVX, that
can be integrated with Matlab or Python. Hopefully, there will be a CVX
R package in the future (as has been announced/promised).

Hans Werner Borchers
ABB Corporate Research


 Best wishes,
 
 --
 Hesen Peng
 http://hesen.peng.googlepages.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Exponential smoothing?

2008-08-19 Thread Hans W. Borchers
Öhagen Patrik Patrik.Ohagen at mpa.se writes:
 
 Dear List,
 
 I have used all my resources (i.e. help.search) and I still haven't been
 able to figure out if there is an Exponential Smoothing command in R.

A few weeks ago the book "Forecasting with Exponential Smoothing" by 
Hyndman et al. appeared; see http://www.exponentialsmoothing.net/.

In connection with this book the 'forecasting' bundle is available
from the CRAN package repository, containing functions and data sets
described in the book.
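A tiny example of the exponential smoothing state space models from that bundle
(assuming the 'forecast' package is installed):

library(forecast)
fit <- ets(AirPassengers)      # automatic exponential smoothing model selection
fc  <- forecast(fit, h = 12)   # forecasts for the next 12 months
plot(fc)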

//  Hans Werner Borchers

 Thank you in advance!


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nonlinear constrained optimization

2008-08-19 Thread Hans W. Borchers
Paul Smith phhs80 at gmail.com writes:
 
 Up to my best knowledge, R cannot deal with optimization problems with
 nonlinear constraints, unless one uses the penalty method. Outside R,
 Ipopt and Algencan can solve problems like yours, but one needs to
 program in AMPL and/or C/Fortran.
 
 Paul
 

Please have a look at the 'Rdonlp2' package at http://arumat.net/Rdonlp2/.
It provides non-linear optimization with nonlinear constraints.

For more information on advanced optimization procedures available in R take
a short trip to the Optimization Task View.

// Hans Werner Borchers

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] exporting adaBoost model

2008-08-18 Thread Hans W. Borchers

One way to port these kinds of models between applications is the 
Predictive Model Markup Language (PMML). The R package 'pmml' supports 
linear regression, rpart, SVM, and others, but not adaBoost. On the other
hand, not even the Python machine learning library Orange has
an import function for PMML. 
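For the PMML route, a small sketch (assuming the 'pmml' and 'XML' packages;
adaBoost itself is not covered, so an rpart model serves as the stand-in
example):

library(rpart);  library(pmml);  library(XML)
fit <- rpart(Species ~ ., data = iris)
saveXML(pmml(fit), file = "iris_rpart.pmml")   # PMML document readable by other tools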

Perhaps a more attractive possibility is to call R functions from Python
through the 'Rpy' http://rpy.sourceforge.net/ Python interface to R.
You could send data from Python to R to be handled by adaBoost and
get back the results. Of course, R and Python need to be installed on your
machine and you cannot generate a single executable.
//  Hans Werner Borchers


Bob Flagg wrote:
 
 Dear all,
 
 I'm using adaBoost from the ada package to build a classification
 model.  After training the model in R I'd like to use it in a
 Python application.  Is it possible to export the model in some
 way to make translating into python easier?  
 
 Any help would be greatly appreciated. Thanks.
 Bob
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/exporting-adaBoost-model-tp19005415p19026563.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] FastICA

2008-08-12 Thread Hans W. Borchers
Maura E Monville maura.monville at gmail.com writes:

 
 Is the FastICA R implementation as good as the MatLab Implementation ?
 I would appreciate talking to someone who has used FastICA for R.

The fastICA packages for Matlab and R (and there is even a version for Python)
have a common origin at the Helsinki University of Technology. I regularly use
both Matlab and R and do not see much of a difference between packages like these.
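
For what it's worth, a minimal usage sketch of the R side (assuming the
'fastICA' package is installed; the mixing example is invented):

    library(fastICA)
    S <- cbind(sin((1:1000)/20),
               rep(c(-1, 1), each = 50, length.out = 1000))   # two source signals
    A <- matrix(c(0.3, 0.7, 0.6, 0.4), 2, 2)                  # mixing matrix
    X <- S %*% A                                              # observed mixtures
    ica <- fastICA(X, n.comp = 2)                             # unmix
    str(ica$S)                                                # estimated sources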

//  Hans Werner Borchers

 
 Thank you very much.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] FastICA

2008-08-12 Thread Hans W. Borchers
Prof Brian Ripley ripley at stats.ox.ac.uk writes:
 
 On Tue, 12 Aug 2008, someone with no signature wrote:
 
  Maura E Monville maura.monville at gmail.com writes:
 
 
  Is the FastICA R implementation as good as the MatLab Implementation ?
  I would appreciate talking to someone who has used FastICA for R.
 
  The fastICA packages for Matlab and R (and there is even a version for 
  Python) have a common origin at the Helsinki University of Technology.
 
 Have you actually looked at the R one?  The code has no connection with 
 Helsinki University of Technology.  'Credit where credit is due' and all 
 that.

Oh yes, you are right and I am sorry.

I started working with fastICA for Matlab, referring to the page

http://www.cis.hut.fi/projects/ica/fastica/

and because of its uncommented link to fastICA in R I always assumed the two
used the same code base (never looking at the 'credits').

Well, then you are the right person to answer this question.

  Matlab and R, not seeing much of a difference in packages like these.
 
  //  Hans Werner Borchers
 
 Or quite possibly someone else using Mr Borchers name.

Not really.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] FastICA

2008-08-12 Thread Hans W. Borchers

I may not have been as wrong as Prof. Ripley suggested when I wrote "The
fastICA packages for Matlab and R (...) have a common origin at the Helsinki
University of Technology."

Please consider the following lines from the 'fastICA' help page (?fastICA):

   FastICA algorithm

   Description:

This is an R and C code implementation of the FastICA algorithm
of Aapo Hyvarinen et al. (URL: http://www.cis.hut.fi/aapo/)
to perform Independent Component Analysis (ICA) and Projection
Pursuit.

So the C code base is not the same, I guess, but "The code has no connection
with Helsinki University of Technology" does not seem to be totally correct.

//  Hans Werner Borchers



Prof Brian Ripley wrote:
 
 On Tue, 12 Aug 2008, someone with no signature wrote:
 
 Maura E Monville maura.monville at gmail.com writes:


 Is the FastICA R implementation as good as the MatLab Implementation ?
 I would appreciate talking to someone who has used FastICA for R.

 The fastICA packages for Matlab and R (and there is even a version for
 Python)
 have a common origin at the Helsinki University of Technology. I
 regularly use
 
 Have you actually looked at the R one?  The code has no connection with 
 Helsinki University of Technology.  'Credit where credit is due' and all 
 that.
 
 Matlab and R, not seeing much of a difference in packages like these.

 //  Hans Werner Borchers
 
 Or quite possibly someone else using Mr Borchers name.
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UK            Fax:  +44 1865 272595
 
 

-- 
View this message in context: 
http://www.nabble.com/FastICA-tp18928778p18940982.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] recursive root finding

2008-08-08 Thread Hans W. Borchers

As your curve is defined by its points, I don't see any reason to
artificially apply functions such as 'uniroot' or 'optim' (which would be
real overkill in this situation).

First smooth the curve with splines, Savitzky-Golay, or Whittaker
smoothing, etc.; then loop through the sequence of points and compute
the gradient by hand, one-sided, two-sided, or both.

At the same time, mark the indices where the gradient is zero or
changes sign; these are the solutions you are looking for.

With your example, I immediately obtained the following maxima and minima:

x1 = 1.626126
x2 = 4.743243
x3 = 7.851351
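
For the record, a minimal sketch of this recipe (a sketch only, not
necessarily the exact computation behind the numbers above), using the data
from the post quoted below:

    x <- seq(1, 10, length = 1000)
    set.seed(11)
    y <- jitter(sin(x), a = 0.2)
    fit  <- smooth.spline(x, y)              # 1. smooth the noisy curve
    ys   <- predict(fit, x)$y
    grad <- diff(ys) / diff(x)               # 2. one-sided gradient on the grid
    idx  <- which(diff(sign(grad)) != 0)     # 3. where the gradient changes sign
    (x[idx] + x[idx + 1]) / 2                #    approximate extrema locations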

//  Hans Werner Borchers


Any comments? Maybe the problem was not clear or looked too specific.
I'll add a more bare-bones example, if only to stimulate discussion:

 x <- seq(1, 10, length = 1000)
 set.seed(11) # so that you get the same randomness
 y <- jitter(sin(x), a = 0.2)
 values <- data.frame(x = x, y = y)

 findZero <- function (guess, neighbors, values)
 {
   smooth <- smooth.spline(values$x, values$y)

   obj <- function(x) {
     (predict(smooth, x)$y)^2
   }
   minimum <- which.min(abs(values$x - guess))
   optimize(obj, interval = c(values$x[minimum - neighbors],
                              values$x[minimum + neighbors]))
   # uniroot could be used instead i suppose
 }

 test <- findZero(guess = 6, neigh = 50, values = values)
 plot(x, y)
 abline(h = 0)
 abline(v = test$minimum, col = "red")

Now, I would like to find all (N=)3 roots, without a priori knowledge  
of their location in the interval. I considered several approaches:

1) find all the numerical values of the initial data that are close to
zero within a given threshold. Group these values into N sets, using cut()
and hist() maybe? I could never get this to work; the factors given by
cut() confuse me (I couldn't extract their values). Then, apply the
function given above with the guess given by the center of mass of each
group of zeros.

2) apply findZero once, store the result, then add something big  
(1e10) to the neighboring points and look for a zero again and repeat  
the procedure until N are found. This did not work, I assume because  
it does not perturb the minimization problem in the way I want.

3) once a zero is found, look for zeros on both sides, etc. This quickly
becomes a complicated decision tree when the number of zeros grows, and I
could not find a clean way to implement it.

Any thoughts welcome! I feel like I've overlooked an obvious trick.

Many thanks,

baptiste


On 7 Aug 2008, at 11:49, baptiste auguie wrote:

 Dear list,


 I've had this problem for a while and I'm looking for a more general  
 and robust technique than I've been able to imagine myself. I need  
 to find N (typically N= 3 to 5) zeros in a function that is not a  
 polynomial in a specified interval.

 The code below illustrates this, by creating a noisy curve with  
 three peaks of different position, magnitude, width and asymmetry:

 x <- seq(1, 10, length = 500)
 exampleFunction <- function(x){ # create some data with peaks of
                                 # different scaling and widths + noise
  fano <- function (p, x)
  {
   y0  <- p[1]
   A1  <- abs(p[2])
   w1  <- abs(p[3])
   x01 <- p[4]
   q   <- p[5]
   B1  <- (2 * A1/pi) * ((q * w1 + x - x01)^2/(4 * (x - x01)^2 + w1^2))
   y0 + B1
  }
  p1 <- c(0.1, 1, 1, 5, 1)
  p2 <- c(0.5, 0.7, 0.2, 4, 1)
  p3 <- c(0, 0.5, 3, 1.2, 1)
  y <- fano(p1, x) + fano(p2, x) + fano(p3, x)
  jitter(y, a = 0.005*max(y))
 }

 y <- exampleFunction(x)

 sample.df <- data.frame(x = x, y = y)

 with(sample.df, plot(x, y, t = "l")) # there are three peaks, located
                                      # around x = 2, 4, 5
 y.spl <- smooth.spline(x, y) # smooth the noise
 lines(y.spl, col = "red")


 I wish to obtain the zeros of the first and second derivatives of
 the smoothed spline y.spl. I can use uniroot() or optim() to find
 one root, but I cannot find a reliable way to iterate and find the
 desired number of solutions (3 peaks and 2 inflexion points on each
 side of them). I've used locator() or a guesstimate of the disjoint
 intervals to look for solutions, but I would like to get rid of
 this unnecessary user input and have a robust way of finding a
 predefined number of solutions in the total interval. Something
 along the lines of:

 findZeros <- function(f, numberOfZeros = 3,
                       exclusionInterval = c(0.1, 0.2, 0.1))
 {
 #
 # while (number of solutions found is less than numberOfZeros)
 #   search for a root of f in the union of intervals, excluding a
 #   neighborhood of the solutions already found (exclusionInterval)
 #
 }

 I could then apply this procedure to the first and second  
 derivatives of y.spl, but it could also be useful for any other  
 function.
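
(Not from the original thread; just one possible way to fill in this skeleton,
sketched here: locate sign changes of f on a grid, call uniroot() on each
bracketing interval, and mask an exclusion window around roots already found.)

    findZeros <- function(f, lower, upper, numberOfZeros = 3, exclusion = 0.2) {
        roots <- numeric(0)
        grid  <- seq(lower, upper, length = 1000)
        while (length(roots) < numberOfZeros) {
            ok <- rep(TRUE, length(grid))
            for (r in roots) ok <- ok & (abs(grid - r) > exclusion)  # mask known roots
            fg  <- f(grid)
            chg <- which(diff(sign(fg)) != 0 & ok[-length(ok)] & ok[-1])  # sign changes left
            if (length(chg) == 0) break
            roots <- c(roots, uniroot(f, c(grid[chg[1]], grid[chg[1] + 1]))$root)
        }
        sort(roots)
    }
    # e.g. zeros of the first derivative of the smoothed spline:
    # findZeros(function(z) predict(y.spl, z, deriv = 1)$y, lower = 1, upper = 10)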

 Many thanks for any remark or suggestion!

 baptiste


-- 
View this message in context: 
http://www.nabble.com/recursive-root-finding-tp18868013p18894331.html
Sent from the R help mailing list archive at Nabble.com.
