Re: [R] Dealing with -Inf in a maximisation problem.

2016-11-07 Thread William Dunlap via R-help
Using log1p instead of log improves the accuracy of the 'subtract xmax'
algorithm when the largest x is very close to zero.  Perhaps that is what
is missing in Rf_logspace_add.
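
A quick illustration of why (toy numbers of mine, not from the thread):
when xmax is ~ 0, the 'subtract xmax' sum is 1 plus something tiny, and
log() of that rounds to 0 while log1p() of the tiny part does not:

y <- 1e-20
log(1 + y)   # 0      -- 1 + y rounds to 1 in double precision
log1p(y)     # 1e-20  -- accurate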

test <- function (x) {
    x <- as.numeric(x)
    i <- which.max(x)
    rbind(Rmpfr = as.numeric(log(sum(exp(Rmpfr::mpfr(x, precBits=5000))))),
          Rf_logspace_sum = logspace_sum(x),
          subtract_xmax_log1p = log1p(sum(exp(x[-i] - x[i]))) + x[i],
          subtract_xmax_naive = log(sum(exp(x - x[i]))) + x[i],
          naive = log(sum(exp(x))))
}

> test(c(-1e-50, -46, -47))
                                    [,1]
Rmpfr                1.4404614986241e-20
Rf_logspace_sum                 -1.0e-50
subtract_xmax_log1p  1.4404614986241e-20
subtract_xmax_naive             -1.0e-50
naive                            0.0e+00
> test(c(0, -46, -47))
                                    [,1]
Rmpfr                1.4404614986241e-20
Rf_logspace_sum                  0.0e+00
subtract_xmax_log1p  1.4404614986241e-20
subtract_xmax_naive              0.0e+00
naive                            0.0e+00


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Nov 7, 2016 at 11:09 AM, William Dunlap wrote:

> It would be nice if the C functions Rf_logspace_sum, Rf_logspace_add, and
> Rf_logspace_sub were available as R functions.  (I wish the '_sub' were
> '_subtract' because 'sub' means too many things in R.)
>
> I think Rf_logspace_sum in R could be a little better. E.g., using the C
> code
>
> #include <R.h>
> #include <Rinternals.h>
> #include <Rmath.h>
>
> SEXP Call_logspace_sum(SEXP x)
> {
>     if (TYPEOF(x) != REALSXP)
>     {
>         Rf_error("'x' must be a numeric vector");
>     }
>     return ScalarReal(Rf_logspace_sum(REAL(x), length(x)));
> }
>
> and the R functions
>
> logspace_sum <- function (x) .Call("Call_logspace_sum", as.numeric(x))
>
> and
>
> test <- function (x) {
>     x <- as.numeric(x)
>     rbind(Rmpfr = as.numeric(log(sum(exp(Rmpfr::mpfr(x, precBits=5000))))),
>           Rf_logspace_sum = logspace_sum(x),
>           subtract_xmax = log(sum(exp(x - max(x)))) + max(x),
>           naive = log(sum(exp(x))))
> }
>
>
> R-3.3.2 on Linux gives, after options(digits=17)
> > test(c(0, -50))
>                                   [,1]
> Rmpfr           1.9287498479639178e-22
> Rf_logspace_sum 1.9287498479639178e-22
> subtract_xmax   0.0000000000000000e+00
> naive           0.0000000000000000e+00
>
> which is nice, but it also gives the not-so-nice
>
> > test(c(0, -50, -50))
>                                   [,1]
> Rmpfr           3.8574996959278356e-22
> Rf_logspace_sum 0.0000000000000000e+00
> subtract_xmax   0.0000000000000000e+00
> naive           0.0000000000000000e+00
>
> With TERR the second test has Rmpfr==Rf_logspace_sum for that example.
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Mon, Nov 7, 2016 at 3:08 AM, Martin Maechler <maech...@stat.math.ethz.ch> wrote:
>
>> > William Dunlap via R-help 
>> > on Sun, 6 Nov 2016 20:53:17 -0800 writes:
>>
>> > Perhaps the C function Rf_logspace_sum(double *x, int n) would help in
>> > computing log(b).  It computes log(sum(exp(x_i))) for i in 1..n,
>> > avoiding unnecessary under- and overflow.
>>
>> Indeed!
>>
>> I had thought more than twice to also export it to the R level
>> notably as we have been using two R level versions in a package
>> I maintain ('copula'). They are vectorized there in a way that
>> seemed particularly useful to our (Marius Hofert and my) use cases.
>>
>> More on this -- making these available in R, how exactly? --
>> probably should move to the R-devel list.
>>
>> Thank you Bill for bringing it up!
>> Martin
>>
>> > Bill Dunlap
>> > TIBCO Software
>> > wdunlap tibco.com
>>
>> > On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner <r.tur...@auckland.ac.nz> wrote:
>>
>> >> On 07/11/16 13:07, William Dunlap wrote:
>> >>
>> >>> Have you tried reparameterizing, using logb (=log(b)) instead of
>> b?
>> >>>
>> >>
>> >> Uh, no.  I don't think that that makes any sense in my context.
>> >>
>> >> The "b" values are probabilities and must satisfy a "sum-to-1"
>> >> constraint.  To accommodate this constraint I re-parametrise via a
>> >> "logistic" style parametrisation --- basically
>> >>
>> >> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n
>> >>
>> >> with the parameters that the optimiser works with being z_1, ...,
>> z_{n-1}
>> >> (and with z_n == 0 for identifiability).  The objective function
>> is of the
>> >> form sum_i(a_i * log(b_i)), so I transform back
>> >> from the z_i to the b_i in order calculate the value of the
>> objective
>> >> function.  But when the z_i get moderately large-negative, the b_i
>> become
>> >> numerically 0 and then log(b_i) becomes -Inf.  And the optimiser
>> falls over.
>> >>
>> >> cheers,
>> >>
>> >> Rolf
>> >>
>> >>
>> >>> Bill Dunlap
>> >>> TIBCO 

Re: [R] Dealing with -Inf in a maximisation problem.

2016-11-07 Thread William Dunlap via R-help
It would be nice if the C functions Rf_logspace_sum, Rf_logspace_add, and
Rf_logspace_sub were available as R functions.  (I wish the '_sub' were
'_subtract' because 'sub' means too many things in R.)

I think Rf_logspace_sum in R could be a little better. E.g., using the C
code

#include <R.h>
#include <Rinternals.h>
#include <Rmath.h>

SEXP Call_logspace_sum(SEXP x)
{
    if (TYPEOF(x) != REALSXP)
    {
        Rf_error("'x' must be a numeric vector");
    }
    return ScalarReal(Rf_logspace_sum(REAL(x), length(x)));
}

and the R functions

logspace_sum <- function (x) .Call("Call_logspace_sum", as.numeric(x))

and

test <- function (x) {
    x <- as.numeric(x)
    rbind(Rmpfr = as.numeric(log(sum(exp(Rmpfr::mpfr(x, precBits=5000))))),
          Rf_logspace_sum = logspace_sum(x),
          subtract_xmax = log(sum(exp(x - max(x)))) + max(x),
          naive = log(sum(exp(x))))
}
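
For anyone wanting to run this, a sketch of the build step (it assumes the
C code above is saved as logspace_sum.c; the file name and shell command
are my reconstruction, not part of the original message):

# in a shell:   R CMD SHLIB logspace_sum.c
# then, in R, before calling logspace_sum():
dyn.load("logspace_sum.so")   # "logspace_sum.dll" on Windows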


R-3.3.2 on Linux gives, after options(digits=17)
> test(c(0, -50))
                                  [,1]
Rmpfr           1.9287498479639178e-22
Rf_logspace_sum 1.9287498479639178e-22
subtract_xmax   0.0000000000000000e+00
naive           0.0000000000000000e+00

which is nice, but it also gives the not-so-nice

> test(c(0, -50, -50))
                                  [,1]
Rmpfr           3.8574996959278356e-22
Rf_logspace_sum 0.0000000000000000e+00
subtract_xmax   0.0000000000000000e+00
naive           0.0000000000000000e+00

With TERR the second test has Rmpfr==Rf_logspace_sum for that example.



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Nov 7, 2016 at 3:08 AM, Martin Maechler wrote:

> > William Dunlap via R-help 
> > on Sun, 6 Nov 2016 20:53:17 -0800 writes:
>
> > Perhaps the C function Rf_logspace_sum(double *x, int n) would help in
> > computing log(b).  It computes log(sum(exp(x_i))) for i in 1..n,
> > avoiding unnecessary under- and overflow.
>
> Indeed!
>
> I had thought more than twice to also export it to the R level
> notably as we have been using two R level versions in a package
> I maintain ('copula'). They are vectorized there in a way that
> seemed particularly useful to our (Marius Hofert and my) use cases.
>
> More on this -- making these available in R, how exactly? --
> probably should move to the R-devel list.
>
> Thank you Bill for bringing it up!
> Martin
>
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com
>
> > On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner wrote:
>
> >> On 07/11/16 13:07, William Dunlap wrote:
> >>
> >>> Have you tried reparameterizing, using logb (=log(b)) instead of b?
> >>>
> >>
> >> Uh, no.  I don't think that that makes any sense in my context.
> >>
> >> The "b" values are probabilities and must satisfy a "sum-to-1"
> >> constraint.  To accommodate this constraint I re-parametrise via a
> >> "logistic" style parametrisation --- basically
> >>
> >> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n
> >>
> >> with the parameters that the optimiser works with being z_1, ...,
> z_{n-1}
> >> (and with z_n == 0 for identifiability).  The objective function is
> of the
> >> form sum_i(a_i * log(b_i)), so I transform back
> >> from the z_i to the b_i in order calculate the value of the
> objective
> >> function.  But when the z_i get moderately large-negative, the b_i
> become
> >> numerically 0 and then log(b_i) becomes -Inf.  And the optimiser
> falls over.
> >>
> >> cheers,
> >>
> >> Rolf
> >>
> >>
> >>> Bill Dunlap
> >>> TIBCO Software
> >>> wdunlap tibco.com 
> >>>
> >>> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner <r.tur...@auckland.ac.nz> wrote:
> >>>
> >>>
> >>> I am trying to deal with a maximisation problem in which it is
> >>> possible for the objective function to (quite legitimately) return
> >>> the value -Inf, which causes the numerical optimisers that I have
> >>> tried to fall over.
> >>>
> >>> The -Inf values arise from expressions of the form "a * log(b)",
> >>> with b = 0.  Under the *starting* values of the parameters, a must
> >>> equal 0 whenever b = 0, so we can legitimately say that a *
> >>> log(b) = 0 in these circumstances.  However as the maximisation
> >>> algorithm searches over parameters it is possible for b to take the
> >>> value 0 for values of a that are strictly positive.  (The values of
> >>> "a" do not change during this search, although they *do* change
> >>> between "successive searches".)
> >>>
> >>> Clearly if one is *maximising* the objective then -Inf is not a value of
> >>> particular interest, and we should be able to "move away".  But the
> >>> optimising function just stops.
> >>>
> >>> It is also clear that "moving away" is not a simple task; you can't
> 

Re: [R] Dealing with -Inf in a maximisation problem.

2016-11-07 Thread Paul Gilbert


I am trying to deal with a maximisation problem in which it is possible
for the objective function to (quite legitimately) return the value
-Inf,


(Just to add to the pedantic part of the discussion by those of us who do 
not qualify as younger and wiser:)


Setting log(0) to -Inf is often convenient but really I think the log 
function is undefined at zero, so I would not refer to this as "legitimate".



which causes the numerical optimisers that I have tried to fall over.


In theory as well as practice. You need to have a function that is 
defined on the whole domain.




The -Inf values arise from expressions of the form "a * log(b)", with b
= 0.  Under the *starting* values of the parameters, a must equal
0 whenever b = 0, so we can legitimately say that a * log(b) = 0 in


This also is undefined and not "legitimate". I think there is no reason 
it should equal zero. We tend to want to set it to the value we think of 
as the "limit", but that limit depends on the path: for a = 0 the limit 
as b goes to zero would be zero, but the limit of a*(-Inf) as a goes to 
zero is -Inf.
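
A small numeric illustration of that path dependence (a toy example, not
from the original message):

a <- 10^-(1:6)
a * log(a)    # along the path b = a:         a*log(b) -> 0
a * (-1/a)    # along the path b = exp(-1/a): log(b) = -1/a, so a*log(b) == -1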


So, you really do need to avoid zero because your function is not 
defined there, or find a redefinition that works properly at zero. I 
think you have a solution from another post.


Paul


these circumstances.  However as the maximisation algorithm searches
over parameters it is possible for b to take the value 0 for values of
a that are strictly positive.  (The values of "a" do not change during
this search, although they *do* change between "successive searches".)

Clearly if one is *maximising* the objective then -Inf is not a value of
particular interest, and we should be able to "move away".  But the
optimising function just stops.

It is also clear that "moving away" is not a simple task; you can't
estimate a gradient or Hessian at a point where the function value is -Inf.

Can anyone suggest a way out of this dilemma, perhaps an optimiser that
is equipped to cope with -Inf values in some sneaky way?

Various ad hoc kludges spring to mind, but they all seem to be fraught
with peril.

I have tried changing the value returned by the objective function from
"v" to exp(v) --- which maps -Inf to 0, which is nice and finite.
However this seemed to flatten out the objective surface too much, and
the search stalled at the 0 value, which is the antithesis of optimal.

The problem arises in a context of applying the EM algorithm where the
M-step cannot be carried out explicitly, whence numerical optimisation.
I can give more detail if anyone thinks that it could be relevant.

I would appreciate advice from younger and wiser heads! :-)

cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276




Re: [R] Dealing with -Inf in a maximisation problem.

2016-11-07 Thread Martin Maechler
> William Dunlap via R-help 
> on Sun, 6 Nov 2016 20:53:17 -0800 writes:

> Perhaps the C function Rf_logspace_sum(double *x, int n) would help in
> computing log(b).  It computes log(sum(exp(x_i))) for i in 1..n, avoiding
> unnecessary under- and overflow.

Indeed!

I had thought more than twice to also export it to the R level
notably as we have been using two R level versions in a package
I maintain ('copula'). They are vectorized there in a way that
seemed particularly useful to our (Marius Hofert and my) use cases.

More on this -- making these available in R, how exactly? --
probably should move to the R-devel list.

Thank you Bill for bringing it up!
Martin

> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com

On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner wrote:

>> On 07/11/16 13:07, William Dunlap wrote:
>> 
>>> Have you tried reparameterizing, using logb (=log(b)) instead of b?
>>> 
>> 
>> Uh, no.  I don't think that that makes any sense in my context.
>> 
>> The "b" values are probabilities and must satisfy a "sum-to-1"
>> constraint.  To accommodate this constraint I re-parametrise via a
>> "logistic" style parametrisation --- basically
>> 
>> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n
>> 
>> with the parameters that the optimiser works with being z_1, ..., z_{n-1}
>> (and with z_n == 0 for identifiability).  The objective function is of 
the
>> form sum_i(a_i * log(b_i)), so I transform back
>> from the z_i to the b_i in order calculate the value of the objective
>> function.  But when the z_i get moderately large-negative, the b_i become
>> numerically 0 and then log(b_i) becomes -Inf.  And the optimiser falls 
over.
>> 
>> cheers,
>> 
>> Rolf
>> 
>> 
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com 
>>> 
>>> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner wrote:
>>> 
>>> 
>>> I am trying to deal with a maximisation problem in which it is
>>> possible for the objective function to (quite legitimately) return
>>> the value -Inf, which causes the numerical optimisers that I have
>>> tried to fall over.
>>> 
>>> The -Inf values arise from expressions of the form "a * log(b)",
>>> with b = 0.  Under the *starting* values of the parameters, a must
>>> equal 0 whenever b = 0, so we can legitimately say that a *
>>> log(b) = 0 in these circumstances.  However as the maximisation
>>> algorithm searches over parameters it is possible for b to take the
>>> value 0 for values of a that are strictly positive.  (The values of "a"
>>> do not change during this search, although they *do* change between
>>> "successive searches".)
>>> 
>>> Clearly if one is *maximising* the objective then -Inf is not a value of
>>> particular interest, and we should be able to "move away".  But the
>>> optimising function just stops.
>>> 
>>> It is also clear that "moving away" is not a simple task; you can't
>>> estimate a gradient or Hessian at a point where the function value
>>> is -Inf.
>>> 
>>> Can anyone suggest a way out of this dilemma, perhaps an optimiser
>>> that is equipped to cope with -Inf values in some sneaky way?
>>> 
>>> Various ad hoc kludges spring to mind, but they all seem to be
>>> fraught with peril.
>>> 
>>> I have tried changing the value returned by the objective function from
>>> "v" to exp(v) --- which maps -Inf to 0, which is nice and finite.
>>> However this seemed to flatten out the objective surface too much,
>>> and the search stalled at the 0 value, which is the antithesis of
>>> optimal.
>>> 
>>> The problem arises in a context of applying the EM algorithm where
>>> the M-step cannot be carried out explicitly, whence numerical
>>> optimisation.
>>> I can give more detail if anyone thinks that it could be relevant.
>>> 
>>> I would appreciate advice from younger and wiser heads! :-)
>>> 
>>> cheers,
>>> 
>>> Rolf Turner
>>> 
>>> --
>>> Technical Editor ANZJS
>>> Department of Statistics
>>> University of Auckland
>>> Phone: +64-9-373-7599 ext. 88276

Re: [R] Dealing with -Inf in a maximisation problem.

2016-11-06 Thread William Dunlap via R-help
Perhaps the C function Rf_logspace_sum(double *x, int n) would help in
computing log(b).  It computes log(sum(exp(x_i))) for i in 1..n, avoiding
unnecessary under- and overflow.
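
Applied to the z-parametrisation quoted below, this gives log(b_i)
directly, without ever forming a b_i that can underflow to 0.  A pure-R
stand-in for the C function (a sketch, not R's actual implementation):

logspace_sum_R <- function (x) { m <- max(x); m + log(sum(exp(x - m))) }
log_b <- function (z) z - logspace_sum_R(z)  # log of the "softmax", no 0s
log_b(c(-800, 0))  # -800 0 : finite where log(exp(z)/sum(exp(z))) gives -Inf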

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner wrote:

> On 07/11/16 13:07, William Dunlap wrote:
>
>> Have you tried reparameterizing, using logb (=log(b)) instead of b?
>>
>
> Uh, no.  I don't think that that makes any sense in my context.
>
> The "b" values are probabilities and must satisfy a "sum-to-1"
> constraint.  To accommodate this constraint I re-parametrise via a
> "logistic" style parametrisation --- basically
>
>    b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ..., n
>
> with the parameters that the optimiser works with being z_1, ..., z_{n-1}
> (and with z_n == 0 for identifiability).  The objective function is of the
> form sum_i(a_i * log(b_i)), so I transform back
> from the z_i to the b_i in order calculate the value of the objective
> function.  But when the z_i get moderately large-negative, the b_i become
> numerically 0 and then log(b_i) becomes -Inf.  And the optimiser falls over.
>
> cheers,
>
> Rolf
>
>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com 
>>
>> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner wrote:
>>
>> I am trying to deal with a maximisation problem in which it is
>> possible for the objective function to (quite legitimately) return
>> the value -Inf, which causes the numerical optimisers that I have
>> tried to fall over.
>>
>> The -Inf values arise from expressions of the form "a * log(b)",
>> with b = 0.  Under the *starting* values of the parameters, a must
>> equal 0 whenever b = 0, so we can legitimately say that a *
>> log(b) = 0 in these circumstances.  However as the maximisation
>> algorithm searches over parameters it is possible for b to take the
>> value 0 for values of
>> a that are strictly positive.  (The values of "a" do not change during
>> this search, although they *do* change between "successive searches".)
>>
>> Clearly if one is *maximising* the objective then -Inf is not a value of
>> particular interest, and we should be able to "move away".  But the
>> optimising function just stops.
>>
>> It is also clear that "moving away" is not a simple task; you can't
>> estimate a gradient or Hessian at a point where the function value
>> is -Inf.
>>
>> Can anyone suggest a way out of this dilemma, perhaps an optimiser
>> that is equipped to cope with -Inf values in some sneaky way?
>>
>> Various ad hoc kludges spring to mind, but they all seem to be
>> fraught with peril.
>>
>> I have tried changing the value returned by the objective function from
>> "v" to exp(v) --- which maps -Inf to 0, which is nice and finite.
>> However this seemed to flatten out the objective surface too much,
>> and the search stalled at the 0 value, which is the antithesis of
>> optimal.
>>
>> The problem arises in a context of applying the EM algorithm where
>> the M-step cannot be carried out explicitly, whence numerical
>> optimisation.
>> I can give more detail if anyone thinks that it could be relevant.
>>
>> I would appreciate advice from younger and wiser heads! :-)
>>
>> cheers,
>>
>> Rolf Turner
>>
>> --
>> Technical Editor ANZJS
>> Department of Statistics
>> University of Auckland
>> Phone: +64-9-373-7599 ext. 88276
>>
>>
>>
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>




Re: [R] Dealing with -Inf in a maximisation problem.

2016-11-06 Thread Rolf Turner

On 07/11/16 15:46, Charles C. Berry wrote:

On Mon, 7 Nov 2016, Rolf Turner wrote:


On 07/11/16 13:07, William Dunlap wrote:

Have you tried reparameterizing, using logb (=log(b)) instead of b?


Uh, no.  I don't think that that makes any sense in my context.

The "b" values are probabilities and must satisfy a "sum-to-1"
constraint. To accommodate this constraint I re-parametrise via a
"logistic" style parametrisation --- basically

  b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n

with the parameters that the optimiser works with being z_1, ...,
z_{n-1} (and with z_n == 0 for identifiability).  The objective
function is of the form sum_i(a_i * log(b_i)),



This is sum_i(a_i * z_i) - sum(a_i)*log(sum_j(exp(z_j))), isn't it?

So you don't need to evaluate b_i here, do you?

Large values of z_j will lead to exp(z_j) == Inf, but using

sum_i(a_i * (z_i-max.z)) - sum(a_i)*log(sum_j(exp(z_j-max.z)))

will handle that.


Wow!!!  That looks like it will work!!!  I won't completely believe it 
until I've programmed it up and tried it --- but for the first time in 
days I'm feeling hopeful.


HTH,

Chuck

p.s. Regarding "advice from younger and wiser heads", I probably cannot
claim to be either.


On present evidence you certainly appear to be one hell of a lot wiser!!!

Thanks.

cheers,

Rolf

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276



Re: [R] Dealing with -Inf in a maximisation problem.

2016-11-06 Thread Charles C. Berry

On Mon, 7 Nov 2016, Rolf Turner wrote:


On 07/11/16 13:07, William Dunlap wrote:

Have you tried reparameterizing, using logb (=log(b)) instead of b?


Uh, no.  I don't think that that makes any sense in my context.

The "b" values are probabilities and must satisfy a "sum-to-1" constraint. 
To accommodate this constraint I re-parametrise via a "logistic" style 
parametrisation --- basically


  b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n

with the parameters that the optimiser works with being z_1, ..., z_{n-1} 
(and with z_n == 0 for identifiability).  The objective function is of the 
form sum_i(a_i * log(b_i)),



This is sum_i(a_i * z_i) - sum(a_i)*log(sum_j(exp(z_j))), isn't it?

So you don't need to evaluate b_i here, do you?

Large values of z_j will lead to exp(z_j) == Inf, but using

sum_i(a_i * (z_i-max.z)) - sum(a_i)*log(sum_j(exp(z_j-max.z)))

will handle that.
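
In R, that is (a minimal sketch; the function and argument names are mine,
not from the thread):

obj <- function (z, a) {
    ## z: the full vector of z_i (including the fixed z_n = 0); a: weights
    mz <- max(z)
    sum(a * (z - mz)) - sum(a) * log(sum(exp(z - mz)))
}
obj(c(-800, 0), a = c(2, 1))  # -1600, finite where sum(a*log(b)) gives -Inf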

HTH,

Chuck

p.s. Regarding "advice from younger and wiser heads", I probably cannot 
claim to be either.




Re: [R] Dealing with -Inf in a maximisation problem.

2016-11-06 Thread Rolf Turner


On 07/11/16 14:14, ProfJCNash wrote:


Rolf, What optimizers did you try? There are a few in the optimrx package
on R-forge that handle bounds, and it may be useful to set bounds in this
case. Transformations using log or exp can be helpful if done carefully,
but as you note, they can make the function more difficult to optimize.


I can't see how to impose bounds in my circumstances.  As you will have 
seen from my previous answer to Bill Dunlap's post, the b_i are 
probabilities, parametrised in terms of z_i:


b_i = exp(z_i)/[sum_j exp(z_j)] .

It is not at all clear to me how to impose constraints on the z_i that 
will bound the b_i away from 0.


I can constrain L <= z_i <= U for all i and get b_i >= exp(L)/[n*exp(U)]
--- I think; I may have things upside down and backwards --- but this 
leaves an infinitude of choices for L and U.
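
For what it is worth, a quick numeric check of that bound (a throwaway
example, not from the original message) suggests it is not upside down:

set.seed(42)
L <- -3; U <- 2; n <- 5
z <- runif(n, L, U)
b <- exp(z)/sum(exp(z))
min(b) >= exp(L)/(n * exp(U))   # TRUE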


Also the starting values at each M-step are "naturally" given in terms 
of the b_i.  I.e. I can calculate "reasonable" values for the b_i and 
then transform these to provide starting values for the z_i.  The 
starting values for z_i might not satisfy a given set of constraints.
I guess I could simply truncate the starting values to fall within the 
constraints, but that "feels wrong" to me.


I also worry about the impact that imposing constraints will have on
the monotonicity of the successive values of the expected log likelihood 
in the EM algorithm context.



Be cautious about using the default numerical gradient approximations.
optimrx allows selection of the numDeriv grad() function, which is quite
good. Complex step would be better, but you need a function which can be
computed with complex arguments. Unfortunately, numerical gradients often
step over the cliff edge of computability of the function. The bounds are
not checked for the step to compute things like (f(x+h) - f(x))/h.


I think I can program up an analytic gradient function.  Maybe I'll try 
that.  I have been reluctant to do so because I have had peculiar (bad) 
experiences in the past in trying to use analytic gradients with nlm().


Of course the (analytic) gradient becomes undefined if one of the b_i is 0.

cheers,

Rolf

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dealing with -Inf in a maximisation problem.

2016-11-06 Thread Rolf Turner

On 07/11/16 13:07, William Dunlap wrote:

Have you tried reparameterizing, using logb (=log(b)) instead of b?


Uh, no.  I don't think that that makes any sense in my context.

The "b" values are probabilities and must satisfy a "sum-to-1" 
constraint.  To accommodate this constraint I re-parametrise via a 
"logistic" style parametrisation --- basically


   b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n

with the parameters that the optimiser works with being z_1, ..., 
z_{n-1} (and with z_n == 0 for identifiability).  The objective function 
is of the form sum_i(a_i * log(b_i)), so I transform back

from the z_i to the b_i in order calculate the value of the objective
function.  But when the z_i get moderately large-negative, the b_i 
become numerically 0 and then log(b_i) becomes -Inf.  And the optimiser 
falls over.


cheers,

Rolf



Bill Dunlap
TIBCO Software
wdunlap tibco.com 

On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner wrote:


I am trying to deal with a maximisation problem in which it is
possible for the objective function to (quite legitimately) return
the value -Inf, which causes the numerical optimisers that I have
tried to fall over.

The -Inf values arise from expressions of the form "a * log(b)",
with b = 0.  Under the *starting* values of the parameters, a must
equal 0 whenever b = 0, so we can legitimately say that a *
log(b) = 0 in these circumstances.  However as the maximisation
algorithm searches over parameters it is possible for b to take the
value 0 for values of
a that are strictly positive.  (The values of "a" do not change during
this search, although they *do* change between "successive searches".)

Clearly if one is *maximising* the objective then -Inf is not a value of
particular interest, and we should be able to "move away".  But the
optimising function just stops.

It is also clear that "moving away" is not a simple task; you can't
estimate a gradient or Hessian at a point where the function value
is -Inf.

Can anyone suggest a way out of this dilemma, perhaps an optimiser
that is equipped to cope with -Inf values in some sneaky way?

Various ad hoc kludges spring to mind, but they all seem to be
fraught with peril.

I have tried changing the value returned by the objective function from
"v" to exp(v) --- which maps -Inf to 0, which is nice and finite.
However this seemed to flatten out the objective surface too much,
and the search stalled at the 0 value, which is the antithesis of
optimal.

The problem arises in a context of applying the EM algorithm where
the M-step cannot be carried out explicitly, whence numerical
optimisation.
I can give more detail if anyone thinks that it could be relevant.

I would appreciate advice from younger and wiser heads! :-)

cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276






--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276



Re: [R] Dealing with -Inf in a maximisation problem.

2016-11-06 Thread ProfJCNash
Rolf, What optimizers did you try? There are a few in the optimrx package
on R-forge that handle bounds, and it may be useful to set bounds in this
case. Transformations using log or exp can be helpful if done carefully,
but as you note, they can make the function more difficult to optimize.

Be cautious about using the default numerical gradient approximations.
optimrx allows selection of the numDeriv grad() function, which is quite
good. Complex step would be better, but you need a function which can be
computed with complex arguments. Unfortunately, numerical gradients often
step over the cliff edge of computability of the function. The bounds are
not checked for the step to compute things like (f(x+h) - f(x))/h.
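
For concreteness, a complex-step sketch (an illustration only; it assumes
the objective can be evaluated at complex arguments, as noted above):

cs_deriv <- function (f, x, h = 1e-20)
    Im(f(complex(real = x, imaginary = h))) / h
f <- function (x) exp(x) * sin(x)   # toy objective, complex-friendly
cs_deriv(f, 1)                      # matches exp(1)*(sin(1) + cos(1))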

Cheers, JN

On 16-11-06 07:07 PM, William Dunlap via R-help wrote:
> Have you tried reparameterizing, using logb (=log(b)) instead of b?
> 
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
> 
> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner wrote:
> 
>>
>> I am trying to deal with a maximisation problem in which it is possible
>> for the objective function to (quite legitimately) return the value -Inf,
>> which causes the numerical optimisers that I have tried to fall over.
>>
>> The -Inf values arise from expressions of the form "a * log(b)", with b =
>> 0.  Under the *starting* values of the parameters, a must equal 0
>> whenever b = 0, so we can legitimately say that a * log(b) = 0 in these
>> circumstances.  However as the maximisation algorithm searches over
>> parameters it is possible for b to take the value 0 for values of
>> a that are strictly positive.  (The values of "a" do not change during
>> this search, although they *do* change between "successive searches".)
>>
>> Clearly if one is *maximising* the objective then -Inf is not a value of
>> particular interest, and we should be able to "move away".  But the
>> optimising function just stops.
>>
>> It is also clear that "moving away" is not a simple task; you can't
>> estimate a gradient or Hessian at a point where the function value is -Inf.
>>
>> Can anyone suggest a way out of this dilemma, perhaps an optimiser that is
>> equipped to cope with -Inf values in some sneaky way?
>>
>> Various ad hoc kludges spring to mind, but they all seem to be fraught
>> with peril.
>>
>> I have tried changing the value returned by the objective function from
>> "v" to exp(v) --- which maps -Inf to 0, which is nice and finite. However
>> this seemed to flatten out the objective surface too much, and the search
>> stalled at the 0 value, which is the antithesis of optimal.
>>
>> The problem arises in a context of applying the EM algorithm where the
>> M-step cannot be carried out explicitly, whence numerical optimisation.
>> I can give more detail if anyone thinks that it could be relevant.
>>
>> I would appreciate advice from younger and wiser heads! :-)
>>
>> cheers,
>>
>> Rolf Turner
>>
>> --
>> Technical Editor ANZJS
>> Department of Statistics
>> University of Auckland
>> Phone: +64-9-373-7599 ext. 88276
>>



Re: [R] Dealing with -Inf in a maximisation problem.

2016-11-06 Thread William Dunlap via R-help
Have you tried reparameterizing, using logb (=log(b)) instead of b?

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner wrote:

>
> I am trying to deal with a maximisation problem in which it is possible
> for the objective function to (quite legitimately) return the value -Inf,
> which causes the numerical optimisers that I have tried to fall over.
>
> The -Inf values arise from expressions of the form "a * log(b)", with b =
> 0.  Under the *starting* values of the parameters, a must equal 0
> whenever b = 0, so we can legitimately say that a * log(b) = 0 in these
> circumstances.  However as the maximisation algorithm searches over
> parameters it is possible for b to take the value 0 for values of
> a that are strictly positive.  (The values of "a" do not change during
> this search, although they *do* change between "successive searches".)
>
> Clearly if one is *maximising* the objective then -Inf is not a value of
> particular interest, and we should be able to "move away".  But the
> optimising function just stops.
>
> It is also clear that "moving away" is not a simple task; you can't
> estimate a gradient or Hessian at a point where the function value is -Inf.
>
> Can anyone suggest a way out of this dilemma, perhaps an optimiser that is
> equipped to cope with -Inf values in some sneaky way?
>
> Various ad hoc kludges spring to mind, but they all seem to be fraught
> with peril.
>
> I have tried changing the value returned by the objective function from
> "v" to exp(v) --- which maps -Inf to 0, which is nice and finite. However
> this seemed to flatten out the objective surface too much, and the search
> stalled at the 0 value, which is the antithesis of optimal.
>
> The problem arises in a context of applying the EM algorithm where the
> M-step cannot be carried out explicitly, whence numerical optimisation.
> I can give more detail if anyone thinks that it could be relevant.
>
> I would appreciate advice from younger and wiser heads! :-)
>
> cheers,
>
> Rolf Turner
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>


[R] Dealing with -Inf in a maximisation problem.

2016-11-06 Thread Rolf Turner


I am trying to deal with a maximisation problem in which it is possible 
for the objective function to (quite legitimately) return the value 
-Inf, which causes the numerical optimisers that I have tried to fall over.


The -Inf values arise from expressions of the form "a * log(b)", with b 
= 0.  Under the *starting* values of the parameters, a must equal 0 
whenever b = 0, so we can legitimately say that a * log(b) = 0 in these 
circumstances.  However as the maximisation algorithm searches over 
parameters it is possible for b to take the value 0 for values of a that 
are strictly positive.  (The values of "a" do not change during this 
search, although they *do* change between "successive searches".)
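
A toy version of the failure mode, for concreteness (illustrative only,
not the actual objective; it uses the logistic parametrisation described
elsewhere in this thread):

a <- c(2, 1)
f <- function (z) {                 # z-parametrisation, z_2 fixed at 0
    b <- exp(c(z, 0)) / sum(exp(c(z, 0)))
    sum(a * log(b))                 # -Inf once b[1] underflows to 0
}
f(-1)     # finite (about -2.94)
f(-800)   # -Inf: exp(-800) underflows, so b[1] == 0 and log(b[1]) == -Inf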

Clearly if one is *maximising* the objective then -Inf is not a value of
particular interest, and we should be able to "move away".  But the 
optimising function just stops.


It is also clear that "moving away" is not a simple task; you can't 
estimate a gradient or Hessian at a point where the function value is -Inf.


Can anyone suggest a way out of this dilemma, perhaps an optimiser that 
is equipped to cope with -Inf values in some sneaky way?


Various ad hoc kludges spring to mind, but they all seem to be fraught 
with peril.


I have tried changing the value returned by the objective function from
"v" to exp(v) --- which maps -Inf to 0, which is nice and finite. 
However this seemed to flatten out the objective surface too much, and 
the search stalled at the 0 value, which is the antithesis of optimal.


The problem arises in a context of applying the EM algorithm where the 
M-step cannot be carried out explicitly, whence numerical optimisation.

I can give more detail if anyone thinks that it could be relevant.

I would appreciate advice from younger and wiser heads! :-)

cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.