Re: [open-axiom-devel] [fricas-devel] Re: Algorithmic differentiation

Mark Clements Wed, 22 Mar 2017 16:03:57 -0700

Bill,

Thank you for following up on this.


This is a nice illustration of how to extend pan-Axiom to deal with a
specific case. One of the advantages of algorithmic differentiation (AD)
is that we can rapidly differentiate across a range of operators and
expressions. One could re-write the previous example using an iterative
function:

)set message time on
g(x) ==
  expr := 1+x
  for i in 1..10000 repeat
    expr := expr*(1+x)
  expr
eval(D(g(x),x),x=0.1) -- very slow
g(jet(0.1)) -- fast
)set message time off

The symbolic solution takes about 30 seconds on my laptop, while the AD
solution was 500-1000 times faster.

I have extended the "Jet" package: to include some more operators; allow
for coersion from T to Jet(T); and defined the export category as
Join(PartialOrder,IntegralDomain), which now supports eval(). One can
now write:

reduce(+, vector [1,2,3*jet(3)])
-- and
eval(x^exp(x)^exp(x), x=jet(0.1))

-- Mark


On 03/20/2017 06:46 PM, Bill Page wrote:
> Oops, I forgot to add one line of output
>
> (3) -> distribute(%)::Float
>
>    (3)  0.7681727501_0912254214 E 418
>                                                                   Type: Float
>                                                                   Time: 0 sec
>
> 'distribute' is the operator that "undoes" 'paren'.
>
> On 20 March 2017 at 13:32, Bill Page <bill.p...@newsynthesis.org> wrote:
>> The problem is just the expansion of (1+x)^10000.
>>
>> (1) -> )set message time on
>>
>> (1) -> g(x) == (1+x)^10000
>>                                                                    Type: Void
>>                                                                   Time: 0 sec
>> (2) -> eval(D(g(x),x), x=0.1)
>>    Compiling function g with type Variable x -> Polynomial Integer
>>
>>                                                        Type: Polynomial Float
>>                            Time: 3.09 (IN) + 6.74 (EV) + 0.06 (OT) = 9.89 sec
>>
>> In FriCAS and OpenAxiom (probably Axiom too) the correct way to avoid
>> this is to treat (1+x) as a kernel via the identity operator 'paren'.
>>
>> (3) -> g(x) == paren(1+x)^10000
>>    Compiled code for g has been cleared.
>>    1 old definition(s) deleted for function or rule g
>>                                                                    Type: Void
>>                                                                   Time: 0 sec
>> (4) -> eval(D(g(x),x), x=0.1)
>>    Compiling function g with type Variable x -> Expression Integer
>>                      9999      ,
>>    (4)  10000.0 (1.1)    %paren (1.1)
>>
>>                                                        Type: Expression Float
>>                                        Time: 0.09 (IN) + 0.05 (OT) = 0.14 sec
>>
>> The only trouble here is that neither FriCAS nor OpenAxiom know how to
>> differentiate this operator.  This can be fixed with the following
>> patch:
>>
>> https://github.com/billpage/fricas/commit/04c1f9b2d5eb4ece95c6f5dded46a72181cecb06
>> https://github.com/billpage/fricas/commit/04c1f9b2d5eb4ece95c6f5dded46a72181cecb06.patch
>>
>> (1) -> g(x) == paren(1+x)^10000
>>                                                                    Type: Void
>>                                                                   Time: 0 sec
>> (2) -> eval(D(g(x),x), x=0.1)
>>    Compiling function g with type Variable(x) -> Expression(Integer)
>>
>>                      9999
>>    (2)  10000.0 (1.1)
>>                                                       Type: Expression(Float)
>>                            Time: 0.04 (IN) + 0.00 (EV) + 0.12 (OT) = 0.16 sec
>>

)abbrev domain JET Jet
Jet(T: Join(PartialOrder,Ring)): Public == Private where
  Public ==> Join(PartialOrder,IntegralDomain) with
    jet: (T, T) -> %
      ++ construct a jet value from a pair
    jet: (T) -> %
      ++ construct a jet value with delta=1
    elt: (%, "value") -> T
      ++ retrieve the value field of a jet
    elt: (%, "delta") -> T
     ++ retrieve the delta field of a jet
    setelt: (%, "value", T) -> T
      ++ set the value field of a jet
    setelt: (%, "delta", T) -> T
      ++ set the delta field of a jet
    coerce: T -> %
      ++ coerce from T to %
    Zero: () -> %
      ++ zero element
    One: () -> %
      ++ one element
    if T has PartialOrder then PartialOrder
    if T has Comparable then Comparable
    if T has OrderedRing then OrderedRing
    if T has RetractableTo Integer then RetractableTo Integer
    if T has RetractableTo Fraction Integer then RetractableTo Fraction Integer
    if T has Field then
      "exquo": (%,%) -> Union(%,"failed")
      "/": (%, %) -> %
      "/": (T, %) -> %
      "/": (%, T) -> %
      "^": (%, Integer) -> %
      "^": (%, Fraction Integer) -> %
    if T has TranscendentalFunctionCategory and T has Field and T has 
RadicalCategory then
      -- RadicalCategory and Field are *not* required for differentiating all 
of TranscendentalFunctionCategory;
      -- this is done for conciseness.
      TranscendentalFunctionCategory
      "^": (T, %) -> %
      "^": (%, T) -> %
    if T has RadicalCategory and T has Field then
      sqrt: % -> %
    if FloatLiouvilianFunctions has (erf:T->T) and T has FloatingPointSystem 
and T has TranscendentalFunctionCategory then
      erf: % -> %
        ++ Error function
      pnorm: (%) -> %
        ++ pnorm(x) is the CDF at x for a standard normal distribution with a 
mean 0 and variance 1
      pnorm: (%, %, %) -> %
        ++ pnorm(x,mu,sigma) is the CDF at x for a normal distribution with a 
mean mu and variance sigma^2
  Private ==> add
    x, y : %
    z : T
    n : Integer
    Rep := Record(val: T, der: T)
    jet(v,d) == [v, d]
    jet(v) == jet(v, 1$T) -- is this the right default?
    coerce(z) == jet(z, 0$T)
    coerce(n) == jet(n::T, 0$T)
    elt(jet: %, key: "value") == jet.val    
    elt(jet: %, key: "delta") == jet.der
    setelt(jet: %, key: "value", jetValue: T) == jet.val := jetValue
    setelt(jet: %, key: "delta", jetDelta: T) == jet.der := jetDelta
    -- if T has Ring then
    Zero() == jet(0$T, 0$T)
    One() == jet(1$T, 0$T)
    x + y == jet(x.value + y.value, x.delta + y.delta)
    x - y == jet(x.value - y.value, x.delta - y.delta)
    x * y == jet(x.value * y.value, x.delta*y.value + x.value*y.delta)
    n * y == jet(n * y.value, n*y.delta)
    -x == jet(-x.value, -x.delta)
    if T has Comparable then
      x = y == x.value = y.value and x.delta=y.delta
      hash x == hash(x.value)+hash(x.delta)
      coerce(x):OutputForm == 
        hconcat(('jet)::OutputForm, paren [ (x.value)::OutputForm, 
(x.delta)::OutputForm ])
      latex x == 
        concat ["jet(", latex(x.value), "," latex(x.delta), ")"]
      if T has PartialOrder then
        smaller?(x,y) == x.value<y.value or (x.value=y.value and 
x.delta<y.delta)
      else 
        smaller?(x,y) == false
    if T has PartialOrder then
      x <= y == x.value <= y.value or (x.value = y.value and x.delta <= y.delta)
    if T has OrderedRing then
      abs x == jet(abs(x.value), sign(x.value)*x.delta)
    if T has RetractableTo Integer then
      coerce(n:Integer):% == jet(n::T,0$T)
      retract(x):Integer == retract(x.value)@Integer
      retractIfCan(x):Union(Integer,"failed") == 
retractIfCan(x.value)@Union(Integer,"failed")
    if T has RetractableTo Fraction Integer then
      coerce(f:Fraction Integer):% == jet(f::T,0$T)
      retract(x):Fraction Integer == retract(x.value)@Fraction(Integer)
      retractIfCan(x):Union(Fraction Integer,"failed") == 
retractIfCan(x.value)@Union(Fraction Integer,"failed")
    if T has Field and T has RetractableTo Integer and T has RetractableTo 
Fraction Integer then
      x exquo y == 
        y.value=0 => "failed"
        x/y
      x / y == 
        r : T := x.value / y.value
        jet(r, (x.delta - r * y.delta) / y.value)
      x / z == 
        jet(x.value / z, x.delta / z)
      z / x == 
        jet(z / x.value, -z*x.delta / x.value / x.value)
      x ^ (n:Integer) == 
        jet(x.value ^ n, n*x.value^(n-1)*x.delta)
    if T has TranscendentalFunctionCategory and T has Field and T has 
RadicalCategory and T has RetractableTo Fraction Integer then
      x ^ (frac:Fraction Integer) == x^(frac::T)
    if T has TranscendentalFunctionCategory and T has Field and T has 
RadicalCategory then
      -- extras
      z ^ x == jet(z ^ x.value, log(z)*z^x.value*x.delta)
      x ^ z == jet(x.value ^ z, x.delta*z*x.value^(z-1))
      -- TranscendentalFunctionCategory
      x ^ y == jet(x.value ^ y.value, log(x.value)*x.value^y.value*y.delta + 
x.delta*y.value*x.value^(y.value-1))
      acos x == jet(acos(x.value), -x.delta/sqrt(1-x.value*x.value))
      acosh x == jet(acosh(x.value), x.delta/sqrt(x.value*x.value-1))
      acot x == jet(acot(x.value), -x.delta/(1+x.value*x.value))
      acoth x == jet(acoth(x.value), -x.delta/(x.value*x.value-1))
      acsc x == jet(acsc(x.value), -x.delta/(x.value*sqrt(x.value*x.value-1)))
      acsch x == jet(acsch(x.value), -x.delta/(x.value*sqrt(x.value*x.value+1)))
      asec x == jet(asec(x.value), x.delta/(x.value*sqrt(x.value*x.value-1)))
      asech x == jet(asech(x.value), -x.delta/(x.value*sqrt(1-x.value*x.value)))
      asin x == jet(asin(x.value), x.delta/sqrt(1-x.value*x.value))
      asinh x == jet(asinh(x.value), x.delta/sqrt(1+x.value*x.value))
      atan x == jet(atan(x.value), x.delta/(1+x.value*x.value))
      atanh x == jet(atanh(x.value), -x.delta/(x.value*x.value-1))
      cos x == jet(cos(x.value), -x.delta*sin(x.value))
      cosh x == jet(cosh(x.value), x.delta*sinh(x.value))
      cot x == jet(cot(x.value), -x.delta*(1+cot(x.value)*cot(x.value)))
      coth x == jet(coth(x.value), x.delta*(1-coth(x.value)*coth(x.value)))
      csc x == jet(csc(x.value), -x.delta*csc(x.value)*cot(x.value))
      csch x == jet(csch(x.value), -x.delta*csch(x.value)*coth(x.value))
      exp x == jet(exp(x.value), x.delta*exp(x.value))
      log x == jet(log(x.value), x.delta/x.value)
      pi() == jet(pi()$T,0$T)
      sec x == jet(sec(x.value), x.delta*sec(x.value)*tan(x.value))
      sech x == jet(sech(x.value), -x.delta*sech(x.value)*tanh(x.value))
      sin x == jet(sin(x.value), x.delta*cos(x.value))
      sinh x == jet(sinh(x.value), x.delta*cosh(x.value))
      tan x == jet(tan(x.value), x.delta*(1+tan(x.value)*tan(x.value)))
      tanh x == jet(tanh(x.value), x.delta*(1-tanh(x.value)*tanh(x.value)))
    if T has RadicalCategory and T has Field then
      sqrt x == jet(sqrt(x.value), x.delta/sqrt(x.value)/(1+1))
    if FloatLiouvilianFunctions has (erf:T->T) and T has FloatingPointSystem 
and T has TranscendentalFunctionCategory then
      import FloatLiouvilianFunctions
      import Pi
      erf x == jet(erf(x.value)$FloatLiouvilianFunctions, 
(1+1)*exp(-x.value*x.value)*x.delta/sqrt(pi()))
      pnorm x == (1 + erf(x/sqrt(1+1)))/(1+1)
      pnorm(x,mu,sigma) == pnorm((x-mu)/sigma)
)if false
)cd /home/marcle/Home/axiom
)co Jet2.spad
JF ==> Jet Float

-- constructors
jet(1.0,2.0)
jet(1.0)
0$Jet Float
latex 1$Jet DoubleFloat
0$Jet Float < 1$Jet Float
jet(1.0) exquo 0
jet(2.0,1.0)<=jet(2.0,1.0)
jet(1,2) <= jet(1,3)

-- some functions
cos(jet(1.0)*pi())
erf(jet(2.0))
2*jet(2)
log(jet(2.0))
pnorm(jet(1.96))
pnorm(1+2*jet(1.96),1,2)

-- coerce and use vectors
reduce(+, vector [1,2,3*jet(3)]) -- cool!
reduce(+, vector [1,2,3*jet(3.0)]) -- cool!
vector [1,2,3] :: Vector Jet Float -- ok

-- testing
testDeriv(f,val) == 
  fjet : Jet Float := f(jet(val))
  (f(val)-fjet.value, eval(D(f(x),x),x=val) - fjet.delta)
testDeriv(log,2.0)
testDeriv(exp,2.0)
testDeriv(-,2.0)
testDeriv(x +-> exp(exp(x)),2.0)
testDeriv(sin,2.0)
testDeriv(cos,2.0)
testDeriv(tan,2.0)
testDeriv(sinh,2.0)
testDeriv(cosh,2.0)
testDeriv(tanh,2.0)
testDeriv(asin,0.1)
testDeriv(acos,0.1)
testDeriv(atan,0.1)
testDeriv(x +-> x^(2*x),2.0)
testDeriv(x +-> x^(x*2),2.0)
testDeriv(x +-> x^3,3.0)
testDeriv(sqrt,2.0)
testDeriv(x +-> exp(x)/sqrt(x),2.0)
testDeriv(x +-> exp(x)*sqrt(x),2.0)
testDeriv(x +-> x*x,2.0)
testDeriv(x +-> x+x,2.0)
testDeriv(x +-> 10*x,2.0)

-- some more functions
)set message time on
g(x) == (1+x)^10000
gx := g(x); -- slow expansion
eval(D(gx,x), x=0.1) -- slow symbolic differentiation
eval(gx, x=jet(0.1)) -- also slow - but faster than D(gx,x) 
g(jet(0.1)) -- fast algorithmic differentiation!
)set message time off
-- g(jet(0.1::DoubleFloat)) -- FLOATING-POINT-OVERFLOW error
-- g(jet(x)$Jet(Expression Float)) -- very slow

)set message time on
g(x) ==
  expr := 1+x
  for i in 1..10000 repeat
    expr := expr*(1+x)
  expr
g(jet(0.1))
eval(D(g(x),x),x=0.1)
)set message time off


)set message time on
g(x) ==
  expr := 1
  for i in 1..1000 repeat
    expr := expr+x^i/factorial(i)
  expr
g(1.0)
g(jet(1.0)) - jet(1.0)*exp(1.0)
eval(D(g(x),x),x=1.0) - exp(1.0)
)set message time off

-- and more functions
g(x) == x^exp(x)^exp(x)
gx := g(x);
eval(D(gx,x),x=0.1)
g(jet(0.1))
g(jet(0.1::DoubleFloat))
eval(x^exp(x)^exp(x), x=jet(0.1))
--eval(x^erf(x)^erf(x), x=jet(0.1)) -- Expression Jet Float

-- calling an interpreter function
fx(x, n) ==
  if n=0 then return 0*x
  else
    a := 0*x
    b := x
    (a,b) := (0, x)
    for k in 1..(n-1) repeat
      h := b
      b := log(a+b)
      a := h
    b
eval(D(fx(x,7),x), x=2.0)
fx(jet(2.0,1.0), 7)

-- eval now works; eval() requires PartialOrder and IntegralDomain for 
Expression(Jet(T))?
beta = jet(1.0) -- ok
eval((beta+1)^2,beta=jet(1.0)) -- ok
subst((beta+1)^2,beta=jet(1.0)) -- ok
eval(exp(2*beta),beta=jet(1.0)) -- ok

-- Extended example: partial log-likelihood
-- Assume distinct ordered times
l(beta) == 
  event := [true,false,false,true,true,false]
  times := [i*1.0 for i in 1..#event]
  x := [1.0,0,1,1,0,1]
  eta := [beta*xi for xi in x]
  index := [i for i in 1..#event for e in event | e]
  -- L := reduce(*,[exp(eta(i))/reduce(+,[exp(eta(j)) for j in i..#event]) for 
i in index])
  reduce(+,[eta(i) - log(reduce(+,[exp(eta(j)) for j in i..#event])) for i in 
index])
l(jet(1.0))
eval(l(beta), beta=jet(1.0)) -- ok
-- time-varying effect
f(beta,gamma) == reduce(+,[x.i*beta+x.i*times.i*gamma - 
log(reduce(+,[exp(x.j*beta+x.j*times.j*gamma) for j in i..#event])) for i in 
index])
f(jet(1.0),0)
f(jet(1.0),3.0)

--
-- check some operators
g := operator 'g
h := operator 'h
Dgx(f) == (f,D(f(g(x)),x))
Dgx2(f) == (f,D(f(g(x),h(x)),x))
[Dgx(acos), Dgx(acosh), Dgx(acot), Dgx(acoth), Dgx(acsc), Dgx(acsch), 
Dgx(asec), Dgx(asech), Dgx(asin), Dgx(asinh), Dgx(atan), Dgx(atanh), Dgx(cos), 
Dgx(cosh), Dgx(cot), Dgx(coth), Dgx(csc), Dgx(csch), Dgx(exp), Dgx(log), 
Dgx(sec), Dgx(sech), Dgx(sin), Dgx(sinh), Dgx(tan), Dgx(tanh), Dgx(sqrt), 
Dgx(-), Dgx(h)]
)sh TranscendentalFunctionCategory

[Dgx(Chi), Dgx(Ci), Dgx(Ei), Dgx(Shi), Dgx(Si), Dgx(erf), Dgx(erfi), 
Dgx(fresnelC), Dgx(fresnelS), Dgx(li)]
)sh FloatLiouvilianFunctions
[Dgx2(Beta), Dgx(Gamma), Dgx(airyAi)]
)sh SPFCAT
Dgx(erf)

-- finite differences
fd(f,x) ==
  eps:=1.0e-10
  (f(x+eps) - f(x-eps))/2.0/eps
fd(log,1.1)
  
)endif

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot

_______________________________________________
open-axiom-devel mailing list
open-axiom-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open-axiom-devel

Re: [open-axiom-devel] [fricas-devel] Re: Algorithmic differentiation

Reply via email to