On Fri, Dec 5, 2014 at 1:20 PM, Ondřej Čertík <ondrej.cer...@gmail.com> wrote:
> Hi Bill,
>
> I thought about this a lot (essentially I studied complex analysis
> from several books as well as consulted with many colleagues) and I
> figured out some answers to my questions.
>
> In the approach (A), you have:
>
> log(a*b) = log(a) + log(b)
>
> What that means is that log() is multivalued: you can add 2*pi*i*n
> for any integer "n". The way to do arithmetic with and to compare
> multivalued functions is simply to make sure that the infinite
> (sometimes finite) set of values on the left is equal to the infinite
> set of values on the right. In other words, pick a value on the
> left; for the sake of argument let's say a=b=-1 and we pick n = 5,
> so we get log(a*b) = log(1) = 0 + 2*pi*i*5 = 10*pi*i. If you can
> find a combination of values on the right hand side that equals
> 10*pi*i, and you can do this for every integer "n", and if you can
> do the opposite, i.e. pick any combination of values on the right
> hand side and find a value on the left hand side that is equal to
> it, then you have proved the equality, i.e. that the infinite sets of
> multivalues on the left hand side and the right hand side are equal.
>
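> A minimal numeric sketch of this check (truncating the infinite sets
> to a finite window of branches):
>
>     from cmath import log
>     from math import pi
>     I = 1j
>
>     a = b = -1
>     # truncated sets of branch values on each side
>     lhs = [log(a*b) + 2*pi*I*n for n in range(-3, 4)]
>     rhs = [log(a) + 2*pi*I*j + log(b) + 2*pi*I*k
>            for j in range(-3, 4) for k in range(-3, 4)]
>     # every LHS value appears on the RHS; the reverse check works the
>     # same way once the LHS window is widened to cover j+k
>     print(all(any(abs(l - r) < 1e-12 for r in rhs) for l in lhs))  # True
>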
> Once we understand how log(z) works, we can derive all kinds of
> formulas in the approach (A). The way it works is that you put in the
> 2*pi*i*n factors, i.e. you explicitly enumerate all possibilities,
> then you derive some formulas, and at the end you absorb the 2*pi*i*n
> factors back into the multivalued functions (you can always absorb
> 2*pi*i*n into log()). But sometimes it might not be possible to
> completely absorb all these factors.
>
> Now let's apply this to the problems below:
>
> On Wed, Nov 26, 2014 at 10:27 PM, Bill Page <bill.p...@newsynthesis.org> 
> wrote:
>> On 26 November 2014 at 12:58, Ondřej Čertík <ondrej.cer...@gmail.com> wrote:
>>> On Wed, Nov 26, 2014 at 10:17 AM, Bill Page <bill.p...@newsynthesis.org> 
>>> wrote:
>>>>
>>>> Does it help if I say the operations are defined "symbolically"?
>>>
>>> All I want is for you to give me an algorithm for your approach in
>>> sufficient detail that I can implement it on a computer.  And by
>>> "your approach", I mean an approach where
>>> conjugate(log(x)) = log(conjugate(x)) for all x.
>>>
>>
>> I am sorry, we seem to be having some trouble communicating. Is that
>> something infecting this email list? :)
>>
>> Making  "conjugate(log(x)) = log(conjugate(x)) for all x" is trivial
>> so long as it is treated symbolically: the 'conjugate' operation is
>> just defined to rewrite itself (auto-simplify) when applied to any
>> operand of the form log(_), so 'conjugate(log(_))' is evaluated as
>> 'log(conjugate(_))', where _ stands for any element of the domain
>> Expression.  This is what I meant when I said it was considered true
>> by definition, i.e. by definition of the symbolic 'conjugate'
>> operation.  Exactly the same sort of thing happens when the
>> 'conjugate' operation acts on 'conjugate'  so that
>> 'conjugate(conjugate(x))' is simply rewritten as 'x'.
>
> Sure, on this level you can implement it. I was thinking on a deeper
> level, i.e. imagining putting in a number, x=-1, and seeing how this could be true:
>
> conjugate(log(-1)) = log(conjugate(-1))
>
> The answer that I was looking for is this:
>
> LHS: conjugate(log(-1)) = conjugate(i*pi + 2*pi*i*n) = -i*pi-2*pi*i*n
> RHS: log(conjugate(-1)) = log(-1) = i*pi + 2*pi*i*m
>
> If we pick n=-m-1, we always get LHS=RHS, so the two infinite sets of
> multivalues are equal, and the relation conjugate(log(-1)) =
> log(conjugate(-1)) holds.
> When you evaluate log(-1), you cannot just return i*pi, you need to
> return all the multivalues. But otherwise it works.
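> 
> A minimal numeric sketch of this branch matching (m = -n - 1 for each n):
>
>     from cmath import log
>     from math import pi
>     I = 1j
>
>     for n in range(-3, 4):
>         lhs = (log(-1) + 2*pi*I*n).conjugate()   # a branch of conjugate(log(-1))
>         rhs = log(-1) + 2*pi*I*(-n - 1)          # matching branch of log(conjugate(-1))
>         print(abs(lhs - rhs))                    # ~0 (up to rounding) for every n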
>
>>
>>> I have provided all the details of the algorithm (B). In approach (B),
>>> it is not true that
>>> conjugate(log(x)) = log(conjugate(x)) for all x.
>>>
>>> This equation (when conjugate(log(x)) = log(conjugate(x)) holds)
>>> started this whole discussion.
>>
>> That
>>
>>   log(a*b) = log(a) + log(b)
>>
>> is considerably less trivial than the case of 'conjugate'.  From my
>> point of view that is what actually started this branch of the
>> "fabric" of this discussion.  That is where 'normalize' comes in.
>
> I think the above answers both questions; it all works and is
> consistent in the approach (A). You just need to remember that if a
> function is multivalued, e.g. log(z), then you always need to
> enumerate all the values and prove that the LHS is equal to the RHS.
>
> There is a theorem that says that if you give me the values of a
> complex analytic function on just one branch, I can reconstruct the
> function on all branches. So it is probably the case that you only
> need to find one set of "n", "m" and "k" that satisfies the equation
> and it will then hold for the other values as well. But for clarity,
> I always prove it for all values.
>
>>
>>> So I was trying to understand your approach and how to make this
>>> hold for all "x", and I suggested various ways it could perhaps be
>>> implemented, and to most of them you said "that's not how FriCAS does
>>> it". At this point I don't have any more ideas how it could be done,
>>> so I don't know how to implement your approach. Which is sad --
>>> even though I am not advocating for your approach, I wanted to
>>> really understand it, so that I can form my own opinion on the pros
>>> and cons.
>>
>> Thank you for attempting to understand.
>>
>> I think I only used the phrase "that's not how FriCAS does it" in the
>> context of multi-valued functions.  My point is that FriCAS makes no
>> attempt to evaluate a multi-valued function symbolically. But FriCAS
>> does rewrite expressions involving multi-valued functions in some
>> cases automatically and in others when asked to do so by operators
>> like 'normalize'.
>
> Yes, log(a*b) can always be rewritten to log(a)+log(b) as long as
> everything is multivalued.
>
>>
>>>
>>>> Maybe we need to define exactly what operations we are talking about.
>>>
>>> Sure. Let's just stick to one example, let me just copy & paste it
>>> from my previous email:
>>>
>>>>>> from cmath import log
>>>>>> a = -1
>>>>>> b = -1
>>>>>> log(a*b)
>>> 0j
>>>>>> log(a)+log(b)
>>> 6.283185307179586j
>>>
>>>>>> def arg(x): return log(x).imag
>>> ...
>>>>>> from math import floor, pi
>>>>>> I = 1j
>>>>>> log(a)+log(b)+2*pi*I*floor((pi-arg(a)-arg(b))/(2*pi))
>>> 0j
>>>
>>> As you confirmed, even if you evaluate this in FriCAS, log(a*b) is not
>>> equal to log(a) + log(b), when a=b=-1.
>>
>> Yes, I showed that as expected this was not equal when 'log' is
>> evaluated in a numeric domain but I am talking about a domain
>> constructed by 'Expression' which is a "symbolic" domain.
>
> Right, so the point is that when you evaluate numerically, you *need*
> to implicitly add the 2*pi*i*n factor and compare the infinite sets
> of values. Then there is no issue.
>
>>
>>> However, you claim that "symbolically" it is true that log(a*b) =
>>> log(a) + log(b) for all "a" and "b" and you provided a FriCAS function
>>> "normalize" that does it,
>>
>> No not exactly.  I am sorry that I did not express myself more
>> clearly.  Actually if I evaluate
>>
>>   test ( log(a*b) = log(a)+log(b) )
>>
>> FriCAS returns 'false' since no automatic simplifications apply here
>> and these are obviously two different expressions.  What I showed was
>> that
>>
>>   normalize(log(a*b)-log(a)-log(b))
>>
>> returns 0.
>>
>>> but you said that for deeper understanding you would need to consult
>>> Waldek Hebisch. Can you explain the discrepancy/inconsistency?
>>
>> Well, um, what I tried to say was that for a deeper understanding of
>> 'normalize' we would have to either read the source code of
>> 'normalize' or talk with Waldek, who has studied the source code more
>> carefully and thoroughly than I have. 'normalize' was written by Manuel
>> Bronstein.  There is no specific documentation except for that
>> contained in the source code:
>>
>> https://github.com/fricas/fricas/blob/master/src/algebra/efstruc.spad#L83
>>
>> and unfortunately Manuel Bronstein is dead. Bronstein did however
>> publish several books and numerous articles. In particular 'normalize'
>> is part of his implementation of the "Risch structure theorem".  E.g.
>> http://dl.acm.org/citation.cfm?id=74566  As I recall there was some
>> Google Summer of Code work on sympy related to this.
>
> I think we don't need to know how normalize works anymore, since
> obviously in the approach (A),
> log(a*b) = log(a) + log(b).
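> 
> As an aside, this is also why a principal-branch CAS should only
> apply this rewrite under extra assumptions. A small illustration in
> SymPy, assuming its logcombine() behaves as documented:
>
>     from sympy import symbols, log, logcombine
>
>     a, b = symbols('a b')                 # general complex symbols
>     p, q = symbols('p q', positive=True)
>
>     print(logcombine(log(a) + log(b)))    # unchanged: not valid on the principal branch
>     print(logcombine(log(p) + log(q)))    # log(p*q): safe once p, q > 0
>     print(logcombine(log(a) + log(b), force=True))  # log(a*b): forced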
>
>> But Waldek has made a number of important recent changes to this package.
>>
>>>
>>> How exactly are the operations in log(a*b) = log(a) + log(b) defined,
>>> so that this equation holds, even though when you put in a=b=-1, you
>>> get a different number on the LHS and RHS, as confirmed by FriCAS?
>>>
>>
>> My admittedly primitive understanding of how 'normalize'  operates in
>> this case is that it is similar in principle to what one does to show
>> for example that '(a*b)/(a*c) - b/c = 0', i.e. by rewriting the
>> expression to a canonical equivalent  form (although of course this is
>> actually done by automatic simplifications in FriCAS).  It is my
>> intention to continue to work toward improving my understanding of
>> this part of FriCAS especially since Waldek has expressed doubts about
>> the soundness of introducing 'conjugate' into Expression in the
>> context of this function.
>
> I played with various formulas for multivalued functions and it's all
> consistent, and for example these definitely hold:
>
> log(a*b) = log(a)+log(b)
> conjugate(log(x)) = log(conjugate(x))
>
> But then I tried:
>
> (x^a)^b = ( e^(a*log(x)) )^b = e^(b*log(e^(a*log(x)))) =
> e^(b*(a*log(x) + 2*pi*i*n)) = e^(a*b*log(x) + b*2*pi*i*n) = x^(a*b) *
> e^(b*2*pi*i*n)
>
> I was just using the definitions and put the 2*pi*i factors in at
> appropriate places. As you can see, in this case, the 2*pi*i factor
> can't be absorbed. So this is ugly. But it works, i.e.
>
> sqrt(x^2) = (x^2)^(1/2) = x^(2*1/2) * e^(1/2 * 2*pi*i*n) = x *
> e^(pi*i*n) = x * (-1)^n
>
> We are still using the approach (A), so everything is multivalued. In
> this case, we have only 2 values, +x and -x, but it's still a
> multivalued function with both of these values holding at the same
> time.
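> 
> A minimal numeric sketch (checking that the principal value of
> sqrt(x^2) is always one of the two branch values +x or -x):
>
>     from cmath import sqrt
>
>     for x in [3, -3, 1+2j, -1+2j, -3-4j]:
>         v = sqrt(x**2)                       # principal value
>         print(min(abs(v - x), abs(v + x)))   # ~0: v is either +x or -x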
>
> It should now be clear that it is *not* true that (x^a)^b = x^(a*b),
> because then for a=2, b=1/2, you would get:
>
> sqrt(x^2) = x
>
> But the function on the left is multivalued (with values/branches +x
> and -x), while the function on the right is single valued with only
> one value "x". The only way you could make this work is if you say
> that it is possible to find a branch on the left (+x) that agrees with
> the single value on the right. But for a CAS, it would be a mistake to
> simplify sqrt(x^2) to x. It would be ok to simplify sqrt(x^2) to
> x*(-1)^n, but it's ugly, since now you have "n" in there.
>
> For this reason, the approach (A) is not very well suited for a CAS
> and I think approach (B) is much better. The approach (B) follows from
> (A) by simply choosing the "n" that picks the principal branch. So,
> for example, for sqrt(x^2) it picks n = floor((pi-2*arg(x)) / (2*pi)).
> As an added bonus, since numerical evaluation of log(z) and other
> functions also returns the principal branch, all the formulas are
> consistent and there is no need to worry about any 2*pi*i*n factors.
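> 
> A quick numeric check of that branch choice (a sketch, using
> arg(x) = Im(log(x)) as above):
>
>     from cmath import sqrt, log
>     from math import floor, pi
>
>     def arg(x): return log(x).imag
>
>     for x in [3, -3, 1+2j, -1+2j]:
>         n = floor((pi - 2*arg(x)) / (2*pi))
>         print(abs(sqrt(x**2) - x*(-1)**n))   # ~0 for every x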

But there is one issue actually. We prove that

log(a*b) = log(a) + log(b)

in the sense that the set of multivalues on the LHS is equal to the
set of multivalues on the RHS. So we can write:

log(a*b) - log(a) - log(b) = 2*pi*i*n

I.e. the LHS = log(a*b) - log(a) - log(b) is a multivalued function;
it is not zero. Its set of values is equal to the RHS = 2*pi*i*n.
So if you write:

log(a*b) - log(a) - log(b) = 0

Then it only holds in the sense that you can pick an "n" (a branch) on
the LHS such that it is equal to the RHS, i.e. 0. But nothing stops
you from picking a different branch, say n=5, i.e. LHS =
2*pi*i*5 = 10*pi*i, and that is most definitely not equal to 0. It is
the same as with the case sqrt(x^2) = x*(-1)^n above. You can write it
as sqrt(x^2) = x, but then it only holds in the sense that you can
always (i.e. for any 'x') pick a branch of sqrt(x^2) such that it is
equal to 'x'. The problem is that this branch pick depends on "x"
or on "a,b", i.e. for some values you have to pick one branch, but for
other values you have to pick a different branch. This is obvious for
sqrt(x^2) = x: for x=3 you need to pick the +x branch, but for x=-3
you need to pick the -x branch. The same holds for
log(a*b) - log(a) - log(b) = 0: for a=b=1 you pick the branch
where log(1) = 0, but for a=b=-1 you have to pick log(-1) = i*pi and
log(1) = 2*pi*i (those are two different branches), so that log(1) -
log(-1) - log(-1) = 0.
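
Numerically, with principal values (a small sketch of exactly this
branch bookkeeping):

    from cmath import log
    from math import pi

    a = b = 1
    print(log(a*b) - log(a) - log(b))            # 0j: principal branches already agree

    a = b = -1
    print(log(a*b) - log(a) - log(b))            # -6.28...j, i.e. not 0
    print(log(a*b) + 2*pi*1j - log(a) - log(b))  # 0j: log(1) moved to the 2*pi*i branch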

In other words, if you want a CAS to simplify / "normalize"

1) log(a*b) - log(a) - log(b) = 0

then you should also want the CAS to normalize:

2) sqrt(x^2) = x

Do you agree with me, based on the above analysis, that these two
cases 1) and 2) are exactly equivalent? I.e. in both you need to pick
a specific branch on the LHS to make it equal the RHS and the branch
pick depends on the values of "a,b" or "x".


Assuming you agree, the next step is to realize that the
simplification 2), i.e. sqrt(x^2) = x, is especially problematic, since
every high schooler knows that for real numbers we have sqrt(x^2) =
|x|, not sqrt(x^2) = x. Yes, you can make 2) work, and above I
described in detail how, but this is not what most people
use. Just search the internet for sqrt(x^2), e.g. here:

http://math.stackexchange.com/a/961795/30944

Everybody will tell you that sqrt(x^2) = -x for negative "x", i.e.
that sqrt(x^2) = |x|. I can't even imagine doing any kind of
calculation while assuming sqrt(x^2) = x; that just leads to wrong
answers too easily. In fact, the stackexchange question assumed
sqrt(x^2) = x and it led to a wrong answer (yes, the poster should
have picked all the branches consistently if he wanted to use
sqrt(x^2) = x).

But let me know if you have any arguments for why we should even
entertain the cases 1) and 2), i.e. equating a multivalued function to
a single-valued function (i.e. 0 or "x").

What I think can be made to work is to simply always equate
multivalued functions to multivalued ones, e.g.:

log(a*b) - log(a) - log(b) = 2*pi*i*n
sqrt(x^2) = x*(-1)^n

That works and the chances of mistakes are quite low. So we have three
approaches to complex analysis:

(A) multivalued approach, e.g.:

    sqrt(x^2) = x*(-1)^n
    log(a*b) = log(a) + log(b)
    log(a*b) - log(a) - log(b) = 2*pi*i*n
    conjugate(log(z)) = log(conjugate(z))
    conjugate(log(z)) - log(conjugate(z)) = 2*pi*i*n

(A') multivalued approach, where you are allowed to equate multivalued
functions to single-valued ones by picking a specific branch (but not
always the same branch for all "x" or "a,b"), like

    sqrt(x^2) = x
    log(a*b) = log(a) + log(b)
    log(a*b) - log(a) - log(b) = 0
    conjugate(log(z)) - log(conjugate(z)) = 0

(B) single-valued approach on the principal branch (the same branch for
all "x" or "a,b"), like

    sqrt(x^2) = x * (-1)^floor((pi-2*arg(x)) / (2*pi))
    log(a*b) = log(a) + log(b) + 2*pi*i*floor((pi-arg(a)-arg(b))/(2*pi))
    log(a*b) - log(a) - log(b) = 2*pi*i*floor((pi-arg(a)-arg(b))/(2*pi))
    conjugate(log(z)) = log(conjugate(z)) -2*pi*i*floor((arg(z)+pi)/(2*pi))
    conjugate(log(z)) - log(conjugate(z)) = -2*pi*i*floor((arg(z)+pi)/(2*pi))



These examples should clearly illustrate the differences between the three approaches.

You can clearly see that (B) is just (A) where we pick a specific
"n" (depending on "x" or "a,b") such that we always land on the
principal branch (i.e. the resulting branch is independent of "x" or
"a,b": it's always the principal branch). All formulas in (B) thus have
a specific "n" in them, usually expressed in terms of the floor()
function. All the formulas hold for all "x" and "a,b" as they are (no
further branch picking is necessary; one can directly evaluate them
numerically).
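
Here is a small numeric sketch checking the (B) formulas above
directly (again using arg(x) = Im(log(x))):

    from cmath import log
    from math import floor, pi
    I = 1j

    def arg(x): return log(x).imag

    for a, b in [(2, 3), (-1, -1), (-2, 1+1j), (-1-1j, -2+1j)]:
        err = log(a*b) - log(a) - log(b) \
              - 2*pi*I*floor((pi - arg(a) - arg(b))/(2*pi))
        print(abs(err))   # ~0 for every pair

    for z in [1+1j, -1, -3-4j, -2+0.5j]:
        err = log(z).conjugate() - log(z.conjugate()) \
              + 2*pi*I*floor((arg(z) + pi)/(2*pi))
        print(abs(err))   # ~0 for every z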

In (A), some of the formulas look "nice" because the "n" dependence
is absorbed into the multivalued functions, but some other formulas
have an explicit "n" dependence. There is no way around it. All
formulas hold for all "x" or "a,b", but when evaluating numerically
one needs to keep the "n" dependence and treat the results as
collections of values (multivalued).

Finally, approach (A') results from (A) by picking an "n" that is
*independent* of "x" or "a,b", typically just n=0. Comparing with (B),
it follows that this inevitably makes the branch pick dependent on "x"
or "a,b", which means that for some "x" it picks one branch but for
some other "x" it picks another branch. As such, some of these
formulas do *not* hold for all "x" or "a,b" with the same branch;
rather, depending on "x" or "a,b", one needs to pick the right branch
when evaluating numerically. The other formulas are the same as in
(A), so those hold for all "x" or "a,b".


Ondrej
