[sympy] Fractions of sympy.Product of sympy.tensor.Indexed variables seem to not cancel correctly.

Jonathan Crall Tue, 09 Feb 2016 12:45:36 -0800

I'm having an issue where an expression with sympy.tensor.Indexed variables 
does not seem to simplify correctly. 
I'm probably just doing something wrong, so I was wondering if 
anyone could help me figure this out.


I'm using sympy to generate some simple equations based on Bayes' rule. 

N is an event and I'm given a set of observations 
\set{X} = \{\ldots, d_i, \ldots\}. 

I'm using an indexed base to represent the set of observations as an array. 
I then use sympy.Product to multiply the probabilities of these 
observations together (I'm assuming independence), so I create an Idx 
variable ``i`` and several sets of variables that are indexed by ``i``. 
However, at the end of this script, it looks like ``P(di)[i]`` should be 
canceled out by a simplification step, but it is not. 

Here is the script: 



import sympy
from sympy.tensor import IndexedBase, Idx  # NOQA
from sympy import tensor
from sympy import *  # NOQA
cardX = sympy.symbols('|X|', integer=True, positive=True, finite=True)
start, stop = 1, cardX
i = Idx(sympy.symbols('i', integer=True, finite=True))


def psym(expr):
    s = symbols(expr, real=True, finite=True, negative=False)
    return s


def IdxBase(expr):
    #s = tensor.IndexedBase(expr, shape=(cardX,))[i]
    s = tensor.IndexedBase(expr, shape=(1,))[i]
    return s


if 1:
    def Prod(s):
        return sympy.Product(s, (i, start, stop))
else:
    def Prod(s):
        return sympy.prod([s.subs(i, i_) for i_ in range(1, 4)])

P_N          = psym('P(N)')
P_X          = psym('P(X)')
P_X_given_N  = psym('P(X|N)')
P_N_given_X  = psym('P(N|X)')
P_di         = IdxBase('P(di)')
P_N_given_di = IdxBase('P(N|di)')
P_di_given_N = IdxBase('P(di|N)')

pprint = sympy.pretty_print
print('''
-----------------------
OUTPUT OF SVM: P(N | di)
-----------------------
''')
P_N_given_di_ = (P_di_given_N * P_N) / P_di
pprint(Eq(P_N_given_di, P_N_given_di_))

print('''
-----------------------
REARANGE USING BAYES P(di | N)
-----------------------
''')
P_di_given_N_ = (P_N_given_di * P_di) / P_N
pprint(Eq(P_di_given_N, P_di_given_N_))

print('''
-----------------------
AGGREGATE USING INDEPENDENCE
-----------------------
''')
prod_P_di_given_N  = Prod(P_di_given_N)
prod_P_di_given_N_ = Prod(P_di_given_N_)
P_X_given_N__ = prod_P_di_given_N
P_X_given_N_ = prod_P_di_given_N_
pprint(Eq(P_X_given_N, P_X_given_N__))
pprint(Eq(P_X_given_N, P_X_given_N_))

print('''
   === ALSO ===
      ''')
prod_P_di = Prod(P_di)
P_X_      = prod_P_di
pprint(Eq(P_X, P_X_))

print('''
-----------------------
REARANGE TO LIKELIHOOD USING BAYES AGAIN
-----------------------
''')
P_N_given_X__ = (P_X_given_N * P_N) / (P_X)
P_N_given_X_ = (P_X_given_N_ * P_N) / (P_X_)
pprint(Eq(P_N_given_X, P_N_given_X__))
print('---')
pprint(Eq(P_N_given_X, P_N_given_X_))
print('--- simplify --- ')
P_N_given_X_done = P_N_given_X_.doit(deep=True)
pprint(Eq(P_N_given_X, P_N_given_X_done))

# Does not seem to cancel out the P(di)[i] variable
#pprint(Eq(P_N_given_X, sympy.simplify(P_N_given_X_done)))



The output of this script is: 

-----------------------
OUTPUT OF SVM: P(N | di)
-----------------------

             P(N)⋅P(di|N)[i]
P(N|di)[i] = ───────────────
                 P(di)[i]   

-----------------------
REARANGE USING BAYES P(di | N)
-----------------------

             P(N|di)[i]⋅P(di)[i]
P(di|N)[i] = ───────────────────
                     P(N)       

-----------------------
AGGREGATE USING INDEPENDENCE
-----------------------

          |X|            
         ┬───┬           
P(X|N) = │   │ P(di|N)[i]
         │   │           
         i = 1           
           |X|                       
         ┬──────┬                    
         │      │ P(N|di)[i]⋅P(di)[i]
P(X|N) = │      │ ───────────────────
         │      │         P(N)       
         │      │                    
          i = 1                      

   === ALSO ===
      
        |X|          
       ┬───┬         
P(X) = │   │ P(di)[i]
       │   │         
       i = 1         

-----------------------
REARANGE TO LIKELIHOOD USING BAYES AGAIN
-----------------------

         P(N)⋅P(X|N)
P(N|X) = ───────────
             P(X)   
---
                |X|                       
              ┬──────┬                    
              │      │ P(N|di)[i]⋅P(di)[i]
         P(N)⋅│      │ ───────────────────
              │      │         P(N)       
              │      │                    
               i = 1                      
P(N|X) = ─────────────────────────────────
                    |X|                   
                   ┬───┬                  
                   │   │ P(di)[i]         
                   │   │                  
                   i = 1                  
--- simplify --- 
                        |X|                     
                  -|X| ┬───┬                    
         P(N)⋅P(N)    ⋅│   │ P(N|di)[i]⋅P(di)[i]
                       │   │                    
                       i = 1                    
P(N|X) = ───────────────────────────────────────
                       |X|                      
                      ┬───┬                     
                      │   │ P(di)[i]            
                      │   │                     
                      i = 1                     

There are no additions in this formula, so the denominator should 
completely cancel. 
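
Written out by hand, the cancellation I expect is the following (assuming 
every P(d_i) is nonzero and both products run over the same i = 1 ... |X|): 

\[
P(N \mid X)
  = \frac{P(N) \prod_{i=1}^{|X|} \frac{P(N \mid d_i)\, P(d_i)}{P(N)}}
         {\prod_{i=1}^{|X|} P(d_i)}
  = P(N) \prod_{i=1}^{|X|} \frac{P(N \mid d_i)}{P(N)}
  = P(N)^{1 - |X|} \prod_{i=1}^{|X|} P(N \mid d_i)
\]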

Any ideas why the bottom term is not canceled by the top term? 
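
For comparison, here is a minimal sketch of the same ratio with the division 
moved inside a single Product, so that the per-term multiplication cancels 
``P(di)[i]`` before any product is formed. This is only a sketch, not an 
explanation of the behavior above: it assumes both products share the same 
index and limits, it uses a plain integer Symbol for ``i`` instead of Idx, and 
the names ``n``, ``term``, ``merged`` and ``explicit`` are introduced here 
purely for illustration. The last part checks the same identity with three 
explicit observations. 

```python
import sympy
from sympy import IndexedBase, Product, Symbol, symbols

n = symbols('|X|', integer=True, positive=True)
i = Symbol('i', integer=True)  # assumption: plain Symbol index, not Idx

P_N = symbols('P(N)', positive=True)
P_di = IndexedBase('P(di)')
P_N_given_di = IndexedBase('P(N|di)')

# Per-observation Bayes term from the script: P(di|N) = P(N|di) * P(di) / P(N)
term = P_N_given_di[i] * P_di[i] / P_N

# Divide inside the product: P(di)[i] / P(di)[i] cancels when the
# multiplication is built, before Product ever sees it.
merged = Product(term / P_di[i], (i, 1, n))
sympy.pprint(P_N * merged)

# Same identity with three explicit observations instead of a symbolic bound.
numerator = sympy.prod([term.subs(i, k) for k in range(1, 4)])
denominator = sympy.prod([P_di[k] for k in range(1, 4)])
explicit = P_N * numerator / denominator
sympy.pprint(explicit)
```

In this form the denominator terms never appear as a separate Product, so 
nothing is left for a later simplification step to cancel. 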

To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/e7d3ea7d-831e-4236-bd44-a81e22f5468a%40googlegroups.com.
