[ 
https://issues.apache.org/jira/browse/MAHOUT-399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161884#comment-13161884
 ] 

Jake Mannix commented on MAHOUT-399:
------------------------------------

Ah, not sure what happened, but the current trunk LDA now fails this test, 
while the new one passes.  Marking the old LDA test with @Ignore("MAHOUT-399") 
to track it for now.
                
> LDA on Mahout 0.3 does not converge to correct solution for overlapping 
> pyramids toy problem.
> ---------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-399
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-399
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>    Affects Versions: 0.3, 0.4, 0.5
>         Environment: Mac OS X 10.6.2, Hadoop 0.20.2, Mahout 0.3.
>            Reporter: Michael Lazarus
>            Assignee: Jake Mannix
>              Labels: lda, mahout
>             Fix For: 0.6
>
>         Attachments: 1000docs_26terms_5topics.jpg, MAHOUT-399.diff, 
> Overlapping Pyramids Toy Dataset.pdf, olt.tar
>
>
> Hello,
> Apologies if I have not labeled this correctly.
> I have run a toy problem on Mahout 0.3 (locally) for LDA that I used to test 
> Blei's C version of LDA that he posts on his site. It has an exact solution 
> that the LDA should converge to.  Please see the attached PDF that describes 
> the intended output.
> Is LDA working?  The following output indicates some sort of collapsing 
> behavior to me.
> T0    T1      T2      T3      T4
> x     w       x       u       x
> u     u       g       j       n
> l     r       i       m       l
> j     q       h       h       p
> v     p       e       i       q
> e     t       f       g       v
> d     s       d       f       o
> b     c       b       n       k
> y     f       c       l       m
> w     v       u       v       u
> c     d       p       y       t
> k     o       l       r       r
> i     b       j       k       j
> f     e       k       e       f
> g     x       y       s       y
> t     y       w       b       w
> h     i       s       p       s
> o     l       v       x       d
> q     j       t       d       i
> n     k       o       t       b
> The intended output is (again, please see attached):
> D     I       N       S       X
> d     i       n       s       x
> c     h       m       t       y
> e     j       o       r       w
> b     k       l       u       v
> f     g       p       q       a
> a     f       k       p       b
> g     l       q       v       u
> h     m       j       w       t
> y     u       r       o       c
> n     s       d       d       i
> s     e       x       f       f
> r     q       i       i       n
> m     v       w       c       o
> o     w       u       a       h
> q     n       s       h       g
> p     t       c       x       d
> t     x       f       e       l
> x     d       e       j       s
> w     y       g       b       j
> i     r       y       n       r
> u     o       h       y       m
> k     b       t       l       e
> v     c       a       m       k
> j     a       b       g       p
> l     p       v       k       q
> What tests do you run to make sure the output is correct?
> Thank you,
> Mike.
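
One way to turn the comparison between the two tables above into a check is to
count how many of the top terms of each intended topic show up among the top
terms of the closest learned topic. The following is only a minimal sketch in
plain Java, assuming the top five terms per column are what matters; the class
and method names are hypothetical and are not part of Mahout or of the test
referenced in this issue.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not Mahout code: compares top-term lists of topics.
public class TopicOverlapCheck {

    /** Number of terms shared by two top-term lists, ignoring order. */
    public static int overlap(List<String> expectedTop, List<String> learnedTop) {
        Set<String> shared = new HashSet<>(expectedTop);
        shared.retainAll(learnedTop);
        return shared.size();
    }

    /** Largest overlap between one expected topic and any learned topic. */
    public static int bestOverlap(List<String> expectedTop, List<List<String>> learned) {
        int best = 0;
        for (List<String> topic : learned) {
            best = Math.max(best, overlap(expectedTop, topic));
        }
        return best;
    }

    public static void main(String[] args) {
        // Top five terms of each intended topic (columns D, I, N, S, X above).
        List<List<String>> expected = Arrays.asList(
            Arrays.asList("d", "c", "e", "b", "f"),
            Arrays.asList("i", "h", "j", "k", "g"),
            Arrays.asList("n", "m", "o", "l", "p"),
            Arrays.asList("s", "t", "r", "u", "q"),
            Arrays.asList("x", "y", "w", "v", "a"));

        // Top five terms of each reported topic (columns T0..T4 above).
        List<List<String>> learned = Arrays.asList(
            Arrays.asList("x", "u", "l", "j", "v"),
            Arrays.asList("w", "u", "r", "q", "p"),
            Arrays.asList("x", "g", "i", "h", "e"),
            Arrays.asList("u", "j", "m", "h", "i"),
            Arrays.asList("x", "n", "l", "p", "q"));

        for (List<String> topic : expected) {
            System.out.println(topic + " best overlap: " + bestOverlap(topic, learned));
        }
    }
}
```

A best overlap of 5 for every intended topic would mean the top terms were
fully recovered; low values point to the kind of collapsing described above.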

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
