On Sun, Jan 6, 2013 at 12:34 PM, wrote:
> I think that one of the Mahout algorithms (DF) does use NaN for
> "undecidable"
>
Yes. But I don't think the HMM codes do.
> So perhaps there is a long term need to think through the output
> semantics of the library?
>
Yes. And no.
Yes, it would be worth thinking through the output semantics of the
library over the long term.
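
For what it's worth, here is a rough sketch of what the NaN-as-"undecidable"
convention looks like from the caller's side (an illustration only, not the
actual Mahout DF API) and why it forces an explicit Double.isNaN check:

// Rough sketch only -- not the actual Mahout DF API. Illustrates the
// "NaN means undecidable" convention and how a caller has to test for it.
public class NanSemanticsSketch {
  // Hypothetical classifier method: returns a label index, or NaN when
  // no decision can be reached.
  static double classify(double[] features) {
    boolean undecidable = features.length == 0;   // placeholder condition
    return undecidable ? Double.NaN : 1.0;
  }

  public static void main(String[] args) {
    double label = classify(new double[0]);
    // NaN != NaN, so an equality test silently fails; Double.isNaN is required.
    if (Double.isNaN(label)) {
      System.out.println("undecidable");
    } else {
      System.out.println("label = " + label);
    }
  }
}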
On Sun, Jan 6, 2013 at 1:35 PM, wrote:
> Hi,
>
> I've been using the standalone trainer.
>
> I'll have a look at the log scaled trainers - thanks for the tip!
>
>
Log scaling is absolutely required. Otherwise, you start dealing with
numerical underflow amazingly quickly.
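
To make that concrete, here is a rough sketch of the standard log-sum-exp
trick that log-scaled trainers rely on (an illustration only, not the Mahout
implementation):

// Rough sketch, not the Mahout trainer code: log-sum-exp avoids forming
// exp(a) or exp(b), which underflow to 0.0 for large negative log-probabilities.
public class LogSumExpSketch {
  // Computes log(exp(a) + exp(b)) in a numerically safe way.
  static double logSumExp(double a, double b) {
    if (Double.isInfinite(a) && a < 0) return b;  // a represents log(0)
    if (Double.isInfinite(b) && b < 0) return a;
    double max = Math.max(a, b);
    return max + Math.log(Math.exp(a - max) + Math.exp(b - max));
  }

  public static void main(String[] args) {
    double logP1 = -800.0;   // exp(-800) underflows to 0.0 as a double
    double logP2 = -801.0;
    System.out.println(Math.exp(logP1) + Math.exp(logP2));   // 0.0, information lost
    System.out.println(logSumExp(logP1, logP2));             // about -799.69, still usable
  }
}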
Hi,
I've been using the standalone trainer.
I'll have a look at the log scaled trainers - thanks for the tip!
Best,
Simon
Dr. Simon Thompson
Chief Researcher, Customer Experience.
BT Research.
BT plc. PP11J. MLBG BT Adastral Park, Martlesham Heath.
IP5 3RE
Hi Simon,
Are you using the standalone HMM trainer or are you running with the MapReduce
variant using the patch available at
https://issues.apache.org/jira/browse/MAHOUT-627?
As Ted mentioned, these trainers can experience arithmetic underflow when the
set of states is large. Did you try the log-scaled variants?
Hi Ted,
thanks very much for the response, very helpful to hear these thoughts.
What I will do is look at the data set issue and report back as to what I find
out. I'll prod round the code and see if I can get a clue as to how it produces
infinities and so on.
I think that one of the Mahout algorithms (DF) does use NaN for
"undecidable"

So perhaps there is a long term need to think through the output semantics
of the library?
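
As a rough sketch of where those infinities and NaNs typically come from
(an illustration only, not taken from the Mahout code): ordinary IEEE double
arithmetic produces them as soon as a probability underflows to zero.

// Rough sketch of the kind of thing to look for when prodding around the code.
public class UnderflowArtifactsSketch {
  public static void main(String[] args) {
    double p = 1e-200 * 1e-200;                      // underflows to 0.0
    System.out.println(p);                           // 0.0
    System.out.println(Math.log(p));                 // -Infinity
    System.out.println(p / p);                       // NaN (0.0 / 0.0)
    System.out.println(Math.log(p) - Math.log(p));   // NaN (Infinity - Infinity)
  }
}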
It sounds like you are getting some numerical stability issues with the
training program. With HMMs, the most common problem that leads to this
is numerical underflow. I haven't looked at this in detail, however, so I
can't comment very knowledgeably. It is possible that the current
implementation is susceptible to exactly this.
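
As a rough sketch of how fast underflow sets in (an illustration only, not
the Mahout trainer): the forward variables are products of many per-step
probabilities, so the raw product dies after a few hundred observations
while the log-scaled value stays perfectly usable.

// Rough sketch of why underflow bites so quickly in HMM training.
public class ForwardUnderflowSketch {
  public static void main(String[] args) {
    double prob = 1.0;
    double logProb = 0.0;
    for (int t = 0; t < 2000; t++) {
      prob *= 0.1;                  // a typical per-step transition * emission factor
      logProb += Math.log(0.1);
    }
    System.out.println(prob);       // 0.0 -- the raw product underflows after ~320 steps
    System.out.println(logProb);    // about -4605.2 -- the log-scaled value is fine
  }
}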