Re: HMM - baum welch and hmmpredict

2013-01-06 Thread Ted Dunning
On Sun, Jan 6, 2013 at 12:34 PM, wrote:
> I think that one of the Mahout algorithms (DF) does use NaN for "undecidable"
Yes. But I don't think the HMM codes do.
> So perhaps there is a long term need to think through the output semantics of the library?
Yes. And no. Yes, it would b

Re: HMM - baum welch and hmmpredict

2013-01-06 Thread Ted Dunning
On Sun, Jan 6, 2013 at 1:35 PM, wrote:
> Hi,
> I've been using the standalone trainer.
> I'll have a look at the log scaled trainers - thanks for the tip!
Log scaling is absolutely required. Otherwise, you start dealing with numerical underflow amazingly quickly.
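To illustrate the underflow point, here is a minimal sketch of the HMM forward pass computed two ways: with plain probabilities and in log space. It is not Mahout's code; the function names (`forward_naive`, `forward_log`, `logsumexp`) and the two-state toy model are made up for this example, and it assumes every probability in the model is strictly positive so that `math.log` is always defined.

```python
import math

def logsumexp(values):
    """Compute log(sum(exp(v) for v in values)) without leaving log space."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_naive(init, trans, emit, obs):
    """Forward algorithm on raw probabilities; returns P(obs)."""
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)

def forward_log(init, trans, emit, obs):
    """Forward algorithm in log space; returns log P(obs)."""
    n = len(init)
    log_alpha = [math.log(init[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        log_alpha = [
            logsumexp([log_alpha[p] + math.log(trans[p][s]) for p in range(n)])
            + math.log(emit[s][o])
            for s in range(n)
        ]
    return logsumexp(log_alpha)

# Hypothetical toy model: 2 hidden states, 2 output symbols.
INIT = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.9, 0.1], [0.2, 0.8]]

long_obs = [0, 1] * 5000  # 10,000 observations

# The raw likelihood is far below the smallest representable double,
# so the naive result collapses; the log-space value stays finite.
print("naive:", forward_naive(INIT, TRANS, EMIT, long_obs))
print("log:  ", forward_log(INIT, TRANS, EMIT, long_obs))
```

On a short sequence both versions agree (the log of the naive result matches the log-space result), which is a quick sanity check before trusting the log-space version on long inputs. Per-step rescaling of the forward variables is another common remedy; the log-space form is shown here only because it is the one mentioned in the thread.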

RE: HMM - baum welch and hmmpredict

2013-01-06 Thread simon.2.thompson
Hi, I've been using the standalone trainer. I'll have a look at the log scaled trainers - thanks for the tip! Best, Simon Dr. Simon Thompson, Chief Researcher, Customer Experience, BT Research, BT plc.

Re: HMM - baum welch and hmmpredict

2013-01-06 Thread Dhruv Kumar
Hi Simon, Are you using the standalone HMM trainer or are you running with the MapReduce variant using the patch available at https://issues.apache.org/jira/browse/MAHOUT-627? As Ted mentioned, these trainers can experience arithmetic underflow when the set of states is large. Did you try the

RE: HMM - baum welch and hmmpredict

2013-01-06 Thread simon.2.thompson
Hi Ted, thanks very much for the response, very helpful to hear these thoughts. What I will do is look at the data set issue and report back as to what I find out. I'll prod round the code and see if I can get a clue as to how it produces infinities and so on. I think that one of the Mahout

Re: HMM - baum welch and hmmpredict

2013-01-06 Thread Ted Dunning
It sounds like you are getting some numerical stability issues with the training program. With HMMs, the most common problem that leads to this is numerical underflow. I haven't looked at this in detail, however, so I can't comment very knowledgeably. It is possible that the current implementat