OK, great. Thanks for the help!

________________________________
From: Alex Herbert <alex.d.herb...@gmail.com>
Sent: Tuesday, March 5, 2024 11:35 AM
To: Commons Users List <user@commons.apache.org>
Subject: [External] - Re: MultivariateNormalMixtureExpectationMaximization only 
1 dimension

I have updated the master branch with a change to allow fitting a mixture
with 1-column data.

You should be able to pick up the 4.0-SNAPSHOT from the ASF snapshots repo
if you configure your build to add the snapshot repository (see [1]).

Let us know if this works for you. Note that if you only require fitting 1
column data then you would be able to optimise the implementation as it
will no longer require matrix inversion to compute the mixture probability
distribution. The CM implementation can act as a reference point for your
own implementation if desired.
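
As a rough illustration of the point about not needing matrix inversion,
here is a minimal sketch of EM for a 1-column mixture in plain Java. It is
not the CM implementation. The class name UnivariateGmmSketch, the Fit
holder, the sorted equal-slice initial guess and the variance floor are all
assumptions made for this example, and it is not numerically hardened (for
instance, it does not guard against all densities underflowing to zero).

```java
import java.util.Arrays;

/** Illustrative 1-D Gaussian mixture fit by EM; not the Commons Math code. */
final class UnivariateGmmSketch {

    /** Fitted parameters, one entry per component. */
    static final class Fit {
        final double[] weights;
        final double[] means;
        final double[] variances;

        Fit(double[] weights, double[] means, double[] variances) {
            this.weights = weights;
            this.means = means;
            this.variances = variances;
        }
    }

    /** Assumed floor that stops a component collapsing onto a single point. */
    private static final double MIN_VARIANCE = 1e-12;

    /** Normal density: in 1-D the covariance "inverse" is a plain division. */
    static double density(double x, double mean, double variance) {
        final double d = x - mean;
        return Math.exp(-0.5 * d * d / variance) / Math.sqrt(2 * Math.PI * variance);
    }

    /** Fits k components to the data; requires data.length >= k. */
    static Fit fit(double[] data, int k, int maxIterations, double tolerance) {
        final int n = data.length;
        final double[] w = new double[k];
        final double[] mu = new double[k];
        final double[] var = new double[k];

        // Initial guess (an assumption of this sketch): sort the data and
        // give each component an equal slice of the sorted values.
        final double[] sorted = data.clone();
        Arrays.sort(sorted);
        for (int j = 0; j < k; j++) {
            final int from = j * n / k;
            final int to = (j + 1) * n / k;
            double sum = 0;
            for (int i = from; i < to; i++) {
                sum += sorted[i];
            }
            mu[j] = sum / (to - from);
            double ss = 0;
            for (int i = from; i < to; i++) {
                ss += (sorted[i] - mu[j]) * (sorted[i] - mu[j]);
            }
            var[j] = Math.max(ss / (to - from), MIN_VARIANCE);
            w[j] = 1.0 / k;
        }

        final double[][] resp = new double[n][k];
        double previousLogLikelihood = Double.NEGATIVE_INFINITY;
        for (int iter = 0; iter < maxIterations; iter++) {
            // E-step: responsibility of each component for each point.
            double logLikelihood = 0;
            for (int i = 0; i < n; i++) {
                double total = 0;
                for (int j = 0; j < k; j++) {
                    resp[i][j] = w[j] * density(data[i], mu[j], var[j]);
                    total += resp[i][j];
                }
                for (int j = 0; j < k; j++) {
                    resp[i][j] /= total;
                }
                logLikelihood += Math.log(total);
            }
            // M-step: responsibility-weighted weight, mean and variance.
            for (int j = 0; j < k; j++) {
                double nj = 0;
                double sum = 0;
                for (int i = 0; i < n; i++) {
                    nj += resp[i][j];
                    sum += resp[i][j] * data[i];
                }
                mu[j] = sum / nj;
                double ss = 0;
                for (int i = 0; i < n; i++) {
                    ss += resp[i][j] * (data[i] - mu[j]) * (data[i] - mu[j]);
                }
                var[j] = Math.max(ss / nj, MIN_VARIANCE);
                w[j] = nj / n;
            }
            // Stop once the log-likelihood has stopped changing.
            if (Math.abs(logLikelihood - previousLogLikelihood) < tolerance) {
                break;
            }
            previousLogLikelihood = logLikelihood;
        }
        return new Fit(w, mu, var);
    }

    public static void main(String[] args) {
        final double[] x = {-1.0, -0.5, 0.0, 0.5, 1.0, 9.0, 9.5, 10.0, 10.5, 11.0};
        final Fit f = fit(x, 2, 200, 1e-12);
        for (int j = 0; j < f.weights.length; j++) {
            System.out.printf("%s : %s : %s%n", f.weights[j], f.means[j], f.variances[j]);
        }
    }
}
```

Each component is described by a weight, a mean and a scalar variance, so
the density needs only a division where the multivariate case would need
the inverse and determinant of a covariance matrix.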

Regards,

Alex

[1]
https://repository.apache.org/content/repositories/snapshots/org/apache/commons/commons-math4-legacy/4.0-SNAPSHOT/

On Tue, 5 Mar 2024 at 00:06, Alex Herbert <alex.d.herb...@gmail.com> wrote:

> Hi,
>
> I think this is a bug in the
> MultivariateNormalMixtureExpectationMaximization class. When I update the
> code to allow 1 column in the rows it outputs a similar fit to Matlab.
> Here's an example in Matlab:
>
> X = [normrnd(0, 1, 100, 1); normrnd(2, 2, 100, 1)]
> GMModel = fitgmdist(X,2);
>
> >> GMModel.mu
> ans =
>     0.0737
>     3.0914
> >> GMModel.ComponentProportion
> ans =
>     0.6750    0.3250
> >> GMModel.Sigma
> ans(:,:,1) =
>     1.0505
> ans(:,:,2) =
>     1.6593
>
> I pasted the same X data into a test for
> MultivariateNormalMixtureExpectationMaximization that had been updated to
> allow data with a single column and get the following fit:
>
> MultivariateNormalMixtureExpectationMaximization fitter
>     = new MultivariateNormalMixtureExpectationMaximization(data);
>
> MixtureMultivariateNormalDistribution initialMix
>     = MultivariateNormalMixtureExpectationMaximization.estimate(data, 2);
> fitter.fit(initialMix);
> MixtureMultivariateNormalDistribution fittedMix = fitter.getFittedModel();
> List<Pair<Double, MultivariateNormalDistribution>> components
>     = fittedMix.getComponents();
>
> for (Pair<Double, MultivariateNormalDistribution> component : components) {
>     final double weight = component.getFirst();
>     final MultivariateNormalDistribution mvn = component.getSecond();
>     final double[] mean = mvn.getMeans();
>     final RealMatrix covMat = mvn.getCovariances();
>     System.out.printf("%s : %s : %s%n", weight, Arrays.toString(mean),
>         covMat.toString());
> }
>
> 0.6420433138817465 : [0.016942587744259194] : Array2DRowRealMatrix{{0.9929681356}}
> 0.3579566861182536 : [2.9152176347671754] : Array2DRowRealMatrix{{1.8940290549}}
>
> The numbers are close enough to indicate that the fit is valid.
>
> I think the error has been in assuming that because you require 2
> components to have a mixture model then you must have 2 columns in the
> input data. However this is not true. You can fit single dimension data
> with a mixture of single Gaussians.
>
> Is this the functionality that you are expecting?
>
> Regards,
>
> Alex
>
>
> On Mon, 4 Mar 2024 at 20:48, Craig Brautigam <cbrauti...@icr-team.com>
> wrote:
>
>> Forgive me if this comes in twice... I did not subscribe first before
>> sending the message below.
>>
>>
>> ________________________________
>> From: Craig Brautigam
>> Sent: Monday, March 4, 2024 1:33 PM
>> To: user@commons.apache.org <user@commons.apache.org>
>> Subject: MultivariateNormalMixtureExpectationMaximization only 1 dimension
>>
>> Hi,
>>
>> Full disclosure: I'm not a mathematician, so I cannot go into the weeds
>> of the math.  However, I am tasked with porting some Matlab code that
>> fits a Gaussian mixture model (GMM) to Java.  I really want to use Apache
>> Commons Math if possible.  However, the code that I'm porting has 1
>> dimension (a single variable/attribute/property) that the GMMs are being
>> created from.
>>
>> MultivariateNormalMixtureExpectationMaximization looks to be a pretty
>> close drop-in replacement for the Matlab functions
>> https://www.mathworks.com/help/stats/fitgmdist.html
>> and http://www.mathworks.com/help/stats/gmdistribution.html;
>> however, the constructor for
>> MultivariateNormalMixtureExpectationMaximization clearly states that the
>> number of columns in the double[][] data array MUST be no less than 2.
>> I'm completely baffled as to why this is the case if I want to fit data
>> with 1 dimension.  Is there a workaround I can use, such as providing a
>> dummy column of all 0s to pacify the constructor? Is there another class
>> I should be using?
>>
>> Any help would be greatly appreciated.
>>
>> Thx!
>>
>>
>> ________________________________
>> The information contained in this e-mail and any attachments from ICR,
>> Inc. may contain confidential and/or proprietary information, and is
>> intended only for the named recipient to whom it was originally addressed.
>> If you are not the intended recipient, any disclosure, distribution, or
>> copying of this e-mail or its attachments is strictly prohibited. If you
>> have received this e-mail in error, please notify the sender immediately by
>> return e-mail and permanently delete the e-mail and any attachments.
>>
>