Re: [Insight-developers] OtsuThresholdCalculator versus OtsuMultipleThresholdsCalculator

Bill Lorensen Tue, 09 Jul 2013 08:30:45 -0700

Nice job Dirk.

If you add GetMeasurement, leave GetBinMax for backward compatibility.


Bill


On Tue, Jul 9, 2013 at 11:22 AM, Padfield, Dirk R (GE Global Research) <
[email protected]> wrote:

> Hi All,
>
> Sorry for my delayed reply; I wanted to go through the code so that I
> could give an accurate answer about the differences.  I have now had the
> chance to carefully compare the Otsu and OtsuMultiple code, and I found a
> number of discrepancies that I have fixed.  Here is a summary:
>
> First of all, the equations used for the two filters are different because
> the Otsu uses a simplified mathematical representation that only holds when
> there is one threshold.  I derived the equations to check that they lead to
> the same mathematical representation, and they do.
>
> For all of the tests, I used the 8-bit image
> ExternalData/Testing/Data/Input/cthead1.png.  For each code line below, the
> original line is shown with a "-" first, and the new line is shown with a
> "+" first.  All of the comments are about the Otsu because the MultipleOtsu
> is implemented more accurately.
>
> In Otsu, the global mean was being computed incorrectly.  We can measure
> this because we can compare the value computed from the histogram to the
> true mean of the image pixels, which is 77.76.  The original and corrected
> code are as follows:
> -    totalMean += ( j + 1 ) * relativeFrequency[j];
> +    totalMean += histogram->GetMeasurementVector(j)[0] *
> relativeFrequency[j];
>
> Here is a comparison of the mean calculated for different bin values.
>  Clearly, it should be close to 77.76 for all of them.  Before it was a
> monotonically increasing value, but now it is more consistent (it is never
> exactly 77.76 because of how the histogram bins store the image pixels).
> Bins: 4,8,16,32,64,128,256,512,1024
> Old Mean: 1.95, 3.07, 5.42, 10.36, 20.12, 39.66, 78.76, 156.92, 312.99
> New Mean: 92.95, 81.90, 78.38, 78.56, 78.18, 78.03, 77.96, 77.91, 77.82
>
> The initial left mean was being computed incorrectly.  By mathematical
> definition, when the leftMean is computed from only the first bin of the
> histogram, it is equal to the bin value itself.
> -  double meanLeft = 1.0;
> +  double meanLeft = histogram->GetMeasurementVector(0)[0];
>
> The left mean was computed incorrectly in the loop.  It only made sense
> when the number of bins was the same as the number of gray levels of the
> image itself.
> -                   + ( j + 1 ) * relativeFrequency[j] ) / freqLeft;
> +                   + ( histogram->GetMeasurementVector(j)[0] ) *
> relativeFrequency[j] ) / freqLeft;
>
> The offset of 1 was incorrect in the GetMeasurement call.
> -  this->GetOutput()->Set( static_cast<OutputType>(
> histogram->GetMeasurement( maxBinNumber + 1, 0 ) ) );
> +  this->GetOutput()->Set( static_cast<OutputType>(
> histogram->GetMeasurement( maxBinNumber, 0 ) ) );
>
> Finally, there is a discrepancy in that Otsu uses GetMeasurement (as shown
> above), but OtsuMultiple uses GetBinMax to get the final threshold.  The
> difference is that GetMeasurement is the average of GetBinMax and
> GetBinMin.  When the number of bins is large, this makes almost no
> difference because the GetBinMin and GetBinMax are almost the same.  But
> for small numbers of bins, the GetMeasurement is more stable.  Here are
> some example results:
>
> Bins: 4,8,16,32,64,128,256,512,1024
> GetBinMin:            63.91, 63.83, 63.79, 79.71, 79.70, 81.69, 83.68,
> 83.67, 83.92
> GetBinMax:         127.82, 95.74, 79.74, 87.68, 83.68, 83.68, 84.67,
> 84.17, 84.17
> GetMeasurement: 95.86, 79.79, 71.76, 83.70, 81.69, 82.68, 84.17, 83.92,
> 84.05
>
> Finally, the thresholds should be rounded at the end of the calculator so
> that they are not truncated by the corresponding ImageFilters.
>
> I recommend making all of the changes to the Otsu listed above and
> changing the GetBinMax to GetMeasurement in the OtsuMultiple.  This will
> make the two completely consistent.  The results are as follows for
> different number of bins:
>
> Bins: 4,8,16,32,64,128,256,512,1024
> Old Otsu threshold: 159.77, 111.70, 87.71, 91.67, 85.68, 84.67, 85.17,
> 84.42, 84.30
> New threshold:          95.86, 79.79, 71.76, 83.70, 81.69, 82.68, 84.17,
> 83.92, 84.05
>
> The ImageJ Otsu plugin (
> http://rsbweb.nih.gov/ij/plugins/otsu-thresholding.html) outputs a
> threshold of 79.
> The ImageJ MultipleOtsu plugin (
> http://rsbweb.nih.gov/ij/plugins/multi-otsu-threshold.html) with 1
> threshold outputs a threshold of 87.
> Using Matlab, I get 84.
> So they are not consistent, but they are all close.
>
> Thanks,
> Dirk
>
> ________________________________
> From: Richard Beare [[email protected]]
> Sent: Wednesday, July 03, 2013 6:01 PM
> To: Padfield, Dirk R (GE Global Research)
> Cc: Johnson, Hans J; Matt McCormick; Bradley Lowekamp; <
> [email protected]> Developers; Gaëtan Lehmann
> Subject: Re: [Insight-developers] OtsuThresholdCalculator versus
> OtsuMultipleThresholdsCalculator
>
> Is the difference in threshold when there are a small number of bins
> explainable via the one bin difference? If it isn't then that is cause for
> concern. Otherwise it is just another side effect of the original
> difference. Small numbers of bins can easily break the assumptions in many
> of the histogram based methods, as can data with features like large
> numbers of zeros or maximum values.
>
>
> On Thu, Jul 4, 2013 at 6:17 AM, Padfield, Dirk R (GE Global Research) <
> [email protected]<mailto:[email protected]>> wrote:
> Exactly.  It seems crazy to use more than 256 bins in that case.  But the
> goal of that test was simply timing, and the best way to increase the
> computation time is to increase the number of bins.
>
> The real takeaway message is: either implementation is equally fast for
> reasonably-sized inputs, so there is no timing-based concern for replacing
> one for the other.
>
> Dirk
>
>
> ________________________________________
> From: Johnson, Hans J [[email protected]<mailto:
> [email protected]>]
> Sent: Wednesday, July 03, 2013 3:04 PM
> To: Padfield, Dirk R (GE Global Research); Matt McCormick
> Cc: Bradley Lowekamp; <[email protected]<mailto:
> [email protected]>> Developers; [email protected]<mailto:
> [email protected]>; Gaëtan Lehmann
> Subject: Re: [Insight-developers] OtsuThresholdCalculator versus
> OtsuMultipleThresholdsCalculator
>
> Dirk,
>
> With 10,000,000 bins, and an unsigned char data set this seems suspicious.
>  Are there real needs for such a large number of bins?
>
> I'd hate to spend a lot of time doing optimizations for 10,000,000 bins if
> there is no real-world use case for it.
>
> Hans
>
>
> -----Original Message-----
> From: <Padfield>, "Dirk R   (GE Global Research)"
> <[email protected]<mailto:[email protected]>>
> Date: Wednesday, July 3, 2013 1:27 PM
> To: Matt McCormick <[email protected]<mailto:
> [email protected]>>, Hans Johnson
> <[email protected]<mailto:[email protected]>>
> Cc: Bradley Lowekamp <[email protected]<mailto:[email protected]>>,
> ITK
> <[email protected]<mailto:[email protected]>>, "
> [email protected]<mailto:[email protected]>"
> <[email protected]<mailto:[email protected]>>, Gaëtan Lehmann <
> [email protected]<mailto:[email protected]>>
> Subject: RE: [Insight-developers] OtsuThresholdCalculator versus
> OtsuMultipleThresholdsCalculator
>
> Hi Folks,
>
> I agree as well that we should merge the two implementations and provide
> an update on the migration guide.
>
> I looked in a little more detail at the two implementations.  If I run
> both algorithms for a large range of numberOfHistogramBins, they produce
> very different results when the number is small, but very similar results
> when the number of bins is large.  This is disturbing in itself, so I am
> looking into the implementations to see why this happens.
>
> In terms of the timing, they are both very fast for normal bin sizes.  But
> when I push the numberOfHistogramBins up to 1,000,000 and 10,000,000, I
> get some differences:
>
> numberOfHistogramBins = 1,000,000:
> Otsu = 5 sec
> OtsuMultiple = 0.5 sec
>
> numberOfHistogramBins = 10,000,000:
> Otsu = 45.80 sec
> OtsuMultiple = 4.58 sec
>
> Thus, when the numberOfHistogramBins is very large, the difference is
> significant: the OtsuMultiple is about 10 times faster.  This is
> encouraging since the OtsuMultiple is the more general implementation.
>
> All of these tests are on the cthead1.png image that was used for the
> ctest.
>
> My plan is to carefully compare the implementations to see how to make
> them report the same results.  And then I will merge them so that the Otsu
> is just a shell that inherits from OtsuMultiple.
>
> Thanks,
> Dirk
>
>
> ________________________________________
> From: Matt McCormick [[email protected]<mailto:
> [email protected]>]
> Sent: Tuesday, July 02, 2013 3:44 PM
> To: Johnson, Hans J
> Cc: Bradley Lowekamp; Padfield, Dirk R (GE Global Research);
> <[email protected]<mailto:[email protected]>>
> Developers; [email protected]<mailto:[email protected]>; Gaëtan
> Lehmann
> Subject: Re: [Insight-developers] OtsuThresholdCalculator versus
> OtsuMultipleThresholdsCalculator
>
> I concur with Brad and Hans.
>
> Adding Gaëtan in CC.
>
> Thanks,
> Matt
>
> On Tue, Jul 2, 2013 at 5:08 PM, Johnson, Hans J <[email protected]
> <mailto:[email protected]>>
> wrote:
> > I agree with brad.  The two should produce the same results, even tough
> >it
> > may introduce a different numerical result in the
> > OtsuMultipleThresholdCalculator.
> >
> > -----Original Message-----
> > From: Bradley Lowekamp <[email protected]<mailto:
> [email protected]>>
> > Date: Tuesday, July 2, 2013 12:01 PM
> > To: "Padfield, Dirk R (GE Global Research)" <[email protected]
> <mailto:[email protected]>>
> > Cc: ITK <[email protected]<mailto:[email protected]>>,
> "[email protected]<mailto:[email protected]>"
> > <[email protected]<mailto:[email protected]>>
> > Subject: Re: [Insight-developers] OtsuThresholdCalculator
> > versus  OtsuMultipleThresholdsCalculator
> >
> > Dirk,
> >
> > I would vote to have the two produce the same results, and create a
> >little
> > migration guide which notes the change.
> >
> > Regarding, the refactoring, is there any difference in the algorithm
> > complexity of the two? Any measurements of the performance difference
> > between the two?
> >
> > Brad
> >
> > On Jul 2, 2013, at 11:04 AM, "Padfield, Dirk R (GE Global Research)"
> > <[email protected]<mailto:[email protected]>> wrote:
> >
> >> Hi Richard,
> >>
> >> Thank you for your response and insight.  If anything would need to be
> >>changed, I would also lean towards changing the OtsuMultiple because I
> >>think we shouldn't change Otsu since I am sure many more people are using
> >>the Otsu because it is a standard thresholding algorithm.  I will do some
> >>comparisons of the two against other implementations to see what I find.
> >>
> >> But the question still remains: is it okay to change the OtsuMultiple to
> >>give the same output as Otsu for one threshold?  I am thinking in terms
> >>of those people who use OtsuMultiple whose results will then be slightly
> >>different.  Here are the advantages and disadvantages for keeping things
> >>the same:
> >>
> >> Advantages: everyone's code still works as it did before
> >> Disadvantages: the two implementations are inconsistent with each other
> >>even though they are the same algorithm.  The overlapping code cannot be
> >>merged (inheritance).  And future enhancements will need to be made in
> >>both places.
> >>
> >> My vote is: change OtsuMultiple to be consistent with Otsu.
> >>
> >> What do others think?
> >>
> >> Dirk
> >>
> >>
> >> ________________________________
> >> From: Richard Beare [[email protected]<mailto:
> [email protected]>]
> >> Sent: Monday, July 01, 2013 5:40 PM
> >> To: Bradley Lowekamp
> >> Cc: Padfield, Dirk R (GE Global Research); <[email protected]
> <mailto:[email protected]>>
> >>Developers
> >> Subject: Re: [Insight-developers] OtsuThresholdCalculator versus
> >>OtsuMultipleThresholdsCalculator
> >>
> >> Hi,
> >> I think the order was - I introduced new filters copied from ImageJ,
> >>then Gaetan started refactoring to use the histogram framework. We both
> >>did some work to make that correspond to old versions. I don't remember
> >>working on the MultipleThreshold  version, but the code does look
> >>similar, so perhaps it was done somewhere along the way - will need to
> >>check the logs.
> >>
> >> I'm pretty sure that Otsu was producing the same results that it used to
> >>- I didn't compare to other implementations. Thus, if the original Otsu
> >>was correct then the current one should be too, which would suggest that
> >>the MultipleThresholds version should probably change.
> >>
> >> Not sure when I'll get a chance to look at this in detail.
> >>
> >> I don't have a current email for Gaetan to CC for confirmation.
> >>
> >>
> >> On Mon, Jul 1, 2013 at 11:02 PM, Bradley Lowekamp
> >><[email protected]<mailto:[email protected]><mailto:
> [email protected]<mailto:[email protected]>>> wrote:
> >> Dirk,
> >>
> >> I believe Richard Beare did the refactoring of the thresholding
> >>framework an Insight Journal Article. He will likely know why it is this
> >>way better than anyone else.
> >>
> >> You also didn't say which implementation is correct.
> >>
> >> Brad
> >>
> >>
> >> On Jun 30, 2013, at 10:03 PM, "Padfield, Dirk R (GE Global Research)"
> >><[email protected]<mailto:[email protected]><mailto:
> [email protected]<mailto:[email protected]>>> wrote:
> >>
> >>> Hi ITK Developers,
> >>>
> >>> I was just looking through the OtsuThresholdCalculator and
> >>>OtsuMultipleThresholdsCalculator to see whether I could refactor them so
> >>>that the Otsu inherits from the OtsuMultiple since the latter is a more
> >>>general case of the former.  Currently, the code for these two filters
> >>>is totally different resulting in significant code duplication and a
> >>>need to keep both filters in sync.
> >>>
> >>> As a first step, I wrote a CMake test to check that the output of the
> >>>OtsuMultiple with 1 threshold is the same as the output of the Otsu.
> >>>Unfortunately, they are not!  The two filters output thresholds that are
> >>>different by 1 histogram bin!  This can be a quite extreme difference
> >>>when the numberOfHistogramBins is low, and it leads to different
> >>>thresholds even when the numberOfHistogramBins is reasonably high (say
> >>>256).  I tracked it down to this code in the Calculators:
> >>>
> >>> The relevant code from Otsu:
> >>>  const double tolerance = 0.00001;
> >>>  if ( (varBetween - tolerance) > maxVarBetween )
> >>>    {
> >>>    maxVarBetween = varBetween;
> >>>    maxBinNumber = j;
> >>>    }
> >>>  }
> >>> this->GetOutput()->Set( static_cast<OutputType>(
> >>>histogram->GetMeasurement( maxBinNumber + 1, 0 ) ) );
> >>>
> >>> The relevant code from MultipleOtsu:
> >>>  if ( varBetween > maxVarBetween )
> >>>    {
> >>>    maxVarBetween = varBetween;
> >>>    maxVarThresholdIndexes = thresholdIndexes;
> >>>    }
> >>>  }
> >>> for ( j = 0; j < m_NumberOfThresholds; j++ )
> >>>  {
> >>>  m_Output[j] = histogram->GetBinMax(0, maxVarThresholdIndexes[j]);
> >>>  }
> >>>
> >>> The difference is that the Otsu adds one to the computed threshold
> >>>whereas the MultipleOtsu does not.  This is problematic because users
> >>>would expect them to give the same result.
> >>>
> >>> My question is: how should we proceed?  If we change one or the other,
> >>>people's code that use the changed one will give slightly different
> >>>answers.  If we don't change them, the two filters will give different
> >>>outputs for the same input, and it will not be possible to refactor them
> >>>to share code.
> >>>
> >>> What are your thoughts?
> >>>
> >>> Thanks,
> >>> Dirk
> >>> _______________________________________________
> >>> Powered by www.kitware.com<http://www.kitware.com><
> http://www.kitware.com>
> >>>
> >>> Visit other Kitware open-source projects at
> >>> http://www.kitware.com/opensource/opensource.html
> >>>
> >>> Kitware offers ITK Training Courses, for more information visit:
> >>> http://kitware.com/products/protraining.php
> >>>
> >>> Please keep messages on-topic and check the ITK FAQ at:
> >>> http://www.itk.org/Wiki/ITK_FAQ
> >>>
> >>> Follow this link to subscribe/unsubscribe:
> >>> http://www.itk.org/mailman/listinfo/insight-developers
> >>
> >>
> >
> > _______________________________________________
> > Powered by www.kitware.com<http://www.kitware.com>
> >
> > Visit other Kitware open-source projects at
> > http://www.kitware.com/opensource/opensource.html
> >
> > Kitware offers ITK Training Courses, for more information visit:
> > http://kitware.com/products/protraining.php
> >
> > Please keep messages on-topic and check the ITK FAQ at:
> > http://www.itk.org/Wiki/ITK_FAQ
> >
> > Follow this link to subscribe/unsubscribe:
> > http://www.itk.org/mailman/listinfo/insight-developers
> >
> >
> >
> > ________________________________
> > Notice: This UI Health Care e-mail (including attachments) is covered by
> >the Electronic Communications Privacy Act, 18 U.S.C. 2510-2521, is
> >confidential and may be legally privileged.  If you are not the intended
> >recipient, you are hereby notified that any retention, dissemination,
> >distribution, or copying of this communication is strictly prohibited.
> >Please reply to the sender that you have received the message in error,
> >then delete it.  Thank you.
> > ________________________________
> > _______________________________________________
> > Powered by www.kitware.com<http://www.kitware.com>
> >
> > Visit other Kitware open-source projects at
> > http://www.kitware.com/opensource/opensource.html
> >
> > Kitware offers ITK Training Courses, for more information visit:
> > http://kitware.com/products/protraining.php
> >
> > Please keep messages on-topic and check the ITK FAQ at:
> > http://www.itk.org/Wiki/ITK_FAQ
> >
> > Follow this link to subscribe/unsubscribe:
> > http://www.itk.org/mailman/listinfo/insight-developers
>
>
>
> ________________________________
> Notice: This UI Health Care e-mail (including attachments) is covered by
> the Electronic Communications Privacy Act, 18 U.S.C. 2510-2521, is
> confidential and may be legally privileged.  If you are not the intended
> recipient, you are hereby notified that any retention, dissemination,
> distribution, or copying of this communication is strictly prohibited.
>  Please reply to the sender that you have received the message in error,
> then delete it.  Thank you.
> ________________________________
>
> _______________________________________________
> Powered by www.kitware.com
>
> Visit other Kitware open-source projects at
> http://www.kitware.com/opensource/opensource.html
>
> Kitware offers ITK Training Courses, for more information visit:
> http://kitware.com/products/protraining.php
>
> Please keep messages on-topic and check the ITK FAQ at:
> http://www.itk.org/Wiki/ITK_FAQ
>
> Follow this link to subscribe/unsubscribe:
> http://www.itk.org/mailman/listinfo/insight-developers
>



-- 
Unpaid intern in BillsBasement at noware dot com

_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at
http://www.kitware.com/opensource/opensource.html

Kitware offers ITK Training Courses, for more information visit:
http://kitware.com/products/protraining.php

Please keep messages on-topic and check the ITK FAQ at:
http://www.itk.org/Wiki/ITK_FAQ

Follow this link to subscribe/unsubscribe:
http://www.itk.org/mailman/listinfo/insight-developers

Re: [Insight-developers] OtsuThresholdCalculator versus OtsuMultipleThresholdsCalculator

Reply via email to