----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23384/#review51343 -----------------------------------------------------------
datafu-pig/src/main/java/datafu/pig/util/NumericalRange.java <https://reviews.apache.org/r/23384/#comment89550> Seems like there should be validation of the args in this constructor, or a helper method should be called to validate them. datafu-pig/src/main/java/datafu/pig/util/NumericalRange.java <https://reviews.apache.org/r/23384/#comment89552> Does it make sense for negative infinity to be closed? Otherwise the following will return true, which doesn't seem right mathematically (but may be a pragmatic choice in Java). range.isInRange(Double.NEGATIVE_INFINITY) In general I wonder whether how we should handle the open/closed issue for infinite values. Either of these are possible: 1) Infinite values can be closed only 2) Infinite values can be open or closed 3) Infinite values can be open only In the context of NDCG this debate is irrelevant because it's impossible to have a bag of values so large that you reach an "infinite" number of them. However if we want to use NumericalRange for other applications we should give consideration to this. There is some interesting behavior that can result from infinity being open/closed. See the test case below, which passes with the current code. @Test public void testInfinityEdgeCases() { NumericalRange range = NumericalRange.fromRangeString("[*,*]"); assertEquals(range.isInRange(Double.POSITIVE_INFINITY), true); assertEquals(range.isInRange(Math.pow(1e100,1e100)), true); range = NumericalRange.fromRangeString("[*,*)"); assertEquals(range.isInRange(Double.POSITIVE_INFINITY), false); assertEquals(range.isInRange(Math.pow(1e100,1e100)), false); } Maybe there is value in being able to express either [0,*] or [0,*). I just wanted to point out what code is possible as a result. At the very least I think we should make the two ends of "*" either both closed or both open. Currently "*" means [*,*), which means it would match negative infinity but not positive infinity. datafu-pig/src/main/java/datafu/pig/util/NumericalRange.java <https://reviews.apache.org/r/23384/#comment89551> This sort of validation should happen in the constructor. datafu-pig/src/main/java/datafu/pig/util/RangeMap.java <https://reviews.apache.org/r/23384/#comment89554> The code for this is mostly the same as the "get" method. Can you call "get" and check for a non-null value instead? datafu-pig/src/main/java/datafu/pig/util/RangeMap.java <https://reviews.apache.org/r/23384/#comment89549> One could argue that this doesn't make sense either. - Matthew Hayes On July 21, 2014, 5:06 p.m., Joshua Hartman wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/23384/ > ----------------------------------------------------------- > > (Updated July 21, 2014, 5:06 p.m.) > > > Review request for DataFu and Matthew Hayes. > > > Repository: datafu > > > Description > ------- > > DATAFU-60 Support for NDCG > > > Diffs > ----- > > > datafu-pig/src/main/java/datafu/pig/stats/ExponentialWeightedLog2ScoringFunction.java > PRE-CREATION > datafu-pig/src/main/java/datafu/pig/stats/Log2ScoringFunction.java > PRE-CREATION > datafu-pig/src/main/java/datafu/pig/stats/Ndcg.java PRE-CREATION > datafu-pig/src/main/java/datafu/pig/stats/PositionScoringFunction.java > PRE-CREATION > datafu-pig/src/main/java/datafu/pig/stats/RangeScoringFunction.java > PRE-CREATION > datafu-pig/src/main/java/datafu/pig/stats/UnaryScoringFunction.java > PRE-CREATION > datafu-pig/src/main/java/datafu/pig/util/NumericalRange.java PRE-CREATION > datafu-pig/src/main/java/datafu/pig/util/RangeMap.java PRE-CREATION > datafu-pig/src/test/java/datafu/test/pig/stats/NdcgTests.java PRE-CREATION > datafu-pig/src/test/java/datafu/test/pig/util/TestRange.java PRE-CREATION > datafu-pig/src/test/java/datafu/test/pig/util/TestRangeMap.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/23384/diff/ > > > Testing > ------- > > Unit tests, Pig tests attached > > > Thanks, > > Joshua Hartman > >