No, I didn't say that I want to overload min/max. I think Viktor's changes are exactly in line with what I said. min/max(X) or minBy/maxBy(X) could be (maybe deprecated) shortcuts for aggregate(max/min(X)).
On Fri, Nov 28, 2014 at 2:59 PM, Fabian Hueske <[email protected]> wrote:
> I am not sure about this. With the new aggregations that Viktor is working
> on, things become pretty obvious IMO.
>
> data.aggregate(count(), sum(2), min(0));
> basically shows the structure of the result.
>
> I would not go and overload min() and max() in different contexts.
>
> 2014-11-28 14:53 GMT+01:00 Ufuk Celebi <[email protected]>:
>
>> This is not the first time that people have confused this. I think most
>> people expect the maxBy and minBy behaviour for max/min.
>>
>> Maybe it makes sense to move back to the old aggregations API, where you
>> call the aggregate method and specify, as an argument, which type of
>> aggregation should be performed. I didn't really like this, but if the
>> current state is confusing people, we should consider changing it again.
>>
>> On Fri, Nov 28, 2014 at 12:31 PM, Maximilian Alber <[email protected]> wrote:
>>
>>> Hi Fabian!
>>>
>>> Ok, thanks! Now it works.
>>>
>>> Cheers,
>>> Max
>>>
>>> On Fri, Nov 28, 2014 at 1:47 AM, Fabian Hueske <[email protected]> wrote:
>>>
>>>> Hi Max,
>>>>
>>>> the max(i) function does not select the Tuple with the maximum value.
>>>> Instead, it builds a new Tuple with the maximum value for the i-th
>>>> attribute. The values of the Tuple's other fields are not defined (in
>>>> practice they are set to the value of the last Tuple, however the order of
>>>> Tuples is not defined).
>>>>
>>>> The Java API features minBy and maxBy transformations that should do
>>>> what you are looking for.
>>>> You can reimplement them for Scala as a simple GroupReduce (or Reduce)
>>>> function or use the Java function in your Scala code.
>>>>
>>>> Best, Fabian
>>>>
>>>> 2014-11-27 16:14 GMT+01:00 Maximilian Alber <[email protected]>:
>>>>
>>>>> Hi Flinksters,
>>>>>
>>>>> I don't know if I did something wrong, but the code seems fine. Basically
>>>>> the max function extracts a wrong element.
>>>>>
>>>>> The error only happens with my real data, not if I inject some
>>>>> sequence into costs.
>>>>>
>>>>> The problem is that the corresponding tuple value at position 0 is wrong.
>>>>> The maximum of the second part is detected correctly.
>>>>>
>>>>> The code snippet:
>>>>>
>>>>> val maxCost = costs map {x => (x.id, x.value)} max(1)
>>>>>
>>>>> (costs map {x => (x.id, x.value)} map {_ toString} map {"first: "+ _
>>>>> }) union (maxCost map {_ toString} map {"second: "+ _ }) writeAsText
>>>>> config.outFile
>>>>>
>>>>> The output:
>>>>>
>>>>> File content:
>>>>> first: (47,42.066986)
>>>>> first: (11,4.448255)
>>>>> first: (40,42.06696)
>>>>> first: (3,0.96731037)
>>>>> first: (31,42.06443)
>>>>> first: (18,23.753584)
>>>>> first: (45,42.066986)
>>>>> first: (24,41.44347)
>>>>> first: (13,6.1290965)
>>>>> first: (19,26.42948)
>>>>> first: (1,0.9665109)
>>>>> first: (28,42.04222)
>>>>> first: (5,1.2986814)
>>>>> first: (44,42.066986)
>>>>> first: (7,1.8681992)
>>>>> first: (10,3.0981758)
>>>>> first: (41,42.066982)
>>>>> first: (48,42.066986)
>>>>> first: (21,33.698544)
>>>>> first: (38,42.066963)
>>>>> first: (30,42.06153)
>>>>> first: (26,41.950237)
>>>>> first: (43,42.066986)
>>>>> first: (16,14.754578)
>>>>> first: (15,10.571205)
>>>>> first: (34,42.06672)
>>>>> first: (29,42.055424)
>>>>> first: (35,42.066845)
>>>>> first: (8,1.9513339)
>>>>> first: (22,38.189228)
>>>>> first: (46,42.066986)
>>>>> first: (2,0.966511)
>>>>> first: (27,42.013676)
>>>>> first: (12,5.4271784)
>>>>> first: (42,42.066986)
>>>>> first: (4,1.01561)
>>>>> first: (14,7.4410205)
>>>>> first: (25,41.803535)
>>>>> first: (6,1.6827519)
>>>>> first: (36,42.06694)
>>>>> first: (20,28.834095)
>>>>> first: (32,42.06577)
>>>>> first: (49,42.066986)
>>>>> first: (33,42.0664)
>>>>> first: (9,2.2420964)
>>>>> first: (37,42.066967)
>>>>> first: (0,0.9665109)
>>>>> first: (17,19.016153)
>>>>> first: (39,42.06697)
>>>>> first: (23,40.512672)
>>>>> second: (23,42.066986)
>>>>>
>>>>> File content end.
>>>>>
>>>>> Thanks!
>>>>> Cheers,
>>>>> Max
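
To make the difference discussed in this thread concrete, here is a minimal sketch of the two behaviours: the field-wise max(1) aggregation versus a maxBy-style selection. It uses plain Scala collections instead of the Flink API, so it only imitates the semantics Fabian describes; the names MaxVsMaxBy, fieldMax and selectMax are made up for illustration and are not part of Flink. The sample tuples are taken from Max's output above.

```scala
// Illustration only: plain Scala collections stand in for a Flink DataSet.
// MaxVsMaxBy, fieldMax and selectMax are hypothetical names, not Flink API.
object MaxVsMaxBy {

  // A few (id, value) tuples taken from the output quoted in the thread.
  val costs: Seq[(Int, Float)] = Seq(
    (47, 42.066986f),
    (11, 4.448255f),
    (40, 42.06696f),
    (23, 40.512672f)
  )

  // Imitates max(1) as described in the thread: only field 1 is aggregated.
  // Field 0 simply carries over from whichever tuple was processed last.
  def fieldMax(data: Seq[(Int, Float)]): (Int, Float) =
    data.reduce((a, b) => (b._1, math.max(a._2, b._2)))

  // Imitates maxBy(1): returns an existing tuple whose field 1 is maximal.
  // On ties the tuple seen first is kept.
  def selectMax(data: Seq[(Int, Float)]): (Int, Float) =
    data.reduce((a, b) => if (b._2 > a._2) b else a)

  def main(args: Array[String]): Unit = {
    println("field-wise max: " + fieldMax(costs))
    println("maxBy-style:    " + selectMax(costs))
  }
}
```

With this input, fieldMax pairs the id 23 of the last tuple with the maximum value 42.066986, which is the surprising result Max reported, while selectMax returns the existing tuple with id 47. In a real Flink job the order of tuples is not defined, so the id returned by the field-wise variant could be any of them.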
