I am not sure about this. With the new aggregations that Viktor is working on, things become pretty obvious IMO.
data.aggregate(count(), sum(2), min(0)); basically shows the structure of the result.
I would not go and overload min() and max() in different contexts.

2014-11-28 14:53 GMT+01:00 Ufuk Celebi <[email protected]>:

> This is not the first time that people confused this. I think most people
> expect the maxBy and minBy behaviour for max/min.
>
> Maybe it makes sense to move back to the old aggregations API, where you
> call the aggregate method and specify as an argument which type of
> aggregation should be performed. I didn't really like this, but if the
> current state is confusing people, we should consider changing it again.
>
> On Fri, Nov 28, 2014 at 12:31 PM, Maximilian Alber <
> [email protected]> wrote:
>
>> Hi Fabian!
>>
>> Ok, thanks! Now it works.
>>
>> Cheers,
>> Max
>>
>> On Fri, Nov 28, 2014 at 1:47 AM, Fabian Hueske <[email protected]>
>> wrote:
>>
>>> Hi Max,
>>>
>>> the max(i) function does not select the Tuple with the maximum value.
>>> Instead, it builds a new Tuple with the maximum value for the i-th
>>> attribute. The values of the Tuple's other fields are not defined (in
>>> practice they are set to the value of the last Tuple, however the order of
>>> Tuples is not defined).
>>>
>>> The Java API features minBy and maxBy transformations that should do
>>> what you are looking for.
>>> You can reimplement them for Scala as a simple GroupReduce (or Reduce)
>>> function or use the Java function in your Scala code.
>>>
>>> Best, Fabian
>>>
>>> 2014-11-27 16:14 GMT+01:00 Maximilian Alber <[email protected]>:
>>>
>>>> Hi Flinksters,
>>>>
>>>> I don't know if I did something wrong, but the code seems fine. Basically
>>>> the max function does extract a wrong element.
>>>>
>>>> The error does just happen with my real data, not if I inject some
>>>> sequence into costs.
>>>>
>>>> The problem is that the corresponding tuple value at position is wrong. The
>>>> maximum of the second part is detected correctly.
>>>>
>>>> The code snippet:
>>>>
>>>> val maxCost = costs map {x => (x.id, x.value)} max(1)
>>>>
>>>> (costs map {x => (x.id, x.value)} map {_ toString} map {"first: "+ _
>>>> }) union (maxCost map {_ toString} map {"second: "+ _ }) writeAsText
>>>> config.outFile
>>>>
>>>> The output:
>>>>
>>>> File content:
>>>> first: (47,42.066986)
>>>> first: (11,4.448255)
>>>> first: (40,42.06696)
>>>> first: (3,0.96731037)
>>>> first: (31,42.06443)
>>>> first: (18,23.753584)
>>>> first: (45,42.066986)
>>>> first: (24,41.44347)
>>>> first: (13,6.1290965)
>>>> first: (19,26.42948)
>>>> first: (1,0.9665109)
>>>> first: (28,42.04222)
>>>> first: (5,1.2986814)
>>>> first: (44,42.066986)
>>>> first: (7,1.8681992)
>>>> first: (10,3.0981758)
>>>> first: (41,42.066982)
>>>> first: (48,42.066986)
>>>> first: (21,33.698544)
>>>> first: (38,42.066963)
>>>> first: (30,42.06153)
>>>> first: (26,41.950237)
>>>> first: (43,42.066986)
>>>> first: (16,14.754578)
>>>> first: (15,10.571205)
>>>> first: (34,42.06672)
>>>> first: (29,42.055424)
>>>> first: (35,42.066845)
>>>> first: (8,1.9513339)
>>>> first: (22,38.189228)
>>>> first: (46,42.066986)
>>>> first: (2,0.966511)
>>>> first: (27,42.013676)
>>>> first: (12,5.4271784)
>>>> first: (42,42.066986)
>>>> first: (4,1.01561)
>>>> first: (14,7.4410205)
>>>> first: (25,41.803535)
>>>> first: (6,1.6827519)
>>>> first: (36,42.06694)
>>>> first: (20,28.834095)
>>>> first: (32,42.06577)
>>>> first: (49,42.066986)
>>>> first: (33,42.0664)
>>>> first: (9,2.2420964)
>>>> first: (37,42.066967)
>>>> first: (0,0.9665109)
>>>> first: (17,19.016153)
>>>> first: (39,42.06697)
>>>> first: (23,40.512672)
>>>> second: (23,42.066986)
>>>>
>>>> File content end.
>>>>
>>>> Thanks!
>>>> Cheers,
>>>> Max
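
The difference between max(i) and maxBy described in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions: plain Scala collections stand in for a Flink DataSet, and the names MaxVsMaxBy, fieldMax and tupleMaxBy are hypothetical helpers, not part of the Flink API. fieldMax copies the "other fields come from the last Tuple" behaviour mentioned above, which the thread describes as what happens in practice, not as a guarantee.

```scala
// Plain Scala collections stand in for a Flink DataSet (assumption, so the
// sketch runs without Flink). All names below are hypothetical.
object MaxVsMaxBy {
  type Cost = (Int, Float)

  // Mimics the described max(1): field 1 holds the true maximum, while
  // field 0 is simply taken from whichever tuple is processed last.
  def fieldMax(costs: Seq[Cost]): Cost =
    costs.reduce((a, b) => (b._1, math.max(a._2, b._2)))

  // Mimics maxBy(1) written as a plain reduce: the result is always one
  // complete input tuple, the one carrying the maximum in field 1.
  def tupleMaxBy(costs: Seq[Cost]): Cost =
    costs.reduce((a, b) => if (b._2 > a._2) b else a)

  def main(args: Array[String]): Unit = {
    // Four tuples taken from the output in the original mail, ending with id 23.
    val costs: Seq[Cost] =
      Seq((47, 42.066986f), (11, 4.448255f), (40, 42.06696f), (23, 40.512672f))
    println(fieldMax(costs))   // id of the last tuple paired with the maximum value
    println(tupleMaxBy(costs)) // the whole tuple that holds the maximum value
  }
}
```

Under these assumptions, fieldMax pairs id 23 with 42.066986, which matches the "second:" line in the reported output, while tupleMaxBy returns the tuple with id 47 unchanged.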
