Viktor's work does not replace minBy or maxBy, right (they are not agg
functions, but selectors)? I think we would retain those anyways. They also
make little sense in composite aggregations, because you can always only
use one of them...

So, the question is whether to keep min() max() or to have that only
accessible through the Viktor aggregations.


On Fri, Nov 28, 2014 at 4:32 PM, Ufuk Celebi <[email protected]> wrote:

> No, I didn't say that I want to overload min/max. I think Viktor's changes
> are exactly in line with what I said. min/max(X) or minBy/maxBy(X) could be
> (maybe deprecated) shortcuts for aggregate(max/min(X)).
>
> On Fri, Nov 28, 2014 at 2:59 PM, Fabian Hueske <[email protected]> wrote:
>
>> I am not sure about this. With the new aggregations that Viktor is
>> working on, things become pretty obvious IMO.
>>
>> data.aggregate(count(), sum(2), min(0));
>> basically shows the structure of the result.
>>
>> I would not go and overload min() and max() in different contexts.
>>
>> 2014-11-28 14:53 GMT+01:00 Ufuk Celebi <[email protected]>:
>>
>>> This is not the first time that people confused this. I think most
>>> people expect the maxBy and minBy behaviour for max/min.
>>>
>>> Maybe it makes sense to move back to the old aggregations API, where you
>>> call the aggregate method and specify as an argument, which type of
>>> aggregation should be performed. I didn't really like this, but if the
>>> current state is confusing people, we should consider to change it again.
>>>
>>> On Fri, Nov 28, 2014 at 12:31 PM, Maximilian Alber <
>>> [email protected]> wrote:
>>>
>>>> Hi Fabian!
>>>>
>>>> Ok, thanks! Now it works.
>>>>
>>>> Cheers,
>>>> Max
>>>>
>>>> On Fri, Nov 28, 2014 at 1:47 AM, Fabian Hueske <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Max,
>>>>>
>>>>> the max(i) function does not select the Tuple with the maximum value.
>>>>> Instead, it builds a new Tuple with the maximum value for the i-th
>>>>> attribute. The values of the Tuple's other fields are not defined (in
>>>>> practice they are set to the value of the last Tuple, however the order of
>>>>> Tuples is not defined).
>>>>>
>>>>> The Java API features minBy and maxBy transformations that should do
>>>>> what you are looking for.
>>>>> You can reimplement them for Scala as a simple GroupReduce (or Reduce)
>>>>> function or use the Java function in you Scala code.
>>>>>
>>>>> Best, Fabian
>>>>>
>>>>>
>>>>>
>>>>> 2014-11-27 16:14 GMT+01:00 Maximilian Alber <
>>>>> [email protected]>:
>>>>>
>>>>>> Hi Flinksters,
>>>>>>
>>>>>> I don't if I made something wrong, but the code seems fine. Basically
>>>>>> the max function does extract a wrong element.
>>>>>>
>>>>>> The error does just happen with my real data, not if I inject some
>>>>>> sequence into costs.
>>>>>>
>>>>>> The problem is that the according tuple value at position is wrong.
>>>>>> The maximum of the second part is detected correctly.
>>>>>>
>>>>>> The code snippet:
>>>>>>
>>>>>> val maxCost = costs map {x => (x.id, x.value)} max(1)
>>>>>>
>>>>>> (costs map {x => (x.id, x.value)} map {_ toString} map {"first: "+ _
>>>>>> }) union (maxCost map {_ toString} map {"second: "+ _ }) writeAsText
>>>>>> config.outFile
>>>>>>
>>>>>> The output:
>>>>>>
>>>>>> File content:
>>>>>> first: (47,42.066986)
>>>>>> first: (11,4.448255)
>>>>>> first: (40,42.06696)
>>>>>> first: (3,0.96731037)
>>>>>> first: (31,42.06443)
>>>>>> first: (18,23.753584)
>>>>>> first: (45,42.066986)
>>>>>> first: (24,41.44347)
>>>>>> first: (13,6.1290965)
>>>>>> first: (19,26.42948)
>>>>>> first: (1,0.9665109)
>>>>>> first: (28,42.04222)
>>>>>> first: (5,1.2986814)
>>>>>> first: (44,42.066986)
>>>>>> first: (7,1.8681992)
>>>>>> first: (10,3.0981758)
>>>>>> first: (41,42.066982)
>>>>>> first: (48,42.066986)
>>>>>> first: (21,33.698544)
>>>>>> first: (38,42.066963)
>>>>>> first: (30,42.06153)
>>>>>> first: (26,41.950237)
>>>>>> first: (43,42.066986)
>>>>>> first: (16,14.754578)
>>>>>> first: (15,10.571205)
>>>>>> first: (34,42.06672)
>>>>>> first: (29,42.055424)
>>>>>> first: (35,42.066845)
>>>>>> first: (8,1.9513339)
>>>>>> first: (22,38.189228)
>>>>>> first: (46,42.066986)
>>>>>> first: (2,0.966511)
>>>>>> first: (27,42.013676)
>>>>>> first: (12,5.4271784)
>>>>>> first: (42,42.066986)
>>>>>> first: (4,1.01561)
>>>>>> first: (14,7.4410205)
>>>>>> first: (25,41.803535)
>>>>>> first: (6,1.6827519)
>>>>>> first: (36,42.06694)
>>>>>> first: (20,28.834095)
>>>>>> first: (32,42.06577)
>>>>>> first: (49,42.066986)
>>>>>> first: (33,42.0664)
>>>>>> first: (9,2.2420964)
>>>>>> first: (37,42.066967)
>>>>>> first: (0,0.9665109)
>>>>>> first: (17,19.016153)
>>>>>> first: (39,42.06697)
>>>>>> first: (23,40.512672)
>>>>>> second: (23,42.066986)
>>>>>>
>>>>>> File content end.
>>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>> Cheers,
>>>>>> Max
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Reply via email to