Re: sorting groups

2015-06-16 Thread Michele Bertoni
Hi Fabian, My dataset is of this type: RegionType (Long, String, Long, Long, Char, Array[GValue]), where GValue is a case class implemented by GString(v:String) and GDouble(v:Double). I have two cases of sorting: In the first (topk) I have to group by the first field of the regions and sort by a set of f

Memory in local setting

2015-06-16 Thread Sebastian
Hi, Flink has memory problems when I run an algorithm from my local IDE on a 2GB graph. Is there any way that I can increase the memory given to Flink? Best, Sebastian Caused by: java.lang.RuntimeException: Memory ran out. numPartitions: 32 minPartition: 4 maxPartition: 4 number of overflow

Re: passing variable to filter function

2015-06-16 Thread Fabian Hueske
Hi, which version of Flink are you working with? The master (0.9-SNAPSHOT) has a RichFilterFunction [1]. Best, Fabian [1] https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/functions/RichFilterFunction.java 2015-06-16 23:52 GMT+02:00 Vinh June : >

passing variable to filter function

2015-06-16 Thread Vinh June
Hello, How do you pass a parameter to a filter function? With Map and Join, I can use withBroadcastSet to pass it to a RichMapFunction or RichJoinFunction, but with filter, how can I pass it? I would like to pass the variable so that I can use it as in line 60 here: http://pastebin.com/cFZVCLGZ Thanks

Re: sorting groups

2015-06-16 Thread Fabian Hueske
Hi, the error is related to the way you specify the grouping and the sorting key. The API is currently restricted in that you can only use a key selector function for the sorting key if you also used a key selector function for the grouping key. In Scala the use of key selector functions is

sorting groups

2015-06-16 Thread Michele Bertoni
Hi everybody, I am trying to sort a grouped dataset, but I am getting this error: Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: Sorting on KeySelector keys only works with KeySelector grouping. at org.apache.flink.api.scala.GroupedDataSet.sortGroup(Gr

Re: Choosing random element

2015-06-16 Thread Sachin Goel
Hi, if you're looking for an implementation, I wrote this in the context of a random k-means initialization. Take a look at the {{weightedFit}} function. https://github.com/sachingoel0101/flink/blob/clustering_initializations/flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/clustering/KMeansRan

Re: Choosing random element

2015-06-16 Thread Maximilian Alber
Thanks! Cheers, Max On Tue, Jun 16, 2015 at 11:01 AM, Till Rohrmann wrote: > This might help you [1]. > > Cheers, > Till > > [1] > http://stackoverflow.com/questions/2514061/how-to-pick-random-small-data-samples-using-map-reduce > > > On Tue, Jun 16, 2015 at 10:16 AM Maximilian Alber < > alber.m

Re: Choosing random element

2015-06-16 Thread Till Rohrmann
This might help you [1]. Cheers, Till [1] http://stackoverflow.com/questions/2514061/how-to-pick-random-small-data-samples-using-map-reduce On Tue, Jun 16, 2015 at 10:16 AM Maximilian Alber < alber.maximil...@gmail.com> wrote: > Hi Flinksters, > > again a similar problem. I would like to choose

Re: Help with Flink experimental Table API

2015-06-16 Thread Aljoscha Krettek
Yes, what I meant was to have a single bit mask that is written before all the fields are written. Then, for example, 1011 would mean that fields 1, 2, and 4 are non-null while field 3 is null. On Tue, 16 Jun 2015 at 10:24 Shiti Saxena wrote: > Can we use 0(false) and 1(true)? > > On Tue, Jun 16,
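
The bit-mask idea above can be illustrated with a minimal, standalone sketch. This is not Flink's TupleSerializer or RowSerializer code; the class `NullMask` and its methods are hypothetical names made up for illustration. It assumes the convention implied by the 1011 example, where field 1 maps to the least significant bit and a set bit means "non-null", and it assumes at most 32 fields so that the mask fits in one int.

```java
// Hypothetical helper, not part of Flink: encodes which fields of a row are null
// in a single int that could be written once before the field values.
final class NullMask {

    // Bit i (counting from the least significant bit) is set when fields[i] is non-null.
    public static int encode(Object[] fields) {
        if (fields.length > 32) {
            throw new IllegalArgumentException("an int mask covers at most 32 fields");
        }
        int mask = 0;
        for (int i = 0; i < fields.length; i++) {
            if (fields[i] != null) {
                mask |= 1 << i;
            }
        }
        return mask;
    }

    // Returns true when the field at the given zero-based index was null at encoding time.
    public static boolean isNull(int mask, int index) {
        return (mask & (1 << index)) == 0;
    }

    public static void main(String[] args) {
        // Fields 1, 2 and 4 are non-null, field 3 is null.
        Object[] row = {"a", 2L, null, 'c'};
        int mask = encode(row);
        System.out.println(Integer.toBinaryString(mask)); // prints 1011
    }
}
```

Compared with writing one boolean per field, this costs a fixed four bytes per row regardless of the number of fields (up to 32); a wider row would need a long or a byte array instead.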

Re: Help with Flink experimental Table API

2015-06-16 Thread Shiti Saxena
Can we use 0(false) and 1(true)? On Tue, Jun 16, 2015 at 1:32 PM, Aljoscha Krettek wrote: > One more thing, it would be good if the TupleSerializer didn't write a > boolean for every field. A single integer could be used where one bit > specifies if a given field is null or not. (Maybe we should

Choosing random element

2015-06-16 Thread Maximilian Alber
Hi Flinksters, again a similar problem. I would like to choose ONE random element out of a data set, without shuffling the whole set. Again I would like to have the element (mathematically) randomly chosen. Thanks! Cheers, Max
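
One common way to pick a single uniformly random element in one pass, without shuffling or materializing the data, is reservoir sampling with a reservoir of size one. The sketch below is plain Java over an Iterator and is not taken from the replies in this thread or from the Flink API; the class `ReservoirPick` and the method `pickOne` are hypothetical names. The i-th element seen replaces the current choice with probability 1/i, which leaves every element with the same overall probability of being returned.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Random;

// Hypothetical helper: single-pass uniform choice of one element (reservoir of size 1).
final class ReservoirPick {

    public static <T> T pickOne(Iterator<T> items, Random rng) {
        T chosen = null;
        int seen = 0;
        while (items.hasNext()) {
            T item = items.next();
            seen++;
            // nextInt(seen) is uniform over 0..seen-1, so this branch is taken
            // with probability 1/seen; the first element is always taken.
            if (rng.nextInt(seen) == 0) {
                chosen = item;
            }
        }
        if (seen == 0) {
            throw new NoSuchElementException("cannot pick from an empty input");
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c", "d");
        System.out.println(pickOne(data.iterator(), new Random()));
    }
}
```

For a partitioned data set, the same routine could be run once per partition while also returning the partition's element count; a final step would then choose among the per-partition winners with probability proportional to those counts, so that the overall choice stays uniform.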

Re: "No space left on device" IOException when using Cross operator

2015-06-16 Thread Mihail Vieru
Hi Stephan, thank you for your explanation. I thought I would be getting just 100MB of results after the Cross. This is why I used it. I will try something else then, most likely a Map on the input. Best, Mihail On 16.06.2015 04:27, Stephan Ewen wrote: Cross is a quadratic operation. As such

Re: Help with Flink experimental Table API

2015-06-16 Thread Aljoscha Krettek
One more thing, it would be good if the TupleSerializer didn't write a boolean for every field. A single integer could be used where one bit specifies if a given field is null or not. (Maybe we should also add this to the RowSerializer in the future.) On Tue, 16 Jun 2015 at 07:30 Aljoscha Krettek