Re: Nested Iterations Outlook

2015-07-22 Thread Maximilian Alber
run into merge conflicts. Also keep in >>> mind that it is work in progress :) >>> >>> On Mon, Jul 20, 2015 at 4:15 PM, Maximilian Alber < >>> alber.maximil...@gmail.com> wrote: >>> >>>> Thanks! >>>> >>>> Ok, cool.

Re: Nested Iterations Outlook

2015-07-20 Thread Maximilian Alber
submit to the cluster again. > > Cheers, > Max > > On Mon, Jul 20, 2015 at 3:31 PM, Maximilian Alber < > alber.maximil...@gmail.com> wrote: > >> Oh sorry, my fault. When I wrote it, I had iterations in mind. >> >> What I actually wanted to say, how &qu

Re: Nested Iterations Outlook

2015-07-20 Thread Maximilian Alber
d iterations. > > Hope this clarifies. Otherwise, please restate your question because I > might have misunderstood. > > Cheers, > Max > > > On Mon, Jul 20, 2015 at 12:11 PM, Maximilian Alber < > alber.maximil...@gmail.com> wrote: > >> Thanks for the answ

Re: Nested Iterations Outlook

2015-07-20 Thread Maximilian Alber
or nested iterations. > > Cheers, > Max > > On Fri, Jul 17, 2015 at 4:26 PM, Maximilian Alber < > alber.maximil...@gmail.com> wrote: > >> Hi Flinksters, >> >> as far as I know, there is still no support for nested iterations >> planned. Am I right?

Nested Iterations Outlook

2015-07-17 Thread Maximilian Alber
Hi Flinksters, as far as I know, there is still no support for nested iterations planned. Am I right? So my question is how such use cases should be handled in the future. More specifically: once pinning/caching is available, do you suggest using that feature and programming in "Spark" style? Or is th

Re: Scala: registerAggregationConvergenceCriterion

2015-07-17 Thread Maximilian Alber
, then the > iteration will be terminated. If not and if the maximum number of > iterations has not been exceeded, then the next iteration is started. > > Cheers, > Till > > > On Fri, Jul 17, 2015 at 3:43 PM, Maximilian Alber < > alber.maximil...@gmail.com> wrot

Scala: registerAggregationConvergenceCriterion

2015-07-17 Thread Maximilian Alber
Hi Flinksters, I am trying to use BulkIterations with a convergence criterion. Unfortunately, I'm not sure how to use them and I couldn't find a nice example. Here are two code snippets and the resulting error; maybe someone can help. I'm working on the current branch. Example1: if(true){ val d

Python vs Scala - Performance

2015-06-29 Thread Maximilian Alber
Hi Flinksters, we recently had a discussion in our working group about which language we should use with Flink. To get to the point: most people would like to use Python because they are familiar with it and there is a nice scientific stack to, e.g., print and analyse the results. But our concern is t

Re: Documentation Error

2015-06-25 Thread Maximilian Alber
Hi Robert, thanks for the offer. At the moment I'm too busy. But maybe when we begin to use Flink for ML in the BBDC. Cheers, Max On Thu, Jun 25, 2015 at 2:48 PM, Robert Metzger wrote: > Hey Maximilian Alber, > I don't know if you are interested in contributing to Flink, bu

Re: Documentation Error

2015-06-25 Thread Maximilian Alber
e already familiar > with other big data projects. It would be great if somebody could spare > time to work on that. > > On Thu, Jun 25, 2015 at 2:31 PM, Maximilian Alber < > alber.maximil...@gmail.com> wrote: > >> Something different. I just read through the Spark

Re: Documentation Error

2015-06-25 Thread Maximilian Alber
FAQs. The two > have already diverged quite a bit and merging them is not trivial. > > > > On Thu, Jun 25, 2015 at 11:40 AM, Maximilian Alber < > alber.maximil...@gmail.com> wrote: > > Another one: on > > http://ci.apache.org/projects/flink/flink-docs-master/faq.html

Re: Documentation Error

2015-06-25 Thread Maximilian Alber
master and for the 0.9.1 release. > > Cheers, > Max > > On Tue, Jun 23, 2015 at 5:09 PM, Maximilian Alber < > alber.maximil...@gmail.com> wrote: > >> Hi Flinksters, >> >> just some minor: >> >> http://ci.apache.org/projects/flink/fl

Re: Random Shuffling

2015-06-24 Thread Maximilian Alber
such as sampling with replacement. > > --sebastian > > > On 24.06.2015 10:38, Maximilian Alber wrote: > >> That's not the point. In Machine Learning one often divides a data set X >> into f.e. three sets, one for the training, one for the validation, one >>

Re: Random Shuffling

2015-06-24 Thread Maximilian Alber
whole process randomly. Cheers, Max On Wed, Jun 24, 2015 at 9:51 AM, Stephan Ewen wrote: > If you do "rebalance()", it will redistribute elements round-robin > fashion, which should give you very even partition sizes. > > > On Tue, Jun 23, 2015 at 11:51 AM, Maxi

Documentation Error

2015-06-23 Thread Maximilian Alber
Hi Flinksters, just some minor: http://ci.apache.org/projects/flink/flink-docs-master/setup/yarn_setup.html in the second code sample should be ./bin/flink run -m yarn-cluster -yn 4 -yjm 1024 -ytm 4096 ./examples/flink-java-examples-0.9-SNAPSHOT-WordCount.jar instead of: ./bin/flink -m yarn-clu

Re: Building Yarn-Version of Flink

2015-06-23 Thread Maximilian Alber
; when you build for Hadoop 1: -Dhadoop.profile=1. > > Cheers, > The other Max > > On Tue, Jun 23, 2015 at 2:03 PM, Maximilian Alber < > alber.maximil...@gmail.com> wrote: > >> Hi Flinksters, >> >> I just tried to build the current yarn version of Flink. The s

Building Yarn-Version of Flink

2015-06-23 Thread Maximilian Alber
Hi Flinksters, I just tried to build the current yarn version of Flink. The second error is probably because Maven is an older version. But the first one seems to be an error. albermax@hadoop1:~/bumpboost/working/programs/flink/incubator-flink$ mvn clean package -DskipTests -Dhadoop.profile=

Re: Random Shuffling

2015-06-23 Thread Maximilian Alber
Partitions); > > } > > should do the trick. > > -Matthias > > On 06/15/2015 05:41 PM, Maximilian Alber wrote: > > Thanks! > > > > Ok, so for a random shuffle I need partitionCustom. But in that case the > > data might be out of balance then? > > > >

Re: Random Selection

2015-06-23 Thread Maximilian Alber
hat one, by any chance? > > On Mon, Jun 15, 2015 at 5:31 PM, Maximilian Alber < > alber.maximil...@gmail.com> wrote: > >> Hi everyone! >> Thanks! It seems the variable is what causes the problem. Making an inner >> class solved the issue. >> Cheers, >>

Re: Choosing random element

2015-06-16 Thread Maximilian Alber
Thanks! Cheers, Max On Tue, Jun 16, 2015 at 11:01 AM, Till Rohrmann wrote: > This might help you [1]. > > Cheers, > Till > > [1] > http://stackoverflow.com/questions/2514061/how-to-pick-random-small-data-samples-using-map-reduce > > > On Tue, Jun 16, 201

Choosing random element

2015-06-16 Thread Maximilian Alber
Hi Flinksters, again a similar problem. I would like to choose ONE random element out of a data set, without shuffling the whole set. Again I would like to have the element (mathematically) randomly chosen. Thanks! Cheers, Max
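
One way to pick a single element uniformly at random without shuffling is reservoir sampling with a reservoir of size one. The sketch below is only an illustration on a plain Scala iterator, not Flink API code and not necessarily what the linked answer proposes; the names `PickOne` and `pickOne` are made up for this example.

```scala
import scala.util.Random

object PickOne {
  // Single-pass uniform choice: the i-th element (1-based) replaces the
  // current pick with probability 1/i, so neither a shuffle nor the
  // total size is needed up front.
  def pickOne[T](items: Iterator[T], rng: Random): Option[T] = {
    var chosen: Option[T] = None
    var seen = 0
    for (item <- items) {
      seen += 1
      // nextInt(seen) is uniform on 0 until seen, so it is 0 with probability 1/seen.
      if (rng.nextInt(seen) == 0) chosen = Some(item)
    }
    chosen
  }
}
```

In a distributed setting one would presumably run this per partition and then choose among the per-partition picks with probability proportional to each partition's element count; that combination step is an assumption and is not shown here.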

Re: Random Shuffling

2015-06-15 Thread Maximilian Alber
filter the individual splits. Depending on your split ID distribution you > will have differently sized splits. > > Cheers, > Till > > On Mon, Jun 15, 2015 at 1:50 PM Maximilian Alber > alber.maximil...@gmail.com <http://mailto:alber.maximil...@gmail.com> > wrote: > >

Re: Random Selection

2015-06-15 Thread Maximilian Alber
mpanion object of > scala.util.Random. Try to create an instance of the scala.util.Random > class and use this instance within your RichFilterFunction to generate > the random numbers. > > Cheers, > Till > > On Mon, Jun 15, 2015 at 1:56 PM Maximilian Alber > alber.maximil.

Random Selection

2015-06-15 Thread Maximilian Alber
Hi Flinksters, I would like to randomly choose an element of my data set. But somehow I cannot use scala.util inside my filter functions: val sample_x = X filter(new RichFilterFunction[Vector](){ var i: Int = -1 override def open(config: Configuration) = { i = scal

Random Shuffling

2015-06-15 Thread Maximilian Alber
Hi Flinksters, I would like to shuffle the elements in my data set and then split it in two according to some ratio. Each element in the data set has a unique ID. Is there a nice way to do it with the Flink API? (It would be nice to have guaranteed random shuffling.) Thanks! Cheers, Max
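
To make the request concrete, here is a minimal sketch of "shuffle, then split by a ratio" on a local Scala collection. It does not use the Flink API and says nothing about how the thread resolved the question; `ShuffleSplit`, `shuffleAndSplit`, and the `seed` parameter are hypothetical names chosen for this illustration.

```scala
import scala.util.Random

object ShuffleSplit {
  // Shuffle `data` with a seeded generator, then cut it at `ratio`
  // (a fraction between 0.0 and 1.0) into two parts.
  def shuffleAndSplit[T](data: Seq[T], ratio: Double, seed: Long): (Seq[T], Seq[T]) = {
    require(ratio >= 0.0 && ratio <= 1.0, "ratio must lie in [0, 1]")
    val shuffled = new Random(seed).shuffle(data)
    // Number of elements that go into the first part.
    val cut = math.round(data.size * ratio).toInt
    shuffled.splitAt(cut)
  }
}
```

For example, `ShuffleSplit.shuffleAndSplit(1 to 10, 0.7, 42L)` yields a first part with seven elements and a second part with three. Fixing the seed makes the split reproducible, which is often wanted for training/validation splits.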

Re: GroupedDataset collect

2015-06-11 Thread Maximilian Alber
q(0)._2 > val values = seq.map(_._1) > > out.collect((key, values)) > } > }.collect() > > Then you can collect the data as (key1, (values, …), (key2, (values, …), > (key3, (values, …), ... > > Regards, > Chiwan Park > > > On Jun 11, 2015, at 11:01
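
The result shape described in the reply, one (key, values) pair per group, can be illustrated on a local collection. This is a plain Scala sketch rather than the Flink code from the thread; `GroupAndCollect` and `groupValues` are hypothetical names, and the tuple layout (value first, key second) is taken from the quoted `_._1` / `_._2` accessors.

```scala
object GroupAndCollect {
  // Group (value, key) pairs by their second field and keep only the
  // values of each group, preserving their original order.
  def groupValues[V, K](pairs: Seq[(V, K)]): Map[K, Seq[V]] =
    pairs.groupBy(_._2).map { case (key, group) => (key, group.map(_._1)) }
}
```

Calling `GroupAndCollect.groupValues(Seq((1, "a"), (2, "b"), (3, "a")))` gives a map in which `"a"` maps to `Seq(1, 3)` and `"b"` maps to `Seq(2)`.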

Re: Flink-ML as Dependency

2015-06-11 Thread Maximilian Alber
Well then, I should update ;-) On Thu, Jun 11, 2015 at 4:01 PM, Till Rohrmann wrote: > Hmm then I assume that version 2 can properly handle maven property > variables. > > > On Thu, Jun 11, 2015 at 3:05 PM Maximilian Alber < > alber.maximil...@gmail.com> wrote: >

GroupedDataset collect

2015-06-11 Thread Maximilian Alber
Hi Flinksters, I tried to call collect on a grouped data set, somehow it did not work. Is this intended? If yes, why? Code snippet: // group a data set according to second field: val grouped_ds = cross_ds.groupBy(1) println("After groupBy: "+grouped_ds.collect()) Error: [ant:scalac]

Re: Flink-ML as Dependency

2015-06-11 Thread Maximilian Alber
and check whether the > error still occurs? Simply call mvn clean install -DskipTests > -Dmaven.javadoc.skip=true from the root directory of Flink. > > Cheers, > Till > > On Wed, Jun 10, 2015 at 3:38 PM Maximilian Alber > alber.maximil...@gmail.com <http://mailto:alber.

Re: Flink-ML as Dependency

2015-06-11 Thread Maximilian Alber
ariable manually and set it to 2.10. > > Cheers, > Till > > > On Wed, Jun 10, 2015 at 3:38 PM Maximilian Alber < > alber.maximil...@gmail.com> wrote: > >> Hi Flinksters, >> >> I would like to test FlinkML. Unfortunately, I get an error when >> in

Flink-ML as Dependency

2015-06-10 Thread Maximilian Alber
Hi Flinksters, I would like to test FlinkML. Unfortunately, I get an error when including it in my project. It might be my error as I'm not experienced with Gradle, but Google did not make me any wiser. My build.gradle looks as follows: apply plugin: 'java' apply plugin: 'scala' //sourceCompatibilit