RE: How can I combine two datasets into one dataset and execute multiple conditions at the same time

2015-06-11 Thread hagersaleh
apply is not found in Flink, and how can I execute this query? SELECT employees.last_name FROM employees E, departments D WHERE (D.department_id = E.department_id AND E.job_id = 'AC_ACCOUNT' AND D.location = 2400) OR (E.department_id = D.department_id AND E.salary > 6 AND D.location = 2400);

RE: Load balancing

2015-06-11 Thread Kruse, Sebastian
Hi Gianmarco, Thanks for the pointer! I had a quick look at the paper, but unfortunately I don’t see a connection to my problem. I have a batch job, and the elements in my dataset need quadratically more processing time depending on their size. The largest ones, which cause higher-than-average load…

RE: How can I combine two datasets into one dataset and execute multiple conditions at the same time

2015-06-11 Thread Kruse, Sebastian
Hi, You might want to translate your SQL statement into an expression of the relational algebra first [1]. This expression can be expressed with Flink's operators in a straightforward manner. In the end, it will look something like this: employees.filter(_.job_id == ...).join(departments.filter(…)…
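[Editor's sketch] A minimal, self-contained version of the translation Sebastian describes, for the query in this thread. The Employee and Department case classes, the input data, and the salary threshold of 6000 (the value in the original mail is truncated) are all illustrative assumptions, not taken from the thread:

    import org.apache.flink.api.scala._

    case class Employee(lastName: String, departmentId: Int, jobId: String, salary: Double)
    case class Department(departmentId: Int, location: Int)

    object SqlAsDataSetJob {
      def main(args: Array[String]): Unit = {
        val env = ExecutionEnvironment.getExecutionEnvironment

        // Made-up inputs; a real job would read these from files or tables.
        val employees = env.fromElements(
          Employee("Smith", 1, "AC_ACCOUNT", 5000.0),
          Employee("Jones", 2, "IT_PROG", 9000.0))
        val departments = env.fromElements(
          Department(1, 2400),
          Department(2, 1700))

        // The selection D.location = 2400 is pushed below the join;
        // the join key expresses D.department_id = E.department_id.
        val lastNames = employees
          .join(departments.filter(_.location == 2400))
          .where(_.departmentId).equalTo(_.departmentId)
          // remaining disjunction: E.job_id = 'AC_ACCOUNT' OR E.salary > 6000
          .filter { case (e, _) => e.jobId == "AC_ACCOUNT" || e.salary > 6000 }
          .map { case (e, _) => e.lastName }

        lastNames.print()
      }
    }

Filtering the departments before the join keeps the join input small; the OR between the remaining predicates stays in a filter after the join because it is not part of the join condition.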

Re: GroupedDataset collect

2015-06-11 Thread Maximilian Alber
Ok! Thank you! Cheers, Max On Thu, Jun 11, 2015 at 4:34 PM, Chiwan Park wrote: > Hi. You cannot collect a grouped dataset directly. You can collect grouped > data by using the reduceGroup method. > The following code is an example: > > import org.apache.flink.util.Collector > val result = grouped_ds.reduceGroup…

Re: Flink-ML as Dependency

2015-06-11 Thread Maximilian Alber
Well then, I should update ;-) On Thu, Jun 11, 2015 at 4:01 PM, Till Rohrmann wrote: > Hmm, then I assume that version 2 can properly handle Maven property > variables. > > > On Thu, Jun 11, 2015 at 3:05 PM Maximilian Alber < > alber.maximil...@gmail.com> wrote: > >> Hi Till, >> >> I use the standard…

Re: GroupedDataset collect

2015-06-11 Thread Chiwan Park
Hi. You cannot collect a grouped dataset directly. You can collect grouped data by using the reduceGroup method. The following code is an example: import org.apache.flink.util.Collector val result = grouped_ds.reduceGroup { (in, out: Collector[(Int, Seq[Int])]) => { val seq = in.toSeq // I assumed t…
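[Editor's sketch] A self-contained version along the lines of Chiwan's truncated snippet. It assumes the grouped dataset holds (Int, Int) tuples grouped by the second field; the input data and the shape of the emitted (key, values) pairs are made up for illustration:

    import org.apache.flink.api.scala._
    import org.apache.flink.util.Collector

    object GroupedCollectExample {
      def main(args: Array[String]): Unit = {
        val env = ExecutionEnvironment.getExecutionEnvironment
        val cross_ds = env.fromElements((1, 10), (2, 10), (3, 20))

        // group by the second tuple field, as in the original snippet
        val grouped_ds = cross_ds.groupBy(1)

        // a grouped dataset has no collect(); reduce each group to one element first
        val result = grouped_ds.reduceGroup {
          (in: Iterator[(Int, Int)], out: Collector[(Int, Seq[Int])]) =>
            val seq = in.toSeq
            out.collect((seq.head._2, seq.map(_._1)))
        }

        println("After groupBy: " + result.collect())
      }
    }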

Re: Flink-ML as Dependency

2015-06-11 Thread Till Rohrmann
Hmm, then I assume that version 2 can properly handle Maven property variables. On Thu, Jun 11, 2015 at 3:05 PM Maximilian Alber wrote: > Hi Till, > > I use the standard one for Ubuntu 15.04, which is 1.5. > > That did not make any difference. > > Thanks and Cheers, > Max > > On Thu, Jun 11, 2015

GroupedDataset collect

2015-06-11 Thread Maximilian Alber
Hi Flinksters, I tried to call collect on a grouped data set, but somehow it did not work. Is this intended? If yes, why? Code snippet: // group a data set according to the second field: val grouped_ds = cross_ds.groupBy(1) println("After groupBy: " + grouped_ds.collect()) Error: [ant:scalac]

Re: Flink-ML as Dependency

2015-06-11 Thread Maximilian Alber
Hi Till, I use the standard one for Ubuntu 15.04, which is 1.5. That did not make any difference. Thanks and Cheers, Max On Thu, Jun 11, 2015 at 11:22 AM, Till Rohrmann wrote: > Hi Max, > > I just tested a build using Gradle (with your build.gradle file) and some > flink-ml algorithms. And it

How can I combine two datasets into one dataset and execute multiple conditions at the same time

2015-06-11 Thread hagersaleh
How can I combine two datasets into one dataset and execute multiple conditions at the same time? Example: SELECT employees.last_name FROM employees E, departments D WHERE (D.department_id = E.department_id AND D.location = 2400) AND (E.job_id = 'AC_ACCOUNT' OR E.salary > 6);

Re: Flink-ML as Dependency

2015-06-11 Thread Till Rohrmann
Hi Max, I just tested a build using Gradle (with your build.gradle file) and some flink-ml algorithms, and it completed without the problem of the unresolved breeze dependency. I use version 2.2.1 of Gradle. Which version are you using? Since you’re using Flink’s snapshots and have specified…

Re: Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-11 Thread Márton Balassi
As for 'locally', I meant the machine that you use for development, to see whether this works without parallelism. :-) No need to install stuff on your NameNode, of course. Installing Kafka on a machine and having the Kafka Java dependencies available for Flink are two very different things. Try adding

Re: Help with Flink experimental Table API

2015-06-11 Thread Aljoscha Krettek
Cool, good to hear. The PojoSerializer already handles null fields. The RowSerializer can be modified in pretty much the same way. So you should start by looking at the copy()/serialize()/deserialize() methods of PojoSerializer and then modify RowSerializer in a similar way. You can also send me
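[Editor's sketch] For orientation, the usual way a serializer supports nullable fields is to write a null marker before each field and otherwise delegate to the field's own serializer. The snippet below shows only that pattern in isolation; it is not the actual PojoSerializer or RowSerializer code, and the helper names are invented:

    import org.apache.flink.api.common.typeutils.TypeSerializer
    import org.apache.flink.core.memory.{DataInputView, DataOutputView}

    object NullFieldPattern {
      // write a boolean flag, then the field itself only if it is non-null
      def serializeField[T <: AnyRef](value: T, serializer: TypeSerializer[T], target: DataOutputView): Unit = {
        if (value == null) {
          target.writeBoolean(true)
        } else {
          target.writeBoolean(false)
          serializer.serialize(value, target)
        }
      }

      // read the flag first and only deserialize the field when it was present
      def deserializeField[T <: AnyRef](serializer: TypeSerializer[T], source: DataInputView): T = {
        if (source.readBoolean()) null.asInstanceOf[T]
        else serializer.deserialize(source)
      }
    }

A modified RowSerializer would apply this per field inside its copy()/serialize()/deserialize() methods, as Aljoscha suggests.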

Re: Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-11 Thread Hawin Jiang
Dear Marton, what do you mean by running it locally from Eclipse with 'Run'? Do you want me to run it on the NameNode? But my NameNode doesn't have Kafka installed; I only installed Kafka on my data node servers. Do I need to install or copy the Kafka jar onto the NameNode? Actually, I don't want to install everything on the NameNode…

Re: Flink-ML as Dependency

2015-06-11 Thread Maximilian Alber
Hi Till, Thanks for the quick help! Cheers, Max On Wed, Jun 10, 2015 at 5:50 PM, Till Rohrmann wrote: > Hi Max, > > I think the reason is that the flink-ml pom contains as a dependency an > artifact with artifactId breeze_${scala.binary.version}. The variable > scala.binary.version is defined

Re: Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-11 Thread Márton Balassi
Dear Hawin, No problem, I am glad that you are giving our Kafka connector a try. :) The dependencies listed look good. Can you run the example locally from Eclipse with 'Run'? I suspect that maybe your Flink cluster does not have access to the Kafka dependency then. As a quick test you could

RE: Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-11 Thread Hawin Jiang
Dear Marton, thanks for your support again. I am running these examples in the same project, and I am using the Eclipse IDE to submit them to my Flink cluster. Here are my dependencies: junit …

Re: Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-11 Thread Márton Balassi
Dear Hawin, This looks like a dependency issue: the Java compiler does not find the Kafka dependency. How are you trying to run this example? Is it from an IDE, or are you submitting it to a Flink cluster with bin/flink run? How do you define your dependencies; do you use Maven or sbt, for instance? Best,

Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-11 Thread Hawin Jiang
Hi all, I am preparing a Kafka and Flink performance test now. In order to avoid mistakes on my side, I have downloaded the Kafka example from http://kafka.apache.org/ and the Flink streaming Kafka example from http://flink.apache.org. I have run both producer examples on the same cluster. No issues from Ka…

Re: Help with Flink experimental Table API

2015-06-11 Thread Till Rohrmann
Hi Shiti, here is the issue [1]. Cheers, Till [1] https://issues.apache.org/jira/browse/FLINK-2203 On Thu, Jun 11, 2015 at 8:42 AM Shiti Saxena wrote: > Hi Aljoscha, > > Could you please point me to the JIRA tickets? If you could provide some > guidance on how to resolve these, I will work on