+1 for removing RelationalQuery

On Thu, Sep 11, 2014 at 3:04 PM, Aljoscha Krettek <[email protected]> wrote:
> By the way, what was called BatchGradientDescent in the Scala examples
> should be replaced by a port of the LinearRegression Example from
> Java. I had them as two separate examples earlier.
>
> What about RelationalQuery and TPC-H-Q3. Any thoughts about removing
> RelationalQuery?
>
> On Thu, Sep 11, 2014 at 11:43 AM, Aljoscha Krettek <[email protected]> wrote:
> > I added the Triangle Enumeration Examples, thanks Fabian.
> >
> > So far we have ported: WordCount, KMeans, ConnectedComponents,
> > WebLogAnalysis, TransitiveClosureNaive, TriangleEnumerationNaive/Opt
> >
> > These are the examples people called dibs on:
> > - PageRank (Fabian)
> > - BatchGradientDescent (Márton)
> > - ComputeEdgeDegrees (Hermann)
> >
> > Those are unclaimed (if I'm not mistaken):
> > - The relational Stuff
> > - LinearRegression
> >
> > On Wed, Sep 10, 2014 at 6:04 PM, Aljoscha Krettek <[email protected]> wrote:
> >> Thanks, I added it. I'll keep a running list of ported/unported
> >> examples in my mails. I'll rename the java example package to examples
> >> once the Scala API merge is done.
> >>
> >> I think the termination criterion is fine as it is. Just because Scala
> >> enables functional programming doesn't mean it's always the best
> >> choice.
> >> :D
> >>
> >> So far we have ported: WordCount, KMeans, ConnectedComponents,
> >> WebLogAnalysis, TransitiveClosureNaive
> >>
> >> These are the examples people called dibs on:
> >> - TriangleEnumeration and PageRank (Fabian)
> >> - BatchGradientDescent (Márton)
> >> - ComputeEdgeDegrees (Hermann)
> >>
> >> Those are unclaimed (if I'm not mistaken):
> >> - The relational Stuff
> >> - LinearRegression
> >>
> >> Cheers,
> >> Aljoscha
> >>
> >> On Wed, Sep 10, 2014 at 4:23 PM, Kostas Tzoumas <[email protected]> wrote:
> >>> Transitive closure here, I also added a termination criterion in the Java
> >>> version: https://github.com/ktzoumas/incubator-flink/tree/tc-scala-example
> >>>
> >>> Perhaps you can make the termination criterion in Scala more functional?
> >>>
> >>> I noticed that the examples package name is example.java but examples.scala
> >>>
> >>> Kostas
> >>>
> >>> On Tue, Sep 9, 2014 at 6:12 PM, Kostas Tzoumas <[email protected]> wrote:
> >>>>
> >>>> I'll take TransitiveClosure and PiEstimation (was not on your list).
> >>>>
> >>>> If nobody volunteers for the relational stuff I can take those as well.
> >>>>
> >>>> How about removing the "RelationalQuery" from both Scala and Java? It
> >>>> seems to be a proper subset of TPC-H Q3. Does it add some teaching value on
> >>>> top of TPC-H Q3?
> >>>>
> >>>> Kostas
> >>>>
> >>>> On Tue, Sep 9, 2014 at 5:57 PM, Aljoscha Krettek <[email protected]> wrote:
> >>>>>
> >>>>> Thanks, I added it, along with an ITCase.
> >>>>>
> >>>>> So far we have ported: WordCount, KMeans, ConnectedComponents,
> >>>>> WebLogAnalysis
> >>>>>
> >>>>> These are the examples people called dibs on:
> >>>>> - TriangleEnumeration and PageRank (Fabian)
> >>>>> - BatchGradientDescent (Márton)
> >>>>> - ComputeEdgeDegrees (Hermann)
> >>>>>
> >>>>> Those are unclaimed (if I'm not mistaken):
> >>>>> - TransitiveClosure
> >>>>> - The relational Stuff
> >>>>> - LinearRegression
> >>>>>
> >>>>> Cheers,
> >>>>> Aljoscha
> >>>>>
> >>>>> On Tue, Sep 9, 2014 at 5:21 PM, Kostas Tzoumas <[email protected]> wrote:
> >>>>> > WebLog here:
> >>>>> >
> >>>>> > https://github.com/ktzoumas/incubator-flink/tree/webloganalysis-example-scala
> >>>>> >
> >>>>> > Do you need any more done?
> >>>>> >
> >>>>> > On Tue, Sep 9, 2014 at 3:08 PM, Aljoscha Krettek <[email protected]> wrote:
> >>>>> >
> >>>>> >> I added the ConnectedComponents Example from Vasia.
> >>>>> >>
> >>>>> >> Keep 'em coming, people. :D
> >>>>> >>
> >>>>> >> On Mon, Sep 8, 2014 at 6:07 PM, Fabian Hueske <[email protected]> wrote:
> >>>>> >> > Alright, will do.
> >>>>> >> > Thanks!
> >>>>> >> >
> >>>>> >> > 2014-09-08 17:48 GMT+02:00 Aljoscha Krettek <[email protected]>:
> >>>>> >> >
> >>>>> >> >> Ok people, executive decision. :D
> >>>>> >> >>
> >>>>> >> >> Please look at KMeansData.java and KMeans.scala. I'm storing the data
> >>>>> >> >> in multi-dimensional object arrays and then converting it to the
> >>>>> >> >> required Java or Scala objects.
> >>>>> >> >>
> >>>>> >> >> Also, I changed isEqualTo to equalTo to make it consistent with the
> >>>>> >> >> Java API.
> >>>>> >> >>
> >>>>> >> >> Regarding Join (and coGroup).
> >>>>> >> >> There is no need for a keyword, you can just write:
> >>>>> >> >>
> >>>>> >> >> left.join(right).where(0).equalTo(1) { (le, re) => new MyResult(le, re) }
> >>>>> >> >>
> >>>>> >> >> On Mon, Sep 8, 2014 at 2:07 PM, Fabian Hueske <[email protected]> wrote:
> >>>>> >> >> > Aside from the DataSet issue, I also found an inconsistency with the
> >>>>> >> >> > Java API. In Java join is done as:
> >>>>> >> >> >
> >>>>> >> >> > ds1.join(ds2).where(...).equalTo(...)
> >>>>> >> >> >
> >>>>> >> >> > where in the current Scala this is:
> >>>>> >> >> >
> >>>>> >> >> > ds1.join(ds2).where(...).isEqualTo(...)
> >>>>> >> >> >
> >>>>> >> >> > isEqualTo() should be renamed to equalTo(), IMO.
> >>>>> >> >> > Also, join (+cross and coGroup?) lacks the with() method because "with"
> >>>>> >> >> > is a keyword in Scala. Should we offer something similar for Scala
> >>>>> >> >> > or go with map() on Tuple2(left, right)?
> >>>>> >> >> >
> >>>>> >> >> > 2014-09-08 13:51 GMT+02:00 Stephan Ewen <[email protected]>:
> >>>>> >> >> >
> >>>>> >> >> >> Instead of Strings, Object[][] would work as well. That is a generic
> >>>>> >> >> >> representation of a Tuple.
> >>>>> >> >> >>
> >>>>> >> >> >> Alternatively, they could be stored as Java or Scala Tuples, with a
> >>>>> >> >> >> generic utility method to convert between the two.
> >>>>> >> >> >>
> >>>>> >> >> >> On Mon, Sep 8, 2014 at 10:55 AM, Fabian Hueske <[email protected]> wrote:
> >>>>> >> >> >>
> >>>>> >> >> >> > Yeah, I ran into the same problem...
> >>>>> >> >> >> >
> >>>>> >> >> >> > +1 for using Strings and parsing them, but using the CSVFormat won't
> >>>>> >> >> >> > work because this is based on a FileInputFormat.
> >>>>> >> >> >> > So we would need to parse the Strings manually...
> >>>>> >> >> >> >
> >>>>> >> >> >> > 2014-09-08 10:35 GMT+02:00 Aljoscha Krettek <[email protected]>:
> >>>>> >> >> >> >
> >>>>> >> >> >> > > Hi,
> >>>>> >> >> >> > > on second thought. Maybe we should just change all the example input
> >>>>> >> >> >> > > data to strings and use CSV input formats in all the examples. What do
> >>>>> >> >> >> > > you think?
> >>>>> >> >> >> > >
> >>>>> >> >> >> > > Cheers,
> >>>>> >> >> >> > > Aljoscha
> >>>>> >> >> >> > >
> >>>>> >> >> >> > > On Mon, Sep 8, 2014 at 7:46 AM, Aljoscha Krettek <[email protected]> wrote:
> >>>>> >> >> >> > > > Hi,
> >>>>> >> >> >> > > > yes it's unfortunate that the data types are incompatible. I'm afraid
> >>>>> >> >> >> > > > you have to do what you proposed: move the data to a static field and
> >>>>> >> >> >> > > > convert it in the getDefaultEdgeDataSet() method in Scala. It's not
> >>>>> >> >> >> > > > nice, but copying would duplicate the data and make it easier for it
> >>>>> >> >> >> > > > to go out of sync in the Java and Scala versions.
> >>>>> >> >> >> > > >
> >>>>> >> >> >> > > > What do the others think? This will probably occur in all the examples.
> >>>>> >> >> >> > > >
> >>>>> >> >> >> > > > Cheers,
> >>>>> >> >> >> > > > Aljoscha
> >>>>> >> >> >> > > >
> >>>>> >> >> >> > > > On Sun, Sep 7, 2014 at 10:04 PM, Vasiliki Kalavri <[email protected]> wrote:
> >>>>> >> >> >> > > >> Hey,
> >>>>> >> >> >> > > >>
> >>>>> >> >> >> > > >> I have ported the Connected Components example, but I am not sure how to
> >>>>> >> >> >> > > >> reuse the example input data from java-examples.
> >>>>> >> >> >> > > >> In the ConnectedComponentsData class, the vertices and edges data are
> >>>>> >> >> >> > > >> produced by the methods getDefaultVertexDataSet()
> >>>>> >> >> >> > > >> and getDefaultEdgeDataSet(), which take
> >>>>> >> >> >> > > >> an org.apache.flink.api.java.ExecutionEnvironment as parameter.
> >>>>> >> >> >> > > >>
> >>>>> >> >> >> > > >> One way is to provide public static fields (like in the WordCountData
> >>>>> >> >> >> > > >> class), but this introduces a conversion
> >>>>> >> >> >> > > >> from org.apache.flink.api.java.tuple.Tuple2 to Scala tuple and from
> >>>>> >> >> >> > > >> java.lang.Long to scala.Long and I guess this is an unnecessary complexity
> >>>>> >> >> >> > > >> for an example (?).
> >>>>> >> >> >> > > >> Another way is, of course, to copy the example data in the Scala example.
> >>>>> >> >> >> > > >>
> >>>>> >> >> >> > > >> Am I missing something here?
> >>>>> >> >> >> > > >>
> >>>>> >> >> >> > > >> Thanks!
> >>>>> >> >> >> > > >>
> >>>>> >> >> >> > > >> Cheers,
> >>>>> >> >> >> > > >> V.
> >>>>> >> >> >> > > >>
> >>>>> >> >> >> > > >>
> >>>>> >> >> >> > > >> On 5 September 2014 15:52, Aljoscha Krettek <[email protected]> wrote:
> >>>>> >> >> >> > > >>
> >>>>> >> >> >> > > >>> Alright, I updated my repo:
> >>>>> >> >> >> > > >>> https://github.com/aljoscha/incubator-flink/commits/scala-rework
> >>>>> >> >> >> > > >>>
> >>>>> >> >> >> > > >>> This now has a working WordCount example. It's pretty much a copy of
> >>>>> >> >> >> > > >>> the Java example with some fixups for the syntax and lambda functions.
> >>>>> >> >> >> > > >>> You'll also notice that I added the java-examples as a dependency for
> >>>>> >> >> >> > > >>> the scala-examples. I did this to reuse the example input data.
> >>>>> >> >> >> > > >>>
> >>>>> >> >> >> > > >>> Once you have ported a program you can do a pull request against my repo
> >>>>> >> >> >> > > >>> and I will collect the examples.
> >>>>> >> >> >> > > >>>
> >>>>> >> >> >> > > >>> Happy coding. :D
> >>>>> >> >> >> > > >>>
> >>>>> >> >> >> > > >>> On Fri, Sep 5, 2014 at 12:19 PM, Hermann Gábor <[email protected]> wrote:
> >>>>> >> >> >> > > >>> > +1
> >>>>> >> >> >> > > >>> >
> >>>>> >> >> >> > > >>> > ComputeEdgeDegrees for me!
> >>>>> >> >> >> > > >>> >
> >>>>> >> >> >> > > >>> >
> >>>>> >> >> >> > > >>> > On Fri, Sep 5, 2014 at 11:44 AM, Márton Balassi <[email protected]> wrote:
> >>>>> >> >> >> > > >>> >
> >>>>> >> >> >> > > >>> >> +1
> >>>>> >> >> >> > > >>> >>
> >>>>> >> >> >> > > >>> >> BatchGradientDescent for me :)
> >>>>> >> >> >> > > >>> >>
> >>>>> >> >> >> > > >>> >>
> >>>>> >> >> >> > > >>> >> On Fri, Sep 5, 2014 at 11:15 AM, Kostas Tzoumas <[email protected]> wrote:
> >>>>> >> >> >> > > >>> >>
> >>>>> >> >> >> > > >>> >> > +1
> >>>>> >> >> >> > > >>> >> >
> >>>>> >> >> >> > > >>> >> > I go for WebLogAnalysis.
> >>>>> >> >> >> > > >>> >> >
> >>>>> >> >> >> > > >>> >> > My experience with Scala consists of going through a tutorial so this will
> >>>>> >> >> >> > > >>> >> > be a good stress test both for me and the new API :-)
> >>>>> >> >> >> > > >>> >> >
> >>>>> >> >> >> > > >>> >> >
> >>>>> >> >> >> > > >>> >> > On Thu, Sep 4, 2014 at 9:09 PM, Vasiliki Kalavri <[email protected]> wrote:
> >>>>> >> >> >> > > >>> >> >
> >>>>> >> >> >> > > >>> >> > > +1 for having other people implement the examples!
> >>>>> >> >> >> > > >>> >> > > Connected Components and Kmeans for me :)
> >>>>> >> >> >> > > >>> >> > >
> >>>>> >> >> >> > > >>> >> > > -V.
> >>>>> >> >> >> > > >>> >> > >
> >>>>> >> >> >> > > >>> >> > >
> >>>>> >> >> >> > > >>> >> > > On 4 September 2014 21:03, Fabian Hueske <[email protected]> wrote:
> >>>>> >> >> >> > > >>> >> > >
> >>>>> >> >> >> > > >>> >> > > > I go for TriangleEnumeration and PageRank.
> >>>>> >> >> >> > > >>> >> > > >
> >>>>> >> >> >> > > >>> >> > > > Let's also do the examples similar to the Java examples:
> >>>>> >> >> >> > > >>> >> > > > - running out-of-the-box without parameters
> >>>>> >> >> >> > > >>> >> > > > - parameters for external data
> >>>>> >> >> >> > > >>> >> > > > - follow a similar code structure
> >>>>> >> >> >> > > >>> >> > > >
> >>>>> >> >> >> > > >>> >> > > >
> >>>>> >> >> >> > > >>> >> > > >
> >>>>> >> >> >> > > >>> >> > > > 2014-09-04 20:56 GMT+02:00 Aljoscha Krettek <[email protected]>:
> >>>>> >> >> >> > > >>> >> > > >
> >>>>> >> >> >> > > >>> >> > > > > Will do, then people can reserve their favourite examples here.
> >>>>> >> >> >> > > >>> >> > > > >
> >>>>> >> >> >> > > >>> >> > > > > On Thu, Sep 4, 2014 at 8:55 PM, Fabian Hueske <[email protected]> wrote:
> >>>>> >> >> >> > > >>> >> > > > > > Hi,
> >>>>> >> >> >> > > >>> >> > > > > >
> >>>>> >> >> >> > > >>> >> > > > > > I think having examples implemented by different people proved to be
> >>>>> >> >> >> > > >>> >> > > > > > valuable in the past.
> >>>>> >> >> >> > > >>> >> > > > > > I'd help with two or three examples.
> >>>>> >> >> >> > > >>> >> > > > > >
> >>>>> >> >> >> > > >>> >> > > > > > It might be helpful if you'd port a simple first one such as WordCount.
> >>>>> >> >> >> > > >>> >> > > > > >
> >>>>> >> >> >> > > >>> >> > > > > > Fabian
> >>>>> >> >> >> > > >>> >> > > > > >
> >>>>> >> >> >> > > >>> >> > > > > >
> >>>>> >> >> >> > > >>> >> > > > > > 2014-09-04 18:47 GMT+02:00 Aljoscha Krettek <[email protected]>:
> >>>>> >> >> >> > > >>> >> > > > > >
> >>>>> >> >> >> > > >>> >> > > > > >> Hi,
> >>>>> >> >> >> > > >>> >> > > > > >> I have a working rewrite of the Scala API here:
> >>>>> >> >> >> > > >>> >> > > > > >> https://github.com/aljoscha/incubator-flink/commits/scala-rework
> >>>>> >> >> >> > > >>> >> > > > > >>
> >>>>> >> >> >> > > >>> >> > > > > >> I'm hoping that I'll only have to write the tests and port the
> >>>>> >> >> >> > > >>> >> > > > > >> examples. Do you think it makes sense to let other people port the
> >>>>> >> >> >> > > >>> >> > > > > >> examples, so that someone else uses it and maybe notices some quirks
> >>>>> >> >> >> > > >>> >> > > > > >> in the API?
> >>>>> >> >> >> > > >>> >> > > > > >>
> >>>>> >> >> >> > > >>> >> > > > > >> Cheers,
> >>>>> >> >> >> > > >>> >> > > > > >> Aljoscha
> >>>>> >> >> >> > > >>> >> > > > > >>
