Also, you can use case classes directly as the type for CSV input. Instead of reading the data as tuples and then using a mapper to convert them to your case classes, you can write:
env.readCsv[Edge](...)

On Fri, Sep 12, 2014 at 11:43 AM, Aljoscha Krettek <[email protected]> wrote:
> I added support for specifying keys by name for CaseClasses. Check out
> the PageRank and TriangleEnumeration examples to see it in action.
>
> @Kostas: I think you could use them for the TPC-H examples.
>
> On Fri, Sep 12, 2014 at 7:23 AM, Aljoscha Krettek <[email protected]> wrote:
>> Yes, that would allow list comprehensions. It would be possible to
>> have the Collection signature for join (and coGroup), i.e.:
>>
>> apply[R]((T, O) => TraversableOnce[R]): DataSet[R]
>>
>> (T and O are the left and right input types, R is the result type)
>>
>> Then you can return collections and still return an option, as in:
>>
>> a.join(b).where(0).equalTo(0) { (l, r) => if (r > ...) Some(l) else None }
>>
>> This works because there is an implicit conversion from Options to a Collection.
>> This will always wrap the return value in a List with only one value.
>> I'm not sure we want the overhead here. I'm also not sure whether we
>> want the overhead of always having to use an Option even though the
>> join always returns a value.
>>
>> What do you think?
>>
>> On Thu, Sep 11, 2014 at 11:22 PM, Fabian Hueske <[email protected]> wrote:
>>> Hmmm, tricky question...
>>> How about the Option for Join as this is a tuple-wise operation and the
>>> Collection for Cogroup which is group-wise?
>>> Could we in that case use list comprehensions in Cogroup functions?
>>>
>>> Or is that too much mixing?
>>>
>>> 2014-09-11 23:00 GMT+02:00 Aljoscha Krettek <[email protected]>:
>>>
>>>> I didn't look at the example either.
>>>>
>>>> Adding collections is easy, it's just that we can either have
>>>> Collections or the Option, not both.
>>>>
>>>> For the coding style I followed this:
>>>> https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide,
>>>> which itself is based on this: http://docs.scala-lang.org/style/. It
>>>> is different from the Java Code Guidelines we have in place, yes.
>>>>
>>>> On Thu, Sep 11, 2014 at 10:10 PM, Fabian Hueske <[email protected]> wrote:
>>>> > I haven't looked at the LineRank example in detail, but if you think that
>>>> > it adds something new to the examples collection, we can certainly port it
>>>> > also to Java.
>>>> > I think the Option and Collector return types are sufficient right now but
>>>> > if Collections are easy to add, go for it. ;-)
>>>> >
>>>> > Great that the Scala primitives are working! Also thanks for adding
>>>> > genSequence and adapting my examples.
>>>> > Btw. does the codestyle not apply for Scala files or do we have a different
>>>> > one there?
>>>> >
>>>> > 2014-09-11 17:55 GMT+02:00 Aljoscha Krettek <[email protected]>:
>>>> >
>>>> >> What about the LineRank example? We had that in Scala but never had a
>>>> >> Java Example.
>>>> >>
>>>> >> On Thu, Sep 11, 2014 at 5:51 PM, Aljoscha Krettek <[email protected]> wrote:
>>>> >> > Yes, I like that. For the ITCases I always just copied the Java ITCase.
>>>> >> >
>>>> >> > The only examples that are missing now are LinearRegression and the
>>>> >> > relational stuff.
>>>> >> >
>>>> >> > On Thu, Sep 11, 2014 at 5:48 PM, Fabian Hueske <[email protected]> wrote:
>>>> >> >> I just removed the old CountEdgeDegrees example.
>>>> >> >> That was a preprocessing step for the TriangleEnumeration, and is now part
>>>> >> >> of the new TriangleEnumerationOpt example.
>>>> >> >> So I guess, we don't need to port that one. As I said before, I'd prefer to
>>>> >> >> keep Java and Scala examples in sync.
>>>> >> >>
>>>> >> >> Cheers, Fabian
>>>> >> >>
>>>> >> >> 2014-09-11 17:40 GMT+02:00 Aljoscha Krettek <[email protected]>:
>>>> >> >>
>>>> >> >>> I added the PageRank example, thanks again Fabian. :D
>>>> >> >>>
>>>> >> >>> Regarding the other stuff:
>>>> >> >>> - There is a comment in DataSet.scala about including
>>>> >> >>> org.apache.flink.api.scala._ because of the TypeInformation.
>>>> >> >>> - I added generateSequence to ExecutionEnvironment.
>>>> >> >>> - It is possible to use Scala Primitives in Array, I noticed it while
>>>> >> >>> writing the tests, you probably had an older version of the code.
>>>> >> >>> - Yes, using List and other Interfaces is not possible, this is also
>>>> >> >>> a restriction in the Java API.
>>>> >> >>>
>>>> >> >>> What do you think about the interface of join and coGroup? Right now,
>>>> >> >>> you can either use a lambda that returns an Option or the lambda with
>>>> >> >>> the Collector. Originally I wanted to also have a lambda that
>>>> >> >>> returns a Collection, but due to type erasure this has the same type
>>>> >> >>> as the lambda with the Option so I couldn't use it. There is an
>>>> >> >>> implicit conversion from Option to a Collection, so I could change it
>>>> >> >>> without breaking the examples we have now. What do you think?
>>>> >> >>>
>>>> >> >>> So far we have ported: WordCount, KMeans, ConnectedComponents,
>>>> >> >>> WebLogAnalysis, TransitiveClosureNaive, TriangleEnumerationNaive/Opt,
>>>> >> >>> PageRank
>>>> >> >>>
>>>> >> >>> These are the examples people called dibs on:
>>>> >> >>> - BatchGradientDescent (Márton) (Should be a port of LinearRegression
>>>> >> >>> Example from Java)
>>>> >> >>> - ComputeEdgeDegrees (Hermann)
>>>> >> >>>
>>>> >> >>> Those are unclaimed (if I'm not mistaken):
>>>> >> >>> - The relational Stuff
>>>> >> >>>
>>>> >> >>> On Thu, Sep 11, 2014 at 3:06 PM, Stephan Ewen <[email protected]> wrote:
>>>> >> >>> > +1 for removing RelationalQuery
>>>> >> >>> >
>>>> >> >>> > On Thu, Sep 11, 2014 at 3:04 PM, Aljoscha Krettek <[email protected]> wrote:
>>>> >> >>> >
>>>> >> >>> >> By the way, what was called BatchGradientDescent in the Scala examples
>>>> >> >>> >> should be replaced by a port of the LinearRegression Example from
>>>> >> >>> >> Java.
>>>> >> >>> >> I had them as two separate examples earlier.
>>>> >> >>> >>
>>>> >> >>> >> What about RelationalQuery and TPC-H-Q3? Any thoughts about removing
>>>> >> >>> >> RelationalQuery?
>>>> >> >>> >>
>>>> >> >>> >> On Thu, Sep 11, 2014 at 11:43 AM, Aljoscha Krettek <[email protected]> wrote:
>>>> >> >>> >> > I added the Triangle Enumeration Examples, thanks Fabian.
>>>> >> >>> >> >
>>>> >> >>> >> > So far we have ported: WordCount, KMeans, ConnectedComponents,
>>>> >> >>> >> > WebLogAnalysis, TransitiveClosureNaive, TriangleEnumerationNaive/Opt
>>>> >> >>> >> >
>>>> >> >>> >> > These are the examples people called dibs on:
>>>> >> >>> >> > - PageRank (Fabian)
>>>> >> >>> >> > - BatchGradientDescent (Márton)
>>>> >> >>> >> > - ComputeEdgeDegrees (Hermann)
>>>> >> >>> >> >
>>>> >> >>> >> > Those are unclaimed (if I'm not mistaken):
>>>> >> >>> >> > - The relational Stuff
>>>> >> >>> >> > - LinearRegression
>>>> >> >>> >> >
>>>> >> >>> >> > On Wed, Sep 10, 2014 at 6:04 PM, Aljoscha Krettek <[email protected]> wrote:
>>>> >> >>> >> >> Thanks, I added it. I'll keep a running list of ported/unported
>>>> >> >>> >> >> examples in my mails. I'll rename the java example package to examples
>>>> >> >>> >> >> once the Scala API merge is done.
>>>> >> >>> >> >>
>>>> >> >>> >> >> I think the termination criterion is fine as it is. Just because Scala
>>>> >> >>> >> >> enables functional programming doesn't mean it's always the best
>>>> >> >>> >> >> choice.
>>>> >> >>> >> >> :D
>>>> >> >>> >> >>
>>>> >> >>> >> >> So far we have ported: WordCount, KMeans, ConnectedComponents,
>>>> >> >>> >> >> WebLogAnalysis, TransitiveClosureNaive
>>>> >> >>> >> >>
>>>> >> >>> >> >> These are the examples people called dibs on:
>>>> >> >>> >> >> - TriangleEnumeration and PageRank (Fabian)
>>>> >> >>> >> >> - BatchGradientDescent (Márton)
>>>> >> >>> >> >> - ComputeEdgeDegrees (Hermann)
>>>> >> >>> >> >>
>>>> >> >>> >> >> Those are unclaimed (if I'm not mistaken):
>>>> >> >>> >> >> - The relational Stuff
>>>> >> >>> >> >> - LinearRegression
>>>> >> >>> >> >>
>>>> >> >>> >> >> Cheers,
>>>> >> >>> >> >> Aljoscha
>>>> >> >>> >> >>
>>>> >> >>> >> >> On Wed, Sep 10, 2014 at 4:23 PM, Kostas Tzoumas <[email protected]> wrote:
>>>> >> >>> >> >>> Transitive closure here, I also added a termination criterion in the Java
>>>> >> >>> >> >>> version:
>>>> >> >>> >> >>> https://github.com/ktzoumas/incubator-flink/tree/tc-scala-example
>>>> >> >>> >> >>>
>>>> >> >>> >> >>> Perhaps you can make the termination criterion in Scala more functional?
>>>> >> >>> >> >>>
>>>> >> >>> >> >>> I noticed that the examples package name is example.java but examples.scala
>>>> >> >>> >> >>>
>>>> >> >>> >> >>> Kostas
>>>> >> >>> >> >>>
>>>> >> >>> >> >>> On Tue, Sep 9, 2014 at 6:12 PM, Kostas Tzoumas <[email protected]> wrote:
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> I'll take TransitiveClosure and PiEstimation (was not on your list).
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> If nobody volunteers for the relational stuff I can take those as well.
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> How about removing the "RelationalQuery" from both Scala and Java? It
>>>> >> >>> >> >>>> seems to be a proper subset of TPC-H Q3.
>>>> >> >>> >> >>>> Does it add some teaching value on
>>>> >> >>> >> >>>> top of TPC-H Q3?
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Kostas
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> On Tue, Sep 9, 2014 at 5:57 PM, Aljoscha Krettek <[email protected]> wrote:
>>>> >> >>> >> >>>>>
>>>> >> >>> >> >>>>> Thanks, I added it, along with an ITCase.
>>>> >> >>> >> >>>>>
>>>> >> >>> >> >>>>> So far we have ported: WordCount, KMeans, ConnectedComponents,
>>>> >> >>> >> >>>>> WebLogAnalysis
>>>> >> >>> >> >>>>>
>>>> >> >>> >> >>>>> These are the examples people called dibs on:
>>>> >> >>> >> >>>>> - TriangleEnumeration and PageRank (Fabian)
>>>> >> >>> >> >>>>> - BatchGradientDescent (Márton)
>>>> >> >>> >> >>>>> - ComputeEdgeDegrees (Hermann)
>>>> >> >>> >> >>>>>
>>>> >> >>> >> >>>>> Those are unclaimed (if I'm not mistaken):
>>>> >> >>> >> >>>>> - TransitiveClosure
>>>> >> >>> >> >>>>> - The relational Stuff
>>>> >> >>> >> >>>>> - LinearRegression
>>>> >> >>> >> >>>>>
>>>> >> >>> >> >>>>> Cheers,
>>>> >> >>> >> >>>>> Aljoscha
>>>> >> >>> >> >>>>>
>>>> >> >>> >> >>>>> On Tue, Sep 9, 2014 at 5:21 PM, Kostas Tzoumas <[email protected]> wrote:
>>>> >> >>> >> >>>>> > WebLog here:
>>>> >> >>> >> >>>>> >
>>>> >> >>> >> >>>>> > https://github.com/ktzoumas/incubator-flink/tree/webloganalysis-example-scala
>>>> >> >>> >> >>>>> >
>>>> >> >>> >> >>>>> > Do you need any more done?
>>>> >> >>> >> >>>>> >
>>>> >> >>> >> >>>>> > On Tue, Sep 9, 2014 at 3:08 PM, Aljoscha Krettek <[email protected]> wrote:
>>>> >> >>> >> >>>>> >
>>>> >> >>> >> >>>>> >> I added the ConnectedComponents Example from Vasia.
>>>> >> >>> >> >>>>> >>
>>>> >> >>> >> >>>>> >> Keep 'em coming, people.
>>>> >> >>> >> >>>>> >> :D
>>>> >> >>> >> >>>>> >>
>>>> >> >>> >> >>>>> >> On Mon, Sep 8, 2014 at 6:07 PM, Fabian Hueske <[email protected]> wrote:
>>>> >> >>> >> >>>>> >> > Alright, will do.
>>>> >> >>> >> >>>>> >> > Thanks!
>>>> >> >>> >> >>>>> >> >
>>>> >> >>> >> >>>>> >> > 2014-09-08 17:48 GMT+02:00 Aljoscha Krettek <[email protected]>:
>>>> >> >>> >> >>>>> >> >
>>>> >> >>> >> >>>>> >> >> Ok people, executive decision. :D
>>>> >> >>> >> >>>>> >> >>
>>>> >> >>> >> >>>>> >> >> Please look at KMeansData.java and KMeans.scala. I'm storing the data
>>>> >> >>> >> >>>>> >> >> in multi-dimensional object arrays and then converting it to the
>>>> >> >>> >> >>>>> >> >> required Java or Scala objects.
>>>> >> >>> >> >>>>> >> >>
>>>> >> >>> >> >>>>> >> >> Also, I changed isEqualTo to equalTo to make it consistent with the Java
>>>> >> >>> >> >>>>> >> >> API.
>>>> >> >>> >> >>>>> >> >>
>>>> >> >>> >> >>>>> >> >> Regarding Join (and coGroup). There is no need for a keyword, you can
>>>> >> >>> >> >>>>> >> >> just write:
>>>> >> >>> >> >>>>> >> >>
>>>> >> >>> >> >>>>> >> >> left.join(right).where(0).equalTo(1) { (le, re) => new MyResult(le, re) }
>>>> >> >>> >> >>>>> >> >>
>>>> >> >>> >> >>>>> >> >> On Mon, Sep 8, 2014 at 2:07 PM, Fabian Hueske <[email protected]> wrote:
>>>> >> >>> >> >>>>> >> >> > Aside from the DataSet issue, I also found an inconsistency with the Java
>>>> >> >>> >> >>>>> >> >> > API. In Java join is done as:
>>>> >> >>> >> >>>>> >> >> >
>>>> >> >>> >> >>>>> >> >> > ds1.join(ds2).where(...).equalTo(...)
>>>> >> >>> >> >>>>> >> >> >
>>>> >> >>> >> >>>>> >> >> > where in the current Scala this is:
>>>> >> >>> >> >>>>> >> >> >
>>>> >> >>> >> >>>>> >> >> > ds1.join(ds2).where(...).isEqualTo(...)
>>>> >> >>> >> >>>>> >> >> >
>>>> >> >>> >> >>>>> >> >> > isEqualTo() should be renamed to equalTo(), IMO.
>>>> >> >>> >> >>>>> >> >> > Also, join (+cross and coGroup?) lacks the with() method because "with" is
>>>> >> >>> >> >>>>> >> >> > a keyword in Scala. Should we offer something similar for Scala or go with
>>>> >> >>> >> >>>>> >> >> > map() on Tuple2(left, right)?
>>>> >> >>> >> >>>>> >> >> >
>>>> >> >>> >> >>>>> >> >> > 2014-09-08 13:51 GMT+02:00 Stephan Ewen <[email protected]>:
>>>> >> >>> >> >>>>> >> >> >
>>>> >> >>> >> >>>>> >> >> >> Instead of Strings, Object[][] would work as well. That is a generic
>>>> >> >>> >> >>>>> >> >> >> representation of a Tuple.
>>>> >> >>> >> >>>>> >> >> >>
>>>> >> >>> >> >>>>> >> >> >> Alternatively, they could be stored as Java or Scala Tuples, with a generic
>>>> >> >>> >> >>>>> >> >> >> utility method to convert between the two.
>>>> >> >>> >> >>>>> >> >> >>
>>>> >> >>> >> >>>>> >> >> >> On Mon, Sep 8, 2014 at 10:55 AM, Fabian Hueske <[email protected]> wrote:
>>>> >> >>> >> >>>>> >> >> >>
>>>> >> >>> >> >>>>> >> >> >> > Yeah, I ran into the same problem...
>>>> >> >>> >> >>>>> >> >> >> >
>>>> >> >>> >> >>>>> >> >> >> > +1 for using Strings and parsing them, but using the CSVFormat won't work
>>>> >> >>> >> >>>>> >> >> >> > because this is based on a FileInputFormat.
>>>> >> >>> >> >>>>> >> >> >> > So we would need to parse the Strings manually...
>>>> >> >>> >> >>>>> >> >> >> >
>>>> >> >>> >> >>>>> >> >> >> > 2014-09-08 10:35 GMT+02:00 Aljoscha Krettek <[email protected]>:
>>>> >> >>> >> >>>>> >> >> >> >
>>>> >> >>> >> >>>>> >> >> >> > > Hi,
>>>> >> >>> >> >>>>> >> >> >> > > on second thought. Maybe we should just change all the example input
>>>> >> >>> >> >>>>> >> >> >> > > data to strings and use CSV input formats in all the examples. What do
>>>> >> >>> >> >>>>> >> >> >> > > you think?
>>>> >> >>> >> >>>>> >> >> >> > >
>>>> >> >>> >> >>>>> >> >> >> > > Cheers,
>>>> >> >>> >> >>>>> >> >> >> > > Aljoscha
>>>> >> >>> >> >>>>> >> >> >> > >
>>>> >> >>> >> >>>>> >> >> >> > > On Mon, Sep 8, 2014 at 7:46 AM, Aljoscha Krettek <[email protected]> wrote:
>>>> >> >>> >> >>>>> >> >> >> > > > Hi,
>>>> >> >>> >> >>>>> >> >> >> > > > yes it's unfortunate that the data types are incompatible. I'm afraid
>>>> >> >>> >> >>>>> >> >> >> > > > you have to do what you proposed: move the data to a static field and
>>>> >> >>> >> >>>>> >> >> >> > > > convert it in the getDefaultEdgeDataSet() method in Scala.
>>>> >> >>> >> >>>>> >> >> >> > > > It's not nice, but copying would duplicate the data and make it easier
>>>> >> >>> >> >>>>> >> >> >> > > > for it to go out of sync in the Java and Scala versions.
>>>> >> >>> >> >>>>> >> >> >> > > >
>>>> >> >>> >> >>>>> >> >> >> > > > What do the others think? This will probably occur in all the examples.
>>>> >> >>> >> >>>>> >> >> >> > > >
>>>> >> >>> >> >>>>> >> >> >> > > > Cheers,
>>>> >> >>> >> >>>>> >> >> >> > > > Aljoscha
>>>> >> >>> >> >>>>> >> >> >> > > >
>>>> >> >>> >> >>>>> >> >> >> > > > On Sun, Sep 7, 2014 at 10:04 PM, Vasiliki Kalavri <[email protected]> wrote:
>>>> >> >>> >> >>>>> >> >> >> > > >> Hey,
>>>> >> >>> >> >>>>> >> >> >> > > >>
>>>> >> >>> >> >>>>> >> >> >> > > >> I have ported the Connected Components example, but I am not sure how to
>>>> >> >>> >> >>>>> >> >> >> > > >> reuse the example input data from java-examples.
>>>> >> >>> >> >>>>> >> >> >> > > >> In the ConnectedComponentsData class, the vertices and edges data are
>>>> >> >>> >> >>>>> >> >> >> > > >> produced by the methods getDefaultVertexDataSet()
>>>> >> >>> >> >>>>> >> >> >> > > >> and getDefaultEdgeDataSet(), which take
>>>> >> >>> >> >>>>> >> >> >> > > >> an org.apache.flink.api.java.ExecutionEnvironment as parameter.
>>>> >> >>> >> >>>>> >> >> >> > > >>
>>>> >> >>> >> >>>>> >> >> >> > > >> One way is to provide public static fields (like in the WordCountData
>>>> >> >>> >> >>>>> >> >> >> > > >> class), but this introduces a conversion
>>>> >> >>> >> >>>>> >> >> >> > > >> from org.apache.flink.api.java.tuple.Tuple2 to Scala tuple and from
>>>> >> >>> >> >>>>> >> >> >> > > >> java.lang.Long to scala.Long and I guess this is an unnecessary complexity
>>>> >> >>> >> >>>>> >> >> >> > > >> for an example (?).
>>>> >> >>> >> >>>>> >> >> >> > > >> Another way is, of course, to copy the example data in the Scala example.
>>>> >> >>> >> >>>>> >> >> >> > > >>
>>>> >> >>> >> >>>>> >> >> >> > > >> Am I missing something here?
>>>> >> >>> >> >>>>> >> >> >> > > >>
>>>> >> >>> >> >>>>> >> >> >> > > >> Thanks!
>>>> >> >>> >> >>>>> >> >> >> > > >>
>>>> >> >>> >> >>>>> >> >> >> > > >> Cheers,
>>>> >> >>> >> >>>>> >> >> >> > > >> V.
>>>> >> >>> >> >>>>> >> >> >> > > >>
>>>> >> >>> >> >>>>> >> >> >> > > >>
>>>> >> >>> >> >>>>> >> >> >> > > >> On 5 September 2014 15:52, Aljoscha Krettek <[email protected]> wrote:
>>>> >> >>> >> >>>>> >> >> >> > > >>
>>>> >> >>> >> >>>>> >> >> >> > > >>> Alright, I updated my repo:
>>>> >> >>> >> >>>>> >> >> >> > > >>> https://github.com/aljoscha/incubator-flink/commits/scala-rework
>>>> >> >>> >> >>>>> >> >> >> > > >>>
>>>> >> >>> >> >>>>> >> >> >> > > >>> This now has a working WordCount example.
>>>> >> >>> >> >>>>> >> >> >> > > >>> It's pretty much a copy of
>>>> >> >>> >> >>>>> >> >> >> > > >>> the Java example with some fixups for the syntax and lambda functions.
>>>> >> >>> >> >>>>> >> >> >> > > >>> You'll also notice that I added the java-examples as a dependency for
>>>> >> >>> >> >>>>> >> >> >> > > >>> the scala-examples. I did this to reuse the example input data.
>>>> >> >>> >> >>>>> >> >> >> > > >>>
>>>> >> >>> >> >>>>> >> >> >> > > >>> When you have ported a program you can do a pull request against my repo
>>>> >> >>> >> >>>>> >> >> >> > > >>> and I will collect the examples.
>>>> >> >>> >> >>>>> >> >> >> > > >>>
>>>> >> >>> >> >>>>> >> >> >> > > >>> Happy coding. :D
>>>> >> >>> >> >>>>> >> >> >> > > >>>
>>>> >> >>> >> >>>>> >> >> >> > > >>> On Fri, Sep 5, 2014 at 12:19 PM, Hermann Gábor <[email protected]> wrote:
>>>> >> >>> >> >>>>> >> >> >> > > >>> > +1
>>>> >> >>> >> >>>>> >> >> >> > > >>> >
>>>> >> >>> >> >>>>> >> >> >> > > >>> > ComputeEdgeDegrees for me!
>>>> >> >>> >> >>>>> >> >> >> > > >>> >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >
>>>> >> >>> >> >>>>> >> >> >> > > >>> > On Fri, Sep 5, 2014 at 11:44 AM, Márton Balassi <[email protected]> wrote:
>>>> >> >>> >> >>>>> >> >> >> > > >>> >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> +1
>>>> >> >>> >> >>>>> >> >> >> > > >>> >>
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> BatchGradientDescent for me :)
>>>> >> >>> >> >>>>> >> >> >> > > >>> >>
>>>> >> >>> >> >>>>> >> >> >> > > >>> >>
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> On Fri, Sep 5, 2014 at 11:15 AM, Kostas Tzoumas <[email protected]> wrote:
>>>> >> >>> >> >>>>> >> >> >> > > >>> >>
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > +1
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > I go for WebLogAnalysis.
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > My experience with Scala consists of going through a tutorial so this will
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > be a good stress test both for me and the new API :-)
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > On Thu, Sep 4, 2014 at 9:09 PM, Vasiliki Kalavri <[email protected]> wrote:
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > +1 for having other people implement the
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > examples!
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > Connected Components and Kmeans for me :)
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > -V.
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > On 4 September 2014 21:03, Fabian Hueske <[email protected]> wrote:
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > I go for TriangleEnumeration and PageRank.
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > Let's also do the examples similar to the Java examples:
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > - running out-of-the-box without parameters
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > - parameters for external data
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > - follow a similar code structure
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > 2014-09-04 20:56 GMT+02:00 Aljoscha Krettek <[email protected]>:
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > Will do, then people can reserve their favourite examples here.
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > On Thu, Sep 4, 2014 at 8:55 PM, Fabian Hueske <[email protected]> wrote:
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > > Hi,
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > > I think having examples implemented by different people proved to be
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > > valuable in the past.
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > > I'd help with two or three examples.
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > > It might be helpful if you'd port a simple first one such as
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > > WordCount.
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > > Fabian
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > > 2014-09-04 18:47 GMT+02:00 Aljoscha Krettek <[email protected]>:
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >> Hi,
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >> I have a working rewrite of the Scala API here:
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >> https://github.com/aljoscha/incubator-flink/commits/scala-rework
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >>
>>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >> I'm hoping that I'll only have to write the tests and port the
Do you think it >>>> makes >>>> >> >>> sense >>>> >> >>> >> to >>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >> let >>>> >> >>> >> >>>>> >> other >>>> >> >>> >> >>>>> >> >> >> > people >>>> >> >>> >> >>>>> >> >> >> > > >>> port >>>> >> >>> >> >>>>> >> >> >> > > >>> >> the >>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >> examples, so that someone >>>> else >>>> >> uses >>>> >> >>> >> it and >>>> >> >>> >> >>>>> >> maybe >>>> >> >>> >> >>>>> >> >> >> > notices >>>> >> >>> >> >>>>> >> >> >> > > some >>>> >> >>> >> >>>>> >> >> >> > > >>> >> > quirks >>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >> in the API? >>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >> >>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >> Cheers, >>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >> Aljoscha >>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >> >>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > > >>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > > >>>> >> >>> >> >>>>> >> >> >> > > >>> >> > > >>>> >> >>> >> >>>>> >> >> >> > > >>> >> > >>>> >> >>> >> >>>>> >> >> >> > > >>> >> >>>> >> >>> >> >>>>> >> >> >> > > >>> >>>> >> >>> >> >>>>> >> >> >> > > >>>> >> >>> >> >>>>> >> >> >> > >>>> >> >>> >> >>>>> >> >> >> >>>> >> >>> >> >>>>> >> >> >>>> >> >>> >> >>>>> >> >>>> >> >>> >> >>>> >>>> >> >>> >> >>>> >>>> >> >>> >> >>> >>>> >> >>> >> >>>> >> >>> >>>> >> >>>>
