Alright, will do.
Thanks!
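
For anyone following along, here is a rough sketch of the approach described
in the mail below: keep the example data as untyped Object[][] rows and
convert it to typed objects where it is used. The class name, the Point type
and the sample values are made up for illustration and are not taken from
KMeansData.java. A Scala example would do the same conversion into Scala
tuples or case classes.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names and values are assumptions, not the real KMeansData.
class ObjectArrayDataSketch {

    // Each row is one record: {id, x, y}. Storing rows as Object[] keeps the
    // data independent of Java or Scala tuple types.
    public static final Object[][] POINTS = {
        {1L, -14.22, -48.01},
        {2L, -22.78, 37.10},
        {3L, 56.18, -42.99}
    };

    // Hypothetical typed representation used by the Java side.
    static final class Point {
        public final long id;
        public final double x;
        public final double y;

        Point(long id, double x, double y) {
            this.id = id;
            this.x = x;
            this.y = y;
        }
    }

    // Convert the generic rows into typed objects by casting each field.
    public static List<Point> toPoints(Object[][] rows) {
        List<Point> result = new ArrayList<>(rows.length);
        for (Object[] row : rows) {
            result.add(new Point((Long) row[0], (Double) row[1], (Double) row[2]));
        }
        return result;
    }

    public static void main(String[] args) {
        for (Point p : toPoints(POINTS)) {
            System.out.println(p.id + ": (" + p.x + ", " + p.y + ")");
        }
    }
}
```

The casts fail at runtime if a row has the wrong shape, which seems like an
acceptable trade-off for a small, fixed set of example data.
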
2014-09-08 17:48 GMT+02:00 Aljoscha Krettek <[email protected]>:
> Ok people, executive decision. :D
>
> Please look at KMeansData.java and KMeans.scala. I'm storing the data
> in multi-dimensional object arrays and then converting it to the
> required Java or Scala objects.
>
> Also, I changed isEqualTo to equalTo to make it consistent with the Java
> API.
>
> Regarding Join (and coGroup). There is no need for a keyword, you can
> just write:
>
> left.join(right).where(0).equalTo(1) { (le, re) => new MyResult(le, re) }
>
> On Mon, Sep 8, 2014 at 2:07 PM, Fabian Hueske <[email protected]> wrote:
> > Aside from the DataSet issue, I also found an inconsistency with the Java
> > API. In Java join is done as:
> >
> > ds1.join(ds2).where(...).equalTo(...)
> >
> > where in the current Scala this is:
> >
> > ds1.join(ds2).where(...).isEqualTo(...)
> >
> > isEqualTo() should be renamed to equalTo(), IMO.
> > Also, join (+cross and coGroup?) lacks the with() method because "with" is
> > a keyword in Scala. Should we offer something similar for Scala or go with
> > map() on Tuple2(left, right)?
> >
> > 2014-09-08 13:51 GMT+02:00 Stephan Ewen <[email protected]>:
> >
> >> Instead of Strings, Object[][] would work as well. That is a generic
> >> representation of a Tuple.
> >>
> >> Alternatively, they could be stored as Java or Scala Tuples, with a
> generic
> >> utility method to convert between the two.
> >>
> >> On Mon, Sep 8, 2014 at 10:55 AM, Fabian Hueske <[email protected]>
> wrote:
> >>
> >> > Yeah, I ran into the same problem...
> >> >
> >> > +1 for using Strings and parsing them, but using the CSVFormat won't
> >> work
> >> > because this is based on a FileInputFormat.
> >> > So we would need to parse the Strings manually...
> >> >
> >> > 2014-09-08 10:35 GMT+02:00 Aljoscha Krettek <[email protected]>:
> >> >
> >> > > Hi,
> >> > > On second thought, maybe we should just change all the example input
> >> > > data to strings and use CSV input formats in all the examples. What do
> >> > > you think?
> >> > >
> >> > > Cheers,
> >> > > Aljoscha
> >> > >
> >> > > On Mon, Sep 8, 2014 at 7:46 AM, Aljoscha Krettek <
> [email protected]>
> >> > > wrote:
> >> > > > Hi,
> >> > > > yes it's unfortunate that the data types are incompatible. I'm afraid
> >> > > > you have to do what you proposed: move the data to a static field and
> >> > > > convert it in the getDefaultEdgeDataSet() method in Scala. It's not
> >> > > > nice, but copying would duplicate the data and make it easier for it
> >> > > > to go out of sync in the Java and Scala versions.
> >> > > >
> >> > > > What do the others think? This will probably occur in all the
> >> examples.
> >> > > >
> >> > > > Cheers,
> >> > > > Aljoscha
> >> > > >
> >> > > > On Sun, Sep 7, 2014 at 10:04 PM, Vasiliki Kalavri
> >> > > > <[email protected]> wrote:
> >> > > >> Hey,
> >> > > >>
> >> > > >> I have ported the Connected Components example, but I am not sure
> >> how
> >> > to
> >> > > >> reuse the example input data from java-examples.
> >> > > >> In the ConnectedComponentsData class, the vertices and edges data
> >> are
> >> > > >> produced by the methods getDefaultVertexDataSet()
> >> > > >> and getDefaultEdgeDataSet(), which take
> >> > > >> an org.apache.flink.api.java.ExecutionEnvironment as parameter.
> >> > > >>
> >> > > >> One way is to provide public static fields (like in the
> >> WordCountData
> >> > > >> class), but this introduces a conversion
> >> > > >> from org.apache.flink.api.java.tuple.Tuple2 to Scala tuple and
> from
> >> > > >> java.lang.Long to scala.Long and I guess this is an unnecessary
> >> > > complexity
> >> > > >> for an example (?).
> >> > > >> Another way is, of course, to copy the example data in the Scala
> >> > > example.
> >> > > >>
> >> > > >> Am I missing something here?
> >> > > >>
> >> > > >> Thanks!
> >> > > >>
> >> > > >> Cheers,
> >> > > >> V.
> >> > > >>
> >> > > >>
> >> > > >> On 5 September 2014 15:52, Aljoscha Krettek <[email protected]
> >
> >> > > wrote:
> >> > > >>
> >> > > >>> Alright, I updated my repo:
> >> > > >>>
> >> > > >>> https://github.com/aljoscha/incubator-flink/commits/scala-rework
> >> > > >>>
> >> > > >>> This now has a working WordCount example. It's pretty much a
> copy
> >> of
> >> > > >>> the Java example with some fixups for the syntax and lambda
> >> > functions.
> >> > > >>> You'll also notice that I added the java-examples as a
> dependency
> >> for
> >> > > >>> the scala-examples. I did this to reuse the example input data.
> >> > > >>>
> >> > > >>> Once you have ported a program, you can open a pull request against
> >> > > >>> my repo and I will collect the examples.
> >> > > >>>
> >> > > >>> Happy coding. :D
> >> > > >>>
> >> > > >>> On Fri, Sep 5, 2014 at 12:19 PM, Hermann Gábor <
> >> [email protected]
> >> > >
> >> > > >>> wrote:
> >> > > >>> > +1
> >> > > >>> >
> >> > > >>> > ComputeEdgeDegrees for me!
> >> > > >>> >
> >> > > >>> >
> >> > > >>> > On Fri, Sep 5, 2014 at 11:44 AM, Márton Balassi <
> >> > > >>> [email protected]>
> >> > > >>> > wrote:
> >> > > >>> >
> >> > > >>> >> +1
> >> > > >>> >>
> >> > > >>> >> BatchGradientDescent for me :)
> >> > > >>> >>
> >> > > >>> >>
> >> > > >>> >> On Fri, Sep 5, 2014 at 11:15 AM, Kostas Tzoumas <
> >> > > [email protected]>
> >> > > >>> >> wrote:
> >> > > >>> >>
> >> > > >>> >> > +1
> >> > > >>> >> >
> >> > > >>> >> > I go for WebLogAnalysis.
> >> > > >>> >> >
> >> > > >>> >> > My experience with Scala consists of going through a
> tutorial
> >> so
> >> > > this
> >> > > >>> >> will
> >> > > >>> >> > be a good stress test both for me and the new API :-)
> >> > > >>> >> >
> >> > > >>> >> >
> >> > > >>> >> > On Thu, Sep 4, 2014 at 9:09 PM, Vasiliki Kalavri <
> >> > > >>> >> > [email protected]>
> >> > > >>> >> > wrote:
> >> > > >>> >> >
> >> > > >>> >> > > +1 for having other people implement the examples!
> >> > > >>> >> > > Connected Components and Kmeans for me :)
> >> > > >>> >> > >
> >> > > >>> >> > > -V.
> >> > > >>> >> > >
> >> > > >>> >> > >
> >> > > >>> >> > > On 4 September 2014 21:03, Fabian Hueske <
> >> [email protected]>
> >> > > >>> wrote:
> >> > > >>> >> > >
> >> > > >>> >> > > > I go for TriangleEnumeration and PageRank.
> >> > > >>> >> > > >
> >> > > >>> >> > > > Let's also do the examples similar to the Java
> examples:
> >> > > >>> >> > > > - running out-of-the-box without parameters
> >> > > >>> >> > > > - parameters for external data
> >> > > >>> >> > > > - follow a similar code structure
> >> > > >>> >> > > >
> >> > > >>> >> > > >
> >> > > >>> >> > > >
> >> > > >>> >> > > > 2014-09-04 20:56 GMT+02:00 Aljoscha Krettek <
> >> > > [email protected]
> >> > > >>> >:
> >> > > >>> >> > > >
> >> > > >>> >> > > > > Will do, then people can reserve their favourite
> >> examples
> >> > > here.
> >> > > >>> >> > > > >
> >> > > >>> >> > > > > On Thu, Sep 4, 2014 at 8:55 PM, Fabian Hueske <
> >> > > >>> [email protected]>
> >> > > >>> >> > > > wrote:
> >> > > >>> >> > > > > > Hi,
> >> > > >>> >> > > > > >
> >> > > >>> >> > > > > > I think having examples implemented by different
> >> people
> >> > > >>> proved to
> >> > > >>> >> > be
> >> > > >>> >> > > > > > valuable in the past.
> >> > > >>> >> > > > > > I'd help with two or three examples.
> >> > > >>> >> > > > > >
> >> > > >>> >> > > > > > It might be helpful if you'd port a simple first
> one
> >> > such
> >> > > as
> >> > > >>> >> > > WordCount.
> >> > > >>> >> > > > > >
> >> > > >>> >> > > > > > Fabian
> >> > > >>> >> > > > > >
> >> > > >>> >> > > > > >
> >> > > >>> >> > > > > > 2014-09-04 18:47 GMT+02:00 Aljoscha Krettek <
> >> > > >>> [email protected]
> >> > > >>> >> >:
> >> > > >>> >> > > > > >
> >> > > >>> >> > > > > >> Hi,
> >> > > >>> >> > > > > >> I have a working rewrite of the Scala API here:
> >> > > >>> >> > > > > >>
> >> > > >>> >>
> >> https://github.com/aljoscha/incubator-flink/commits/scala-rework
> >> > > >>> >> > > > > >>
> >> > > >>> >> > > > > >> I'm hoping that I'll only have to write the tests
> and
> >> > > port
> >> > > >>> the
> >> > > >>> >> > > > > >> examples. Do you think it makes sense to let other
> >> > people
> >> > > >>> port
> >> > > >>> >> the
> >> > > >>> >> > > > > >> examples, so that someone else uses it and maybe
> >> > notices
> >> > > some
> >> > > >>> >> > quirks
> >> > > >>> >> > > > > >> in the API?
> >> > > >>> >> > > > > >>
> >> > > >>> >> > > > > >> Cheers,
> >> > > >>> >> > > > > >> Aljoscha
> >> > > >>> >> > > > > >>
> >> > > >>> >> > > > >
> >> > > >>> >> > > >
> >> > > >>> >> > >
> >> > > >>> >> >
> >> > > >>> >>
> >> > > >>>
> >> > >
> >> >
> >>
>