I think Aljoscha's suspicion is correct. Eclipse also lets you reference
test code from main code, so it does not seem to separate the two. An extra
scala-tests project is a good idea.
A workaround: run maven install in the shell and close the Scala projects
in Eclipse. The flink-tests project will the
Hi Robert,
this might be a problem with Eclipse not having a strict separation
between compiling the src/main and src/test code. The code that
generates the TypeInformation is a macro. Macros are only usable if
the code that uses them is compiled in a separate compilation step
from the compilation
Hi Robert, I didn't have any problems with IntelliJ IDEA last week; I will try it
again with the latest master.
Users can deactivate unwanted Maven profiles via IDEA's Maven project options
(I don't remember the exact name of that feature).
On Sunday, September 28, 2014, Robert Metzger wrote:
> I've worked on a quit
I've worked on a quite outdated version of Flink for a while now and
rebased my code to the latest master on Friday.
Back at home, I wanted to continue my work and found that it is very
difficult to properly set up the latest eclipse for Flink.
What I've done so far:
- Downloaded Eclipse Luna SR1
Answer posted to "Example packages naming convention" thread as the issue
diverged from this topic.
On Sun, Sep 14, 2014 at 11:14 AM, Kostas Tzoumas
wrote:
> Good catch, I suggest using examples
>
> On Sat, Sep 13, 2014 at 3:27 PM, Márton Balassi
> wrote:
>
> > Pull request issued. One minor n
Good catch, I suggest using examples
On Sat, Sep 13, 2014 at 3:27 PM, Márton Balassi
wrote:
> Pull request issued. One minor naming concern:
>
> As of today the Scala examples are located in
> the org.apache.flink.examples.scala package, while the Java ones are in
> the org.apache.flink.example.jav
Pull request issued. One minor naming concern:
As of today the Scala examples are located in
the org.apache.flink.examples.scala package, while the Java ones are in
org.apache.flink.example.java. I suggest using only one convention for
this: either example or examples.
Cheers,
Marton
On Fri, Sep
Sorry for being a bit silent after already bidding on LR. The pull request
is coming soon.
On Fri, Sep 12, 2014 at 6:25 PM, Stephan Ewen wrote:
> I suppose that having the choice between a simple return type and a
> collector is the easiest to understand.
> Am 12.09.2014 16:50 schrieb "Aljoscha
I suppose that having the choice between a simple return type and a
collector is the easiest to understand.
Am 12.09.2014 16:50 schrieb "Aljoscha Krettek" :
> So, should I change join and coGroup to have a simple return value, no
> Option or Collection? Also what's happening with the relational
>
So, should I change join and coGroup to have a simple return value, no
Option or Collection? Also what's happening with the relational
examples and the LinearRegression examples? I'd like to make a pull
request before this weekend.
I also added a test that checks whether the Scala API has the same
Yes, there is already a Collector version, you can do:
left.join(right).where("foo").equalTo("bar") {
  (left, right, out: Collector[Page]) =>
    if (...) out.collect(...)
}
I wasn't sure on what our Function2 variant should be. That's why I
asked. There are some cases where you want to have the
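
To make the Collector variant discussed above concrete, here is a minimal, self-contained sketch. It does not use Flink: SimpleCollector and joinWithCollector are hypothetical stand-ins, and the pairs are assumed to be already joined. It only shows why a three-argument function can filter or emit several records without going through Option.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical minimal collector; a stand-in, not Flink's Collector interface.
trait SimpleCollector[R] {
  def collect(record: R): Unit
}

object CollectorJoinSketch {
  // Calls the three-argument function once per joined pair. The function
  // decides whether to emit zero, one, or many records through the collector.
  def joinWithCollector[T, O, R](pairs: Seq[(T, O)])(
      fun: (T, O, SimpleCollector[R]) => Unit): Seq[R] = {
    val buffer = ArrayBuffer.empty[R]
    val out = new SimpleCollector[R] {
      def collect(record: R): Unit = buffer += record
    }
    pairs.foreach { case (l, r) => fun(l, r, out) }
    buffer.toList
  }

  def main(args: Array[String]): Unit = {
    val pairs = Seq((1, 1), (1, 2), (3, 3))
    // Emit only pairs whose sides are equal: a filter inside the join function.
    val result = joinWithCollector[Int, Int, Int](pairs) { (l, r, out) =>
      if (l == r) out.collect(l)
    }
    println(result)
  }
}
```

The type arguments are given explicitly in the call because the result type cannot be inferred from an unannotated collector parameter.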
I think it seems weird that normal joins need to go through Option.
The Option variant is there to allow filters in the join function. Wouldn't a
Collector variant allow you to do the same, and be a Function3? I know
that Option reads more functionally...
Am 12.09.2014 14:24 schrieb "Aljoscha Kr
As already mentioned this is not possible because of type erasure. We
can only have one join variant that takes a Function2.
On Fri, Sep 12, 2014 at 12:34 PM, Stephan Ewen wrote:
> It would be nice to have a join variant that directly returns the value
> rather than an option. Why not have both
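
The erasure problem mentioned above can be sketched without Flink. JoinedPairs and applyOption are hypothetical names used only for illustration. On the JVM, both (T, O) => R and (T, O) => Option[R] erase to Function2, so two overloads of apply taking them would be rejected by the compiler as a double definition. A differently named method is one possible way around the clash.

```scala
// Hypothetical stand-in for the result of a join; not Flink's actual class.
class JoinedPairs[T, O](pairs: Seq[(T, O)]) {

  // Variant with a simple return value.
  def apply[R](fun: (T, O) => R): Seq[R] =
    pairs.map { case (l, r) => fun(l, r) }

  // An overload `def apply[R](fun: (T, O) => Option[R]): Seq[R]` would not
  // compile next to the method above: both parameters erase to Function2.
  // A distinct (hypothetical) name sidesteps the clash.
  def applyOption[R](fun: (T, O) => Option[R]): Seq[R] =
    pairs.flatMap { case (l, r) => fun(l, r).toList }
}

object ErasureSketch {
  def main(args: Array[String]): Unit = {
    val joined = new JoinedPairs(Seq((1, "a"), (2, "b")))
    println(joined((n, s) => s * n))
    println(joined.applyOption((n, s) => if (n > 1) Some(s) else None))
  }
}
```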
It would be nice to have a join variant that directly returns the value
rather than an option. Why not have both (they are wrapped as flatJoins
anyway below, right?)
On Fri, Sep 12, 2014 at 11:50 AM, Fabian Hueske wrote:
> Sweet! I'm lovin' this :-)
>
> 2014-09-12 11:46 GMT+02:00 Aljoscha Krett
Sweet! I'm lovin' this :-)
2014-09-12 11:46 GMT+02:00 Aljoscha Krettek :
> Also, you can use CaseClasses directly as the type for CSV input. So
> instead of reading it as tuples and then having a mapper that maps to
> your case classes you can use:
>
> env.readCsv[Edge](...)
>
> On Fri, Sep 12, 2
Also, you can use CaseClasses directly as the type for CSV input. So
instead of reading it as tuples and then having a mapper that maps to
your case classes you can use:
env.readCsv[Edge](...)
On Fri, Sep 12, 2014 at 11:43 AM, Aljoscha Krettek wrote:
> I added support for specifying keys by name
I added support for specifying keys by name for CaseClasses. Check out
the PageRank and TriangleEnumeration examples to see it in action.
@Kostas: I think you could use them for the TPC-H examples.
On Fri, Sep 12, 2014 at 7:23 AM, Aljoscha Krettek wrote:
> Yes, that would allow list comprehensio
Yes, that would allow list comprehensions. It would be possible to
have the Collection signature for join (and coGroup), i.e.:
apply[R]((T, O) => TraversableOnce[R]): DataSet[R]
(T and O are the left and right input type, R is result type)
Then you can return collections and still return an opti
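
A rough sketch of what a collection-returning join function would allow, independent of Flink. flatJoin is a hypothetical helper, and Iterable is used here as a stand-in for the TraversableOnce in the signature above so that the sketch compiles on current Scala versions as well.

```scala
object CollectionJoinSketch {
  // Hypothetical helper: applies a collection-returning function to every
  // joined pair and flattens the results into one sequence.
  def flatJoin[T, O, R](pairs: Seq[(T, O)])(fun: (T, O) => Iterable[R]): Seq[R] =
    pairs.flatMap { case (l, r) => fun(l, r) }

  def main(args: Array[String]): Unit = {
    val pairs = Seq((1, 10), (2, 20))

    // A for comprehension can emit several results per pair.
    val expanded = flatJoin(pairs)((l, r) => for (i <- 1 to l) yield r + i)

    // Returning an empty or one-element collection gives Option-like filtering.
    val filtered = flatJoin(pairs)((l, r) => if (l > 1) List(r) else Nil)

    println(expanded)
    println(filtered)
  }
}
```

An existing Option value can be passed through the same function by calling toList on it.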
Hmmm, tricky question...
How about the Option for Join as this is a tuple-wise operation and the
Collection for Cogroup which is group-wise?
Could we in that case use list comprehensions in Cogroup functions?
Or is that too much mixing?
2014-09-11 23:00 GMT+02:00 Aljoscha Krettek :
> I didn't lo
I didn't look at the example either.
Adding collections is easy, it's just that we can either have
Collections or the Option, not both.
For the coding style I followed this:
https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide,
which itself is based on this: http://docs.scala
I haven't looked at the LineRank example in detail, but if you think that
it adds something new to the examples collection, we can certainly port it
to Java as well.
I think the Option and Collector return types are sufficient right now but
if Collections are easy to add, go for it. ;-)
Great that th
What about the LineRank example? We had that in Scala but never had a
Java Example.
On Thu, Sep 11, 2014 at 5:51 PM, Aljoscha Krettek wrote:
> Yes, I like that. For the ITCases I always just copied the Java ITCase.
>
> The only examples that are missing now are LinearRegression and the
> relation
Yes, I like that. For the ITCases I always just copied the Java ITCase.
The only examples that are missing now are LinearRegression and the
relational stuff.
On Thu, Sep 11, 2014 at 5:48 PM, Fabian Hueske wrote:
> I just removed the old CountEdgeDegrees example.
> That was a preprocessing step f
I will port PiEstimation now that generateSequence is in, as well as TPC-H
Q3
Kostas
On Thu, Sep 11, 2014 at 5:40 PM, Aljoscha Krettek
wrote:
> I added the PageRank example, thanks again Fabian. :D
>
> Regarding the other stuff:
> - There is a comment in DataSet.scala about including
> org.apa
I just removed the old CountEdgeDegrees example.
That was a preprocessing step for the TriangleEnumeration, and is now part
of the new TriangleEnumerationOpt example.
So I guess, we don't need to port that one. As I said before, I'd prefer to
keep Java and Scala examples in sync.
Cheers, Fabian
2
I added the PageRank example, thanks again Fabian. :D
Regarding the other stuff:
- There is a comment in DataSet.scala about including
org.apache.flink.api.scala._ because of the TypeInformation.
- I added generateSequence to ExecutionEnvironment.
- It is possible to use Scala Primitives in Arr
+1 for removing RelationalQuery
On Thu, Sep 11, 2014 at 3:04 PM, Aljoscha Krettek
wrote:
> By the way, what was called BatchGradientDescent in the Scala examples
> should be replaced by a port of the LinearRegression Example from
> Java. I had them as two separate examples earlier.
>
> What about
+1 for removing RelationalQuery
IMO, the Scala examples should mirror the Java examples. So, we should
rather port Java examples to Scala instead of updating existing Scala
examples.
I am also done with the PageRank implementation. Final tests are currently
running and I'll open a PR soon.
I foun
By the way, what was called BatchGradientDescent in the Scala examples
should be replaced by a port of the LinearRegression Example from
Java. I had them as two separate examples earlier.
What about RelationalQuery and TPC-H-Q3? Any thoughts about removing
RelationalQuery?
On Thu, Sep 11, 2014 at
I added the Triangle Enumeration Examples, thanks Fabian.
So far we have ported: WordCount, KMeans, ConnectedComponents,
WebLogAnalysis, TransitiveClosureNaive, TriangleEnumerationNaive/Opt
These are the examples people called dibs on:
- PageRank (Fabian)
- BatchGradientDescent (Márton)
- Comp
Thanks, I added it. I'll keep a running list of ported/unported
examples in my mails. I'll rename the java example package to examples
once the Scala API merge is done.
I think the termination criterion is fine as it is. Just because Scala
enables functional programming doesn't mean it's always th
I'll take TransitiveClosure and PiEstimation (was not on your list).
If nobody volunteers for the relational stuff I can take those as well.
How about removing the "RelationalQuery" from both Scala and Java? It seems
to be a proper subset of TPC-H Q3. Does it add some teaching value on top
of TPC
Thanks, I added it, along with an ITCase.
So far we have ported: WordCount, KMeans, ConnectedComponents, WebLogAnalysis
These are the examples people called dibs on:
- TriangleEnumeration and PageRank (Fabian)
- BatchGradientDescent (Márton)
- ComputeEdgeDegrees (Hermann)
Those are unclaimed (
WebLog here:
https://github.com/ktzoumas/incubator-flink/tree/webloganalysis-example-scala
Do you need any more done?
On Tue, Sep 9, 2014 at 3:08 PM, Aljoscha Krettek
wrote:
> I added the ConnectedComponents Example from Vasia.
>
> Keep 'em coming, people. :D
>
> On Mon, Sep 8, 2014 at 6:07 PM,
I added the ConnectedComponents Example from Vasia.
Keep 'em coming, people. :D
On Mon, Sep 8, 2014 at 6:07 PM, Fabian Hueske wrote:
> Alright, will do.
> Thanks!
>
> 2014-09-08 17:48 GMT+02:00 Aljoscha Krettek :
>
>> Ok people, executive decision. :D
>>
>> Please look at KMeansData.java and KMe
Alright, will do.
Thanks!
2014-09-08 17:48 GMT+02:00 Aljoscha Krettek :
> Ok people, executive decision. :D
>
> Please look at KMeansData.java and KMeans.scala. I'm storing the data
> in multi-dimensional object arrays and then converting it to the
> required Java or Scala objects.
>
> Also, I ch
Ok people, executive decision. :D
Please look at KMeansData.java and KMeans.scala. I'm storing the data
in multi-dimensional object arrays and then converting it to the
required Java or Scala objects.
Also, I changed isEqualTo to equalTo to make it consistent with the Java API.
Regarding Join (a
Aside from the DataSet issue, I also found an inconsistency with the Java
API. In Java join is done as:
ds1.join(ds2).where(...).equalTo(...)
whereas in the current Scala API this is:
ds1.join(ds2).where(...).isEqualTo(...)
isEqualTo() should be renamed to equalTo(), IMO.
Also, join (+cross and coGrou
Instead of Strings, Object[][] would work as well. That is a generic
representation of a Tuple.
Alternatively, they could be stored as Java or Scala Tuples, with a generic
utility method to convert between the two.
On Mon, Sep 8, 2014 at 10:55 AM, Fabian Hueske wrote:
> Yeah, I ran into the sam
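
The Object[][] idea above could look roughly like this on the Scala side. This is only a sketch: ArrayDataSketch, Points, and toTuple are hypothetical names, the values are made up, and Array[Array[Any]] is used as the Scala-side analogue of a Java Object[][].

```scala
object ArrayDataSketch {
  // Rows stored generically, the way an Object[][] would hold them.
  // The values are invented for illustration only.
  val Points: Array[Array[Any]] = Array(
    Array[Any](1, 2.0, 3.0),
    Array[Any](2, 4.5, 1.5)
  )

  // Converts one generic row into a typed Scala tuple.
  // A row with the wrong arity or element types raises a MatchError.
  def toTuple(row: Array[Any]): (Int, Double, Double) = row match {
    case Array(id: Int, x: Double, y: Double) => (id, x, y)
  }

  def main(args: Array[String]): Unit =
    Points.map(toTuple).foreach(println)
}
```

The same generic rows could be turned into Java tuples by a second conversion method, so the data itself would exist only once.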
Yeah, I ran into the same problem...
+1 for using Strings and parsing them, but using the CSVFormat won't work
because this is based on a FileInputFormat.
So we would need to parse the Strings manually...
2014-09-08 10:35 GMT+02:00 Aljoscha Krettek :
> Hi,
> on second thought. Maybe we should j
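
Parsing the strings manually, as suggested above, might be sketched like this. Nothing here is taken from the actual examples: the Edge fields, the sample lines, and parseEdge are assumptions made for illustration.

```scala
// Hypothetical case class; the real field names and types are not given in the thread.
case class Edge(source: Long, target: Long)

object StringDataSketch {
  // Example data kept as plain strings so that Java and Scala examples
  // could share one copy. The values are invented.
  val EdgeLines: Seq[String] = Seq("1,2", "2,3", "3,1")

  // Splits one comma-separated line and converts the fields by hand,
  // since no file-based CSV input format is involved.
  def parseEdge(line: String): Edge = {
    val fields = line.split(",").map(_.trim)
    require(fields.length == 2, s"expected 2 fields but got ${fields.length}: $line")
    Edge(fields(0).toLong, fields(1).toLong)
  }

  def main(args: Array[String]): Unit =
    EdgeLines.map(parseEdge).foreach(println)
}
```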
+1: If we opted for that we could easily use the same input for streaming
as well - we've been facing the same issue recently.
On Mon, Sep 8, 2014 at 10:35 AM, Aljoscha Krettek
wrote:
> Hi,
> on second thought, maybe we should just change all the example input
> data to strings and use CSV input
Hi,
on second thought, maybe we should just change all the example input
data to strings and use CSV input formats in all the examples. What do
you think?
Cheers,
Aljoscha
On Mon, Sep 8, 2014 at 7:46 AM, Aljoscha Krettek wrote:
> Hi,
> yes it's unfortunate that the data types are incompatible. I
Hi,
yes it's unfortunate that the data types are incompatible. I'm afraid
you have to do what you proposed: move the data to a static field and
convert it in the getDefaultEdgeDataSet() method in Scala. It's not
nice, but copying would duplicate the data and make it easier for it
to go out of sync
Hey,
I have ported the Connected Components example, but I am not sure how to
reuse the example input data from java-examples.
In the ConnectedComponentsData class, the vertices and edges data are
produced by the methods getDefaultVertexDataSet()
and getDefaultEdgeDataSet(), which take
an org.apac
Alright, I updated my repo:
https://github.com/aljoscha/incubator-flink/commits/scala-rework
This now has a working WordCount example. It's pretty much a copy of
the Java example with some fixups for the syntax and lambda functions.
You'll also notice that I added the java-examples as a dependency
+1
ComputeEdgeDegrees for me!
On Fri, Sep 5, 2014 at 11:44 AM, Márton Balassi
wrote:
> +1
>
> BatchGradientDescent for me :)
>
>
> On Fri, Sep 5, 2014 at 11:15 AM, Kostas Tzoumas
> wrote:
>
> > +1
> >
> > I go for WebLogAnalysis.
> >
> > My experience with Scala consists of going through a tu
+1
BatchGradientDescent for me :)
On Fri, Sep 5, 2014 at 11:15 AM, Kostas Tzoumas wrote:
> +1
>
> I go for WebLogAnalysis.
>
> My experience with Scala consists of going through a tutorial so this will
> be a good stress test both for me and the new API :-)
>
>
> On Thu, Sep 4, 2014 at 9:09 PM
+1
I go for WebLogAnalysis.
My experience with Scala consists of going through a tutorial so this will
be a good stress test both for me and the new API :-)
On Thu, Sep 4, 2014 at 9:09 PM, Vasiliki Kalavri
wrote:
> +1 for having other people implement the examples!
> Connected Components and
+1 for having other people implement the examples!
Connected Components and Kmeans for me :)
-V.
On 4 September 2014 21:03, Fabian Hueske wrote:
> I go for TriangleEnumeration and PageRank.
>
> Let's also do the examples similar to the Java examples:
> - running out-of-the-box without paramete
I go for TriangleEnumeration and PageRank.
Let's also do the examples similar to the Java examples:
- running out-of-the-box without parameters
- parameters for external data
- follow a similar code structure
2014-09-04 20:56 GMT+02:00 Aljoscha Krettek :
> Will do, then people can reserve thei
Will do, then people can reserve their favourite examples here.
On Thu, Sep 4, 2014 at 8:55 PM, Fabian Hueske wrote:
> Hi,
>
> I think having examples implemented by different people proved to be
> valuable in the past.
> I'd help with two or three examples.
>
> It might be helpful if you'd port
Hi,
I think having examples implemented by different people proved to be
valuable in the past.
I'd help with two or three examples.
It might be helpful if you'd port a simple first one such as WordCount.
Fabian
2014-09-04 18:47 GMT+02:00 Aljoscha Krettek :
> Hi,
> I have a working rewrite of
Hi,
I have a working rewrite of the Scala API here:
https://github.com/aljoscha/incubator-flink/commits/scala-rework
I'm hoping that I'll only have to write the tests and port the
examples. Do you think it makes sense to let other people port the
examples, so that someone else uses it and maybe no