Hey Wilson,
the MapFunction should act as a wrapper for the join function. Create a
class extending RichMapFunction, and pass the join function via the
constructor. Then you delegate open/close calls to it, with the map
function looking something like this:
map(Tuple2<...> tuple) {
return joinFunc
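
A minimal sketch of the wrapper idea described above, written against small stand-in types so it compiles without Flink on the classpath. `LifecycleJoin`, `Pair`, and `JoinAsMap` are hypothetical names, not Flink classes; in a real job the wrapper would extend RichMapFunction as the message says. That the map call unpacks the two tuple fields and hands them to the join function is an assumption, since the original snippet is cut off.

```java
// Stand-in types for illustration only; they are NOT Flink's classes.

/** Hypothetical stand-in for a join function with open/close lifecycle hooks. */
interface LifecycleJoin<A, B, R> {
    void open();
    void close();
    R join(A first, B second);
}

/** Hypothetical stand-in for a two-field tuple. */
final class Pair<A, B> {
    final A first;
    final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }
}

/** Map-style wrapper that forwards every call to the wrapped join function. */
final class JoinAsMap<A, B, R> {
    private final LifecycleJoin<A, B, R> joinFunc;

    // The join function is passed in via the constructor.
    JoinAsMap(LifecycleJoin<A, B, R> joinFunc) {
        this.joinFunc = joinFunc;
    }

    // Lifecycle calls are delegated unchanged.
    void open() {
        joinFunc.open();
    }

    void close() {
        joinFunc.close();
    }

    // Assumption: the two tuple fields are the two join inputs.
    R map(Pair<A, B> tuple) {
        return joinFunc.join(tuple.first, tuple.second);
    }
}
```

The wrapper holds no state of its own, so any setup the join function does in open() still happens exactly once per wrapper instance.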
Chesnay Schepler created FLINK-1270:
---
Summary: FileSystem.get() doesn't support relative paths
Key: FLINK-1270
URL: https://issues.apache.org/jira/browse/FLINK-1270
Project: Flink
Chesnay Schepler created FLINK-1248:
---
Summary: Manually built docu doesn't apply CSS/images
Key: FLINK-1248
URL: https://issues.apache.org/jira/browse/FLINK-1248
Project: Flink
Issue
Chesnay Schepler created FLINK-1244:
---
Summary: setCombinable() clunky to use
Key: FLINK-1244
URL: https://issues.apache.org/jira/browse/FLINK-1244
Project: Flink
Issue Type: Wish
Chesnay Schepler created FLINK-1227:
---
Summary: KeySelector can't implement ResultTypeQueryable
Key: FLINK-1227
URL: https://issues.apache.org/jira/browse/FLINK-1227
Project: Flink
I agree with Kostas, and believe that postponing will imo straight up
not work since people tend to be *very* busy close to a release, even
without having to port features to several APIs.
I furthermore don't think we will get anywhere by creating one policy to
rule them all (especially a rigi
?
On Wed, Sep 10, 2014 at 2:30 PM, Chesnay Schepler <
chesnay.schep...@fu-berlin.de> wrote:
only the coordination is done via UDP.
I agree with what you say about the loops; I'm currently looking into using
FileLocks.
On 9.9.2014 11:33, Stephan Ewen wrote:
Hey!
The UDP version is 25x
ephan
On Mon, Sep 8, 2014 at 4:15 PM, Chesnay Schepler <
chesnay.schep...@fu-berlin.de> wrote:
Sorry for the late answer.
Today I did a quick hack to replace the synchronization completely with
UDP. It's still synchronous and record-based, but 25x slower.
Regarding busy-loops I would
Hello,
Tonight I was running a WordCount job with the Python API, and halfway
through I got the exception below.
The issue did not occur again after resubmitting the job.
DOP=160
taskslots=8
filesize=100GB
org.apache.flink.client.program.ProgramInvocationException: The
program execution
the busy loop?
Ufuk
On Thu, Aug 28, 2014 at 1:06 AM, Chesnay Schepler <
chesnay.schep...@fu-berlin.de> wrote:
the performance differences occur on the same system (16GB, 4 cores +
HyperThreading) with a DOP of 1 for a plan consisting of a single operator.
plenty of resources :/
On 28.8.
esentation, right?
On Wed, Aug 27, 2014 at 10:19 PM, Chesnay Schepler
wrote:
I can't recall definitely what the numbers were so I'll just quote myself
from the PR:
measurements were done using System.nanoTime()
time necessary for the comparison
Strings consisted of 90 characters
difference in t
ghput, because the
big buffers were not copied (unlike in streams), and the UDP notifications
were very fast (fire and forget datagrams).
Stephan
On Wed, Aug 27, 2014 at 10:48 PM, Chesnay Schepler <
chesnay.schep...@fu-berlin.de> wrote:
Hey Stephan,
I'd like to point out right aw
, Aug 27, 2014 at 8:34 PM, Chesnay Schepler <
chesnay.schep...@fu-berlin.de> wrote:
Hello everyone,
This will be some kind of brainstorming question.
As some of you may know I am currently working on the Python API. The most
crucial part here is how the data is exchanged between Java and Python.
I can't recall definitely what the numbers were so I'll just quote myself
from the PR:
measurements were done using System.nanoTime()
time necessary for the comparison
Strings consisted of 90 characters
difference in the beginning of the string
New 4259
Old 23431
difference in the middle of the
Hello everyone,
This will be some kind of brainstorming question.
As some of you may know I am currently working on the Python API. The
most crucial part here is how the data is exchanged between Java and Python.
Up to this point we used pipes for this, but switched recently to memory
mapped f
I think this is what Martin is currently doing:
StringIDs --map-> (StringIDs,LongIDs) --map-> LongIDs
and he wants to use both the second and third set. He asks for a way to
replace the second map operation (since it seems unnecessary to create
an extra map for that).
i believe the appropria
[
https://issues.apache.org/jira/browse/FLINK-518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037150#comment-14037150
]
Chesnay Schepler edited comment on FLINK-518 at 6/19/14 9:1
[
https://issues.apache.org/jira/browse/FLINK-518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037150#comment-14037150
]
Chesnay Schepler commented on FLINK-518:
the file scheme is added if you