[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-12-01 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r46264229 --- Diff: flink-contrib/flink-streaming-contrib/src/main/java/org/apache/flink/contrib/streaming/state/DbStateBackend.java --- @@ -0,0 +1,248 @@ +/*

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-12-01 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r46264128 --- Diff: flink-contrib/flink-streaming-contrib/src/main/java/org/apache/flink/contrib/streaming/state/DbStateBackend.java --- @@ -0,0 +1,248 @@ +/*

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-12-01 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r46264109 --- Diff: flink-contrib/flink-streaming-contrib/src/main/java/org/apache/flink/contrib/streaming/state/DbStateBackend.java --- @@ -0,0 +1,248 @@ +/*

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1305 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enab

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-24 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-159205217 I think you can go ahead. It's in contrib and you guys are battle-testing it anyways... :wink: --- If your project is set up for it, you can reply to this email and h

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-23 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-159048039 If no objections I would like to merge this :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your pro

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-22 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-158799081 @StephanEwen, @rmetzger: I addressed the comments regarding the logs and the state id. I also added a final improvement: -Now compaction is executed i

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-21 Thread rmetzger
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45544375 --- Diff: flink-contrib/flink-streaming-contrib/src/main/java/org/apache/flink/contrib/streaming/state/DbStateHandle.java --- @@ -0,0 +1,88 @@ +/* +

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-20 Thread gyfora
Github user gyfora commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45479951 --- Diff: flink-contrib/flink-streaming-contrib/src/main/java/org/apache/flink/contrib/streaming/state/DbStateHandle.java --- @@ -0,0 +1,88 @@ +/* + *

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-20 Thread rmetzger
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45479621 --- Diff: flink-contrib/flink-streaming-contrib/src/main/java/org/apache/flink/contrib/streaming/state/DbStateHandle.java --- @@ -0,0 +1,88 @@ +/* +

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-20 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-158423218 Had an offline chat with @gyfora with the following outcome: - A deterministic state identifier is needed - Small change to pass that identifier as a singl

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-20 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45474858 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackend.java --- @@ -64,33 +64,35 @@ /** * Closes the state bac

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-20 Thread gyfora
Github user gyfora commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45473713 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackend.java --- @@ -64,33 +64,35 @@ /** * Closes the state backend,

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread gyfora
Github user gyfora commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45377370 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackend.java --- @@ -64,33 +64,35 @@ /** * Closes the state backend,

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45373479 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackend.java --- @@ -64,33 +64,35 @@ /** * Closes the state bac

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread gyfora
Github user gyfora commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45372569 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackend.java --- @@ -64,33 +64,35 @@ /** * Closes the state backend,

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45372620 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackend.java --- @@ -64,33 +64,35 @@ /** * Closes the state bac

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread gyfora
Github user gyfora commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45372303 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackend.java --- @@ -64,33 +64,35 @@ /** * Closes the state backend,

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45371919 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackend.java --- @@ -64,33 +64,35 @@ /** * Closes the state bac

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread gyfora
Github user gyfora commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45371097 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackend.java --- @@ -64,33 +64,35 @@ /** * Closes the state backend,

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-158107515 I have a final comment inline. Otherwise, I think this is good to merge. --- If your project is set up for it, you can reply to this email and have your reply appear

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45362408 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackend.java --- @@ -64,33 +64,35 @@ /** * Closes the state bac

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread gyfora
Github user gyfora commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45351896 --- Diff: flink-contrib/flink-streaming-contrib/src/main/java/org/apache/flink/contrib/streaming/state/DbBackendConfig.java --- @@ -0,0 +1,406 @@ +/*

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/1305#discussion_r45351298 --- Diff: flink-contrib/flink-streaming-contrib/src/main/java/org/apache/flink/contrib/streaming/state/DbBackendConfig.java --- @@ -0,0 +1,406 @@ +/*

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-158074782 Looking though this again... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-16 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-156964684 I'm looking at it again. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-15 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-156809400 I have updated the description, and ran some more cluster tests without any issues. It would be good if you all could do a second round of reviews please. --- I

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-12 Thread uce
Github user uce commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-156064645 I totally understand your point, but I think it's OK that changes of this scope take longer to review and get in (my HA PR took over a month or so to get in). At the end of

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-12 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-156039726 Well, I don't know what they are working on... It would be easier not having to rebase state backend api changes --- If your project is set up for it, you can reply to t

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-12 Thread uce
Github user uce commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-156038439 I agree that this will be an important backend and good to have in. :) But do we need to push this right now? I think we should wait a little and make sure that it fits well

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-12 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-156034877 I would like to push this soon if no objections --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your pr

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-11 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-155765000 Should we do a final iteration over this and merge this to contrib? The description got slightly out of date when I changed this back so that it stores the state

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-04 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-153872782 I updated the sharding logic to do mod hashing by default on the keys for the number of shards, and the user can also add a custom Partitioner to implement custom shardin

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-04 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-153696208 I also removed the sharding logic now as I think it was pretty weak and not very useful (it maintained 1 connection per subtask which would break if we change parallelism

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-04 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-153695943 I updated this PR with the reworked logic, it has several advantages over the previous timestamp based solution (including the elimination of transactions from the logic)

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-03 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-153500173 I am now writing a prototype for this version, the batch insert got pretty ugly... I will probably finish it tomorrow. --- If your project is set up for it, you can

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-03 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-153470424 After some initial discussion we @StephanEwen we came to the following conclusions: The current timestamp based approach has some limitations in terms of the ass

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-03 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-153376152 My last commit introduces automatic compaction with user specified frequency. It also allows the KvStates to implement the CheckpointNotifier interface in which case they

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-03 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-153351332 @gyfora That's what I meant, basically the timestamps could subsume the role of the checkpointIds. I.e. The checkpointIds have the semantics of the timestamps and the t

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-03 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-153339094 @StephanEwen Thanks for the comments. You are right the main idea is exactly as you described. The reason why exactly-once is violated in some corner cases becau

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-03 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-153337412 You are right about the checkpointIDs being ignored. I don't see why we need to bother with changing the id semantics, the timestamps also serve as checkpoint Ids perfect

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-03 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-15538 2. Ah, I meant the lookupTimestamp. In an earlier version you used both the checkpointId and lookupTimestamp to perform key lookups. 3. I see, in this implementation

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-03 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-153307495 @aljoscha 1. I was initially using the MockEnvironments but I added the DummyEnvironment for several reasons: I wanted control over the JobId and the number of subt

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-03 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-153305943 Just some remarks: - `DummyEnvironment` seems unnecessary, we already have `StreamMockEnvironment`. I think it could be reused. - In the first version you

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-03 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-153289042 Cool stuff, really! This is very much in line with what I had in mind for a SQL backend. Let me check if I understood everything correct (and see where my u

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-02 Thread uce
Github user uce commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-152948336 Thanks for the great write up! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-01 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-152944197 Good stuff! Will need a day more to look through this, but this is a cool way of doing stateful stream computation :-) --- If your project is set up for it, you can

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-10-31 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-152767697 My last commit (that was meant to solve the problems with failed tasks writing to the db) introduced some issues with the exactly once guarantees. I will look into it tom

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-10-28 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1305#issuecomment-151884357 Wow, a lot of stuff. I will look into it once the release is out. :smiley: --- If your project is set up for it, you can reply to this email and have your reply appear

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-10-27 Thread gyfora
GitHub user gyfora opened a pull request: https://github.com/apache/flink/pull/1305 Out-of-core state backend for JDBC databases Detailed description incoming... You can merge this pull request into a Git repository by running: $ git pull https://github.com/gyfora/flink master