Re: Dataset and select/split functionality

2017-03-03 Thread CPC
Hi Fabian, Thank you for your explanation. Also can you give an example on how the optimizer behaves on the assumption that the outputs of a function are replicated? Thank you... On 3 March 2017 at 13:52, Fabian Hueske <fhue...@gmail.com> wrote: > Hi CPC, > > we had s

Dataset and select/split functionality

2017-03-02 Thread CPC
Hi all, We will try to implement select/split functionality for batch api. We looked at streaming side and understand how it works but since streaming side does not include an optimizer it was easier. Since adding such a runtime operator will affect optimizer layer as well, is there a part that

Re: Dataset and eager scheduling

2017-03-02 Thread CPC
Hi Till, Thank you. On 2 March 2017 at 18:13, Till Rohrmann <trohrm...@apache.org> wrote: > Hi CPC, > > I think that the optimizer does not take the scheduling mode into account > when optimizing a Flink job. > > Cheers, > Till > > On Thu, Mar 2, 2017 at

Dataset and eager scheduling

2017-03-02 Thread CPC
Hi all, Currently our team trying implement a runtime operator also playing with scheduler. We are trying to understand batch optimizer but it will take some time. What we want to know is whether changing batch scheduling mode from LAZY_FROM_SOURCES to EAGER could affect optimizer? I mean whether

Re: [Discuss] FLIP-13 Side Outputs in Flink

2016-10-26 Thread CPC
Is it just related to stream api? This feature could be really useful for etl scenarios with dataset api as well. On Oct 26, 2016 22:29, "Fabian Hueske" wrote: > Hi Chen, > > thanks for this interesting proposal. I think side output would be a very > valuable feature to have!

release 1.1 rc2 confusion

2016-08-07 Thread CPC
Hi, On flink downloads page it says that the latest version is 1.1 and have links for 1.1 binaries but hasnt rc2 voted recently? Are those binaries 1.1 or rc2?

Re: DagConnection tempMode vs breakPipeline property

2016-07-04 Thread CPC
with the optimizer, but I would expect everything that goes back to > the DataExchangeMode to be correct. The rest should be an artifact of > the old pipeline breaker logic and not be used any more. Does this > help in any way? > > On Thu, Jun 16, 2016 at 4:46 PM, CPC <acha...@gmail

Re: offheap memory allocation and memory leak bug

2016-06-20 Thread CPC
s, this should actually improve the runtime performance at the > cost of a slightly longer start-up time for your TaskManagers. > > Cheers, > Till > ​ > > On Sun, Jun 19, 2016 at 6:16 PM, CPC <acha...@gmail.com> wrote: > > > Hi, > > > > I think i found so

Re: offheap memory allocation and memory leak bug

2016-06-19 Thread CPC
ct-when-JNA-not-available-td6977711.html ). I think currently only way to limit memory usage in flink if you want to use same taskmanager across jobs is via "taskmanager.memory.preallocate: true". Since it allocate memory at the beginning and not freed its memory usage stays constant. PS:

Re: offheap memory allocation and memory leak bug

2016-06-18 Thread CPC
skManager --configDir > /home/capacman/Data/programlama/flink-1.0.3/conf > but memory usage reach up to 13Gb. Could somebodey explain me why memory usage is so high? I expect it to be at most 8GB with some jvm internal overhead. [image: Inline images 1] [image: Inline images 2] On 17 June 2

offheap memory allocation and memory leak bug

2016-06-17 Thread CPC
Hi, I am making some test about offheap memory usage and encounter an odd behavior. My taskmanager heap limit is 12288 Mb and when i set "taskmanager.memory.off-hep:true" for every job it allocates 11673 Mb off heap area at most which is heapsize*0.95(value of taskmanager.memory.fraction). But

DagConnection tempMode vs breakPipeline property

2016-06-16 Thread CPC
Hi everybody, I am trying to understand flink optimizer in DataSet api. But i dont understant tempMode and breakPipeline properties. TempMode enum also has a breakPipeline property but they both accessed and set in different places. DagConnection.breakPipeline is used to select network io

Re: incremental Checkpointing , Rocksdb HA

2016-06-09 Thread CPC
Cassandra backend would be interesting especially if flink could benefit from cassandra data locality. Cassandra/spark integration is using this for information to schedule spark tasks. On 9 June 2016 at 19:55, Nick Dimiduk wrote: > You might also consider support for a

Re: DataStream split/select behaviour

2016-06-07 Thread CPC
Data/wiki/14533") env.execute() } } On 7 June 2016 at 19:26, CPC <acha...@gmail.com> wrote: > Hello everyone, > > When i use DataStream split/select,it always send all selected records to > same taskmanager. Is there any reason for this behaviour? Also is it > p

DataStream split/select behaviour

2016-06-07 Thread CPC
Hello everyone, When i use DataStream split/select,it always send all selected records to same taskmanager. Is there any reason for this behaviour? Also is it possible to implement same split/select behaviour for DataSet api(without using a different filter for every output )? I found this

Re: Dataset split/demultiplex

2016-05-12 Thread CPC
= ... > val split1: DataSet[A] = xs.filter(f1) > val split2: DataSet[A] = xs.filter(f2) > > where f1 and f2 are true for those elements that should go into the > first and second DataSets respectively. So far, the splits will just > contain elements from the input DataSet, but you can

Re: Data locality and scheduler

2016-04-26 Thread CPC
will be remotely read. > > Best, Fabian > > > 2016-04-26 16:59 GMT+02:00 CPC <acha...@gmail.com>: > > > Hi, > > > > I look at some scheduler documentations but could not find answer to my > > question. My question is: suppose that i have a big file on 4

Data locality and scheduler

2016-04-26 Thread CPC
Hi, I look at some scheduler documentations but could not find answer to my question. My question is: suppose that i have a big file on 40 node hadoop cluster and since it is a big file every node has at least one chunk of the file. If i write a flink job and want to filter file and if job has

Re: Flink optimizer optimizations

2016-04-17 Thread CPC
Himmm i understand now. Thank you guys:) On Apr 16, 2016 2:21 PM, "Matthias J. Sax" wrote: > Sure. WITHOUT. > > Thanks. Good catch :) > > On 04/16/2016 01:18 PM, Ufuk Celebi wrote: > > On Sat, Apr 16, 2016 at 1:05 PM, Matthias J. Sax > wrote: > >> (with the