Hi Fabian,
Thank you for your explanation. Also, could you give an example of how the
optimizer behaves under the assumption that the outputs of a function are
replicated?
Thank you...
On 3 March 2017 at 13:52, Fabian Hueske <fhue...@gmail.com> wrote:
> Hi CPC,
>
> we had s
Hi all,
We are going to try to implement select/split functionality for the batch API.
We looked at the streaming side and understand how it works, but it was easier
there since the streaming side does not include an optimizer. Since adding such
a runtime operator will affect the optimizer layer as well, is there a part that
Hi Till,
Thank you.
On 2 March 2017 at 18:13, Till Rohrmann <trohrm...@apache.org> wrote:
> Hi CPC,
>
> I think that the optimizer does not take the scheduling mode into account
> when optimizing a Flink job.
>
> Cheers,
> Till
>
> On Thu, Mar 2, 2017 at
Hi all,
Currently our team is trying to implement a runtime operator and also playing
with the scheduler. We are trying to understand the batch optimizer, but that
will take some time. What we want to know is whether changing the batch
scheduling mode from LAZY_FROM_SOURCES to EAGER could affect the optimizer? I mean whether
Is it just related to the streaming API? This feature could be really useful
for ETL scenarios with the DataSet API as well.
On Oct 26, 2016 22:29, "Fabian Hueske" wrote:
> Hi Chen,
>
> thanks for this interesting proposal. I think side output would be a very
> valuable feature to have!
Hi,
On the Flink downloads page it says that the latest version is 1.1 and there
are links for 1.1 binaries, but wasn't RC2 voted on recently? Are those
binaries 1.1 or RC2?
with the optimizer, but I would expect everything that goes back to
> the DataExchangeMode to be correct. The rest should be an artifact of
> the old pipeline breaker logic and not be used any more. Does this
> help in any way?
>
> On Thu, Jun 16, 2016 at 4:46 PM, CPC <acha...@gmail
s, this should actually improve the runtime performance at the
> cost of a slightly longer start-up time for your TaskManagers.
>
> Cheers,
> Till
>
>
> On Sun, Jun 19, 2016 at 6:16 PM, CPC <acha...@gmail.com> wrote:
>
> > Hi,
> >
> > I think i found so
ct-when-JNA-not-available-td6977711.html
).
I think currently the only way to limit memory usage in Flink, if you want to
use the same TaskManager across jobs, is via "taskmanager.memory.preallocate:
true". Since it allocates memory at the beginning and never frees it, memory
usage stays constant.
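For reference, a minimal flink-conf.yaml sketch with the options discussed in this thread (legacy option names from Flink 1.x; the fraction value shown is only illustrative, not taken from this message):

```yaml
# Allocate managed memory once at TaskManager start-up, so that usage
# stays constant across jobs instead of growing and never being freed.
taskmanager.memory.preallocate: true

# Keep managed memory off-heap rather than on the JVM heap.
taskmanager.memory.off-heap: true

# Fraction of memory used for Flink's managed memory (illustrative value).
taskmanager.memory.fraction: 0.7
```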
PS:
skManager --configDir
> /home/capacman/Data/programlama/flink-1.0.3/conf
>
but memory usage reaches up to 13 GB. Could somebody explain to me why memory
usage is so high? I expected it to be at most 8 GB with some JVM-internal
overhead.
[image: Inline images 1]
[image: Inline images 2]
On 17 June 2
Hi,
I am running some tests on off-heap memory usage and encountered an odd
behavior. My TaskManager heap limit is 12288 MB, and when I set
"taskmanager.memory.off-heap: true", for every job it allocates at most
11673 MB of off-heap area, which is heapsize * 0.95 (the value of
taskmanager.memory.fraction). But
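The figure in the message is consistent with fraction-based sizing; a quick check in plain Scala (no Flink dependency, just the numbers reported above):

```scala
object OffHeapSizing {
  def main(args: Array[String]): Unit = {
    val heapLimitMb = 12288          // TaskManager heap limit from the message
    val fraction    = 0.95           // taskmanager.memory.fraction as reported
    // 12288 * 0.95 = 11673.6, truncated to whole megabytes
    val offHeapMb   = (heapLimitMb * fraction).toInt
    println(offHeapMb)               // 11673, matching the observed allocation
  }
}
```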
Hi everybody,
I am trying to understand the Flink optimizer in the DataSet API, but I don't
understand the tempMode and breakPipeline properties. The TempMode enum also
has a breakPipeline property, but the two are accessed and set in different
places. DagConnection.breakPipeline is used to select network IO
A Cassandra backend would be interesting, especially if Flink could benefit
from Cassandra data locality. The Cassandra/Spark integration uses this
information to schedule Spark tasks.
On 9 June 2016 at 19:55, Nick Dimiduk wrote:
> You might also consider support for a
Data/wiki/14533")
env.execute()
}
}
On 7 June 2016 at 19:26, CPC <acha...@gmail.com> wrote:
> Hello everyone,
>
> When i use DataStream split/select,it always send all selected records to
> same taskmanager. Is there any reason for this behaviour? Also is it
> p
Hello everyone,
When I use DataStream split/select, it always sends all selected records to the
same TaskManager. Is there any reason for this behaviour? Also, is it possible
to implement the same split/select behaviour for the DataSet API (without using
a different filter for every output)? I found this
= ...
> val split1: DataSet[A] = xs.filter(f1)
> val split2: DataSet[A] = xs.filter(f2)
>
> where f1 and f2 are true for those elements that should go into the
> first and second DataSets respectively. So far, the splits will just
> contain elements from the input DataSet, but you can
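The two-filter pattern quoted above can be illustrated on a plain Scala collection (no Flink dependency; the predicates f1 and f2 are hypothetical stand-ins for whatever routing conditions a real job would use):

```scala
object SplitByFilter {
  def main(args: Array[String]): Unit = {
    val xs = List(1, 2, 3, 4, 5, 6)

    // Hypothetical predicates standing in for f1 and f2 in the quoted reply.
    val f1: Int => Boolean = _ % 2 == 0   // elements for the first "split"
    val f2: Int => Boolean = _ % 2 != 0   // elements for the second "split"

    // Each split is a separate filter over the same input, mirroring
    // xs.filter(f1) / xs.filter(f2) on a DataSet.
    val split1 = xs.filter(f1)
    val split2 = xs.filter(f2)

    println(split1)  // List(2, 4, 6)
    println(split2)  // List(1, 3, 5)
  }
}
```

Note that with this workaround every predicate is evaluated over the full input once per split, which is exactly the per-output cost the question about the DataSet API is trying to avoid.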
will be remotely read.
>
> Best, Fabian
>
>
> 2016-04-26 16:59 GMT+02:00 CPC <acha...@gmail.com>:
>
> > Hi,
> >
> > I look at some scheduler documentations but could not find answer to my
> > question. My question is: suppose that i have a big file on 4
Hi,
I looked at some scheduler documentation but could not find an answer to my
question. My question is: suppose that I have a big file on a 40-node Hadoop
cluster, and since it is a big file, every node has at least one chunk of the
file. If I write a Flink job and want to filter the file, and if the job has
Hmmm, I understand now. Thank you guys :)
On Apr 16, 2016 2:21 PM, "Matthias J. Sax" wrote:
> Sure. WITHOUT.
>
> Thanks. Good catch :)
>
> On 04/16/2016 01:18 PM, Ufuk Celebi wrote:
> > On Sat, Apr 16, 2016 at 1:05 PM, Matthias J. Sax
> wrote:
> >> (with the