Re: [ANNOUNCE] Apache Flink 1.11.0 released
Nice news. Congrats!

Leonard Xu wrote:
> Congratulations! Thanks Zhijiang and Piotr for the great work, and thanks everyone involved!
> Best, Leonard Xu
Compress DataSink Output
Hello - Forgive me if this has been asked before, but I'm trying to determine the best way to add compression to DataSink Outputs (starting with TextOutputFormat). Realistically I would like each partition file (based on parallelism) to be compressed independently with gzip, but am open to other solutions. My first thought was to extend TextOutputFormat with a new class that compresses after closing and before returning, but I'm not sure that would work across all possible file systems (S3, Local, and HDFS). Any thoughts? Thanks! Wes
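For what it's worth, here is a minimal, untested sketch of the extension idea (the class name GzipTextOutputFormat is mine, assuming the DataSet API's FileOutputFormat). Instead of compressing after close, it wraps the already-opened output stream in a GZIPOutputStream, which should behave the same on S3, local, and HDFS since it never touches the file system directly:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;
import org.apache.flink.api.common.io.FileOutputFormat;
import org.apache.flink.core.fs.Path;

// Sketch: gzip each parallel partition file independently by wrapping
// the stream that FileOutputFormat opens for this subtask.
public class GzipTextOutputFormat extends FileOutputFormat<String> {

    private transient GZIPOutputStream gzipStream;

    public GzipTextOutputFormat(Path outputPath) {
        super(outputPath);
    }

    @Override
    public void open(int taskNumber, int numTasks) throws IOException {
        super.open(taskNumber, numTasks);   // opens this.stream on whatever FS backs the path
        gzipStream = new GZIPOutputStream(this.stream);
    }

    @Override
    public void writeRecord(String record) throws IOException {
        gzipStream.write(record.getBytes(StandardCharsets.UTF_8));
        gzipStream.write('\n');
    }

    @Override
    public void close() throws IOException {
        if (gzipStream != null) {
            gzipStream.finish();            // write the gzip trailer before the file is closed
        }
        super.close();
    }
}
```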
Re: Compress DataSink Output
That looks good. Thanks!

On Fri, Aug 19, 2016 at 6:15 AM Robert Metzger wrote:
> Hi Wes,
>
> Flink's own OutputFormats don't support compression, but we have some
> tools to use Hadoop's OutputFormats with Flink [1], and those support
> compression:
> https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html
>
> Let me know if you need more information.
>
> Regards,
> Robert
>
> [1]: https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/hadoop_compatibility.html
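To make Robert's pointer concrete, here is a hedged sketch of wiring Hadoop's TextOutputFormat with gzip compression into a Flink batch job via the Hadoop compatibility layer (the output path and the Tuple2 conversion are illustrative; it needs the flink-hadoop-compatibility dependency on the classpath):

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class HadoopGzipSink {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<String> lines = env.fromElements("first line", "second line");

        // Hadoop OutputFormats expect key/value pairs, so wrap each line.
        DataSet<Tuple2<NullWritable, Text>> kv = lines.map(
            new MapFunction<String, Tuple2<NullWritable, Text>>() {
                @Override
                public Tuple2<NullWritable, Text> map(String line) {
                    return new Tuple2<>(NullWritable.get(), new Text(line));
                }
            });

        // Configure gzip on the Hadoop side; each parallel partition
        // file is then compressed independently.
        Job job = Job.getInstance();
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
        FileOutputFormat.setOutputPath(job, new Path("hdfs:///tmp/output")); // illustrative path

        kv.output(new HadoopOutputFormat<>(new TextOutputFormat<NullWritable, Text>(), job));
        env.execute("gzip sink example");
    }
}
```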
Re: flink sql row_number() over () OOM
Hi,

On 2019/9/4 19:30, liu ze wrote:
> I use the row_number() over() function to do topN. The total amount of data is 60,000, and the state is 12 GB. Finally, it OOMs. Is there any way to optimize it?

ref: https://stackoverflow.com/questions/50812837/flink-taskmanager-out-of-memory-and-memory-configuration

The total amount of required physical and heap memory is quite difficult to compute, since it strongly depends on your user code, your job's topology, and which state backend you use. As a rule of thumb, if you experience OOM and are still using the FileSystemStateBackend or the MemoryStateBackend, you should switch to the RocksDBStateBackend, because it can gracefully spill to disk if the state grows too big.

If you are still experiencing OOM exceptions as described, then you should check whether your user code keeps references to state objects or generates large objects in some other way that cannot be garbage collected. If so, try to refactor your code to rely on Flink's state abstraction, because with RocksDB the state can go out of core.

RocksDB itself needs native memory, which adds to Flink's memory footprint. How much depends on the block cache size, indexes, bloom filters, and memtables; the RocksDB documentation covers these settings and how to configure them.

Last but not least, you should not activate taskmanager.memory.preallocate when running streaming jobs, because streaming jobs currently don't use managed memory. By activating preallocation, you would allocate memory for Flink's managed memory, which reduces the available heap space.
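As a concrete illustration of the state backend switch suggested above — a minimal sketch, assuming a Flink 1.9-era streaming job, the flink-statebackend-rocksdb dependency, and a hypothetical HDFS checkpoint path:

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDbBackendExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Keep working state in RocksDB so it can spill to local disk instead
        // of filling the heap; checkpoints go to the given URI (path is illustrative).
        // The second argument enables incremental checkpoints, which helps with large state.
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints", true));

        // ... define the topN job here, then call env.execute() ...
    }
}
```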
Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC
On 2019/9/6 8:55 PM, Fabian Hueske wrote:
> I'm very happy to announce that Kostas Kloudas is joining the Flink PMC. Kostas has been contributing to Flink for many years and puts lots of effort into helping our users and growing the Flink community. Please join me in congratulating Kostas!

Congratulations, Kostas!

regards.
Re: [flink-1.9] how to read local json file through Flink SQL
On 2019/9/8 5:40 PM, Anyang Hu wrote:
> In Flink 1.9, is there a way to read a local JSON file in Flink SQL, like the reading of a CSV file?

Hi, this thread might help you:
http://mail-archives.apache.org/mod_mbox/flink-dev/201604.mbox/%3cCAK+0a_o5=c1_p3sylrhtznqbhplexpb7jg_oq-sptre2neo...@mail.gmail.com%3e

regards.
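In case it helps, one workaround sketch (my own, not from the linked thread): Flink 1.9 ships no built-in JSON filesystem table source, but you can read the file as text, parse each line with Jackson, and register the result as a table for SQL. The input path, field names, and schema below are hypothetical:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class LocalJsonTable {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // Assumes one JSON object per line, e.g. {"name":"bob","age":30}
        DataStream<Tuple2<String, Integer>> rows = env
            .readTextFile("file:///tmp/input.json") // hypothetical path
            .map(new MapFunction<String, Tuple2<String, Integer>>() {
                private transient ObjectMapper mapper;
                @Override
                public Tuple2<String, Integer> map(String line) throws Exception {
                    if (mapper == null) {
                        mapper = new ObjectMapper();
                    }
                    JsonNode node = mapper.readTree(line);
                    return Tuple2.of(node.get("name").asText(), node.get("age").asInt());
                }
            });

        // Register the parsed stream as a table and query it with SQL.
        tEnv.registerDataStream("people", rows, "name, age");
        Table result = tEnv.sqlQuery("SELECT name, age FROM people WHERE age > 20");
        tEnv.toAppendStream(result, Row.class).print();

        env.execute("local json via sql");
    }
}
```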
Re: [DISCUSS] Drop older versions of Kafka Connectors (0.9, 0.10) for Flink 1.10
On 2019/9/11 16:17, Stephan Ewen wrote:
> We still maintain connectors for Kafka 0.8 and 0.9 in Flink. I would suggest dropping those with Flink 1.10 and supporting only Kafka 0.10 onwards. Are there any concerns about this, or still a significant number of users of these versions?

But we still have a large-scale Kafka 0.9 deployment in production. Do you mean that upcoming Flink releases won't support Kafka 0.9? I understand the reasoning, but it's still a pity for us. :)

regards.
Re: [ANNOUNCE] Zili Chen becomes a Flink committer
Hi,

On 2019/9/11 17:22, Till Rohrmann wrote:
> I'm very happy to announce that Zili Chen (some of you might also know him as Tison Kun) accepted the offer of the Flink PMC to become a committer of the Flink project.

Congratulations, Zili Chen!

regards.