[jira] [Created] (FLINK-12789) Fix java docs in UserDefinedAggregateFunction

2019-06-09 Thread Hequn Cheng (JIRA)
Hequn Cheng created FLINK-12789:
---

 Summary: Fix java docs in UserDefinedAggregateFunction
 Key: FLINK-12789
 URL: https://issues.apache.org/jira/browse/FLINK-12789
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Reporter: Hequn Cheng
Assignee: Hequn Cheng


We use {{UserDefinedAggregateFunction}} as the base class for 
{{TableAggregateFunction}} and {{AggregateFunction}}. However, the Java docs 
in {{UserDefinedAggregateFunction}} are dedicated only to 
{{AggregateFunction}}.
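
For context, a trimmed sketch of the hierarchy being described (the method 
sets shown here are illustrative, not the full API surface):

{code:java}
// Shared base class; its Javadocs currently speak only about AggregateFunction.
abstract class UserDefinedAggregateFunction<T, ACC> {
    /** Creates the accumulator that holds the intermediate aggregate state. */
    public abstract ACC createAccumulator();
}

// Returns a single value per group.
abstract class AggregateFunction<T, ACC>
        extends UserDefinedAggregateFunction<T, ACC> {
    public abstract T getValue(ACC accumulator);
}

// May emit multiple rows per group, so the shared Javadocs must cover this
// case as well.
abstract class TableAggregateFunction<T, ACC>
        extends UserDefinedAggregateFunction<T, ACC> {
}
{code}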





[jira] [Created] (FLINK-12788) Add support to run a Python job-specific cluster on Kubernetes

2019-06-09 Thread Dian Fu (JIRA)
Dian Fu created FLINK-12788:
---

 Summary: Add support to run a Python job-specific cluster on 
Kubernetes
 Key: FLINK-12788
 URL: https://issues.apache.org/jira/browse/FLINK-12788
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python, Deployment / Docker
Reporter: Dian Fu
Assignee: Dian Fu


As discussed in FLINK-12541, we need to support running a Python job-specific 
cluster on Kubernetes. To support this, we need to improve the job-specific 
Docker image build scripts to support Python Table API jobs.





[jira] [Created] (FLINK-12787) Allow to specify directory in option -pyfs

2019-06-09 Thread Dian Fu (JIRA)
Dian Fu created FLINK-12787:
---

 Summary: Allow to specify directory in option -pyfs
 Key: FLINK-12787
 URL: https://issues.apache.org/jira/browse/FLINK-12787
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Dian Fu
Assignee: Dian Fu


Currently, only files can be specified via the option `-pyfs`. We want to 
improve it to also allow specifying directories via the option `-pyfs`.





Re: [DISCUSS] Support Local Aggregation in Flink

2019-06-09 Thread vino yang
Hi all,

Thanks for all the feedback and comments.

Since the discussion thread for this feature has been open on the dev
mailing list for about a week and has received much support from the
community, I have created a new JIRA feature issue [1] to track it, and I
will split it into subtasks soon.

We can move further discussion to the related issues.

Suggestions and opinions are still welcome and appreciated.

Best,
Vino

[1]: https://issues.apache.org/jira/browse/FLINK-12786

mayo zhang <18124766...@163.com> wrote on Fri, Jun 7, 2019 at 5:25 PM:

>  +1 for this feature, which is greatly helpful in production situations;
> looking forward to seeing it as soon as possible.
>
> > On Jun 6, 2019, at 4:58 PM, qianjin Xu wrote:
> >
> >>
> >> hi
> >
> > +1 nice work
> >
> > best
> >
> > forwardxu
>
>
>


[jira] [Created] (FLINK-12786) Implement local aggregation in Flink

2019-06-09 Thread vinoyang (JIRA)
vinoyang created FLINK-12786:


 Summary: Implement local aggregation in Flink
 Key: FLINK-12786
 URL: https://issues.apache.org/jira/browse/FLINK-12786
 Project: Flink
  Issue Type: New Feature
  Components: API / DataStream
Reporter: vinoyang
Assignee: vinoyang


Currently, keyed streams are widely used to perform aggregating operations 
(e.g., reduce, sum, and window) on the elements that have the same key. When 
executed at runtime, the elements with the same key are sent to and 
aggregated by the same task.

The performance of these aggregating operations is very sensitive to the 
distribution of keys. In cases where the distribution of keys follows a 
power law, performance degrades significantly. Worse still, increasing the 
degree of parallelism does not help when a task is overloaded by a single 
key.

Local aggregation is a widely adopted method to mitigate the performance 
degradation caused by data skew. We can decompose the aggregating operations 
into two phases. In the first phase, we aggregate the elements of the same 
key at the sender side to obtain partial results. In the second phase, these 
partial results are sent to receivers according to their keys and are 
combined to obtain the final result. Since the number of partial results 
received by each receiver is limited by the number of senders, the imbalance 
among receivers is reduced. Besides, by reducing the amount of transferred 
data, performance can be further improved.
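
For illustration, the two phases can already be hand-rolled on top of the 
DataStream API. Below is a minimal, simplified sketch (the class names, the 
flush threshold, and the {{Tuple2<String, Long>}} stream type are 
assumptions for the example; the proposed feature would provide this 
pre-aggregation out of the box, with proper checkpoint integration):

{code:java}
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.util.Collector;

public class LocalAggregationSketch {

    // Phase 1: partial aggregation at the sender side (per parallel subtask).
    public static class LocalSum
            extends RichFlatMapFunction<Tuple2<String, Long>, Tuple2<String, Long>> {

        private transient Map<String, Long> buffer;
        private transient int seen;

        @Override
        public void open(Configuration parameters) {
            buffer = new HashMap<>();
        }

        @Override
        public void flatMap(Tuple2<String, Long> value, Collector<Tuple2<String, Long>> out) {
            buffer.merge(value.f0, value.f1, Long::sum);
            // Emit partial results every 1000 inputs. A real implementation
            // would also flush on checkpoints and at end of input.
            if (++seen >= 1000) {
                for (Map.Entry<String, Long> e : buffer.entrySet()) {
                    out.collect(Tuple2.of(e.getKey(), e.getValue()));
                }
                buffer.clear();
                seen = 0;
            }
        }
    }

    // Phase 2: final aggregation keyed by the original key. Each receiver now
    // combines at most (number of senders) partial results per flush interval,
    // regardless of how skewed the keys are.
    public static DataStream<Tuple2<String, Long>> locallyAggregatedSum(
            DataStream<Tuple2<String, Long>> input) {
        return input.flatMap(new LocalSum()).keyBy(t -> t.f0).sum(1);
    }
}
{code}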

The design documentation is here: 
[https://docs.google.com/document/d/1gizbbFPVtkPZPRS8AIuH8596BmgkfEa7NRwR6n3pQes/edit?usp=sharing]

The discussion thread is here: 
[http://mail-archives.apache.org/mod_mbox/flink-dev/201906.mbox/%3CCAA_=o7dvtv8zjcxknxyoyy7y_ktvgexrvb4zhxjwzuhsulz...@mail.gmail.com%3E]


[jira] [Created] (FLINK-12785) RocksDB savepoint recovery can use a lot of unmanaged memory

2019-06-09 Thread Mike Kaplinskiy (JIRA)
Mike Kaplinskiy created FLINK-12785:
---

 Summary: RocksDB savepoint recovery can use a lot of unmanaged 
memory
 Key: FLINK-12785
 URL: https://issues.apache.org/jira/browse/FLINK-12785
 Project: Flink
  Issue Type: Bug
  Components: Runtime / State Backends
Reporter: Mike Kaplinskiy


I'm running an application that's backfilling data from Kafka. There's 
approximately 3 years' worth of data, with a lot of watermark skew (i.e., new 
partitions were created over time), and I'm using daily windows. This makes a 
lot of the windows buffer their contents before the watermark catches up to 
"release" them. In turn, this gives me a lot of in-flight windows (200-300) 
with very large state keys in RocksDB (on the order of 40-50mb per key).

Running the pipeline tends to be mostly fine - it's not terribly fast when 
appends happen, but everything works. The problem comes when doing a savepoint 
restore - specifically, the taskmanagers eat RAM until the kernel kills them 
for being out of memory. The extra memory isn't JVM heap, since the memory 
usage of the process is ~4x the -Xmx value and there aren't any 
{{OutOfMemoryError}} exceptions.

I traced the culprit of the memory growth to 
[RocksDBFullRestoreOperation.java#L212|https://github.com/apache/flink/blob/68910fa5381c8804ddbde3087a2481911ebd6d85/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/restore/RocksDBFullRestoreOperation.java#L212]. 
Specifically, while the keys/values are deserialized on the Java heap, 
{{RocksDBWriteBatchWrapper}} forwards them to RocksDB's {{WriteBatch}}, which 
buffers in unmanaged memory. That's not in itself an issue, but 
{{RocksDBWriteBatchWrapper}} flushes based only on the number of records - not 
the number of bytes in flight. Specifically, {{RocksDBWriteBatchWrapper}} 
flushes only once it has 500 records, and at 40mb per key, that's at least 
20Gb of unmanaged memory before a flush.

My suggestion would be to add an additional flush criterion to 
{{RocksDBWriteBatchWrapper}} - one based on {{batch.getDataSize()}} (e.g. 500 
records or 5mb buffered, whichever comes first). This way, writes of large 
keys would be flushed to RocksDB immediately, during recovery as well as 
normal writes. I applied this approach and was able to complete a savepoint 
restore for my job. That said, I'm not entirely sure what else this change 
would impact, since I'm not very familiar with Flink.
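
To illustrate, a minimal sketch of the dual flush criterion, assuming the 
RocksDB Java API ({{WriteBatch#getDataSize()}} reports the buffered bytes); 
the wrapper name and the thresholds are placeholders, not the actual Flink 
class:

{code:java}
import org.rocksdb.ColumnFamilyHandle;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.WriteBatch;
import org.rocksdb.WriteOptions;

class SizeAwareWriteBatchWrapper implements AutoCloseable {
    private static final int MAX_RECORDS = 500;          // existing count-based criterion
    private static final long MAX_BATCH_BYTES = 5 << 20; // proposed 5mb size-based criterion

    private final RocksDB db;
    private final WriteOptions options = new WriteOptions().setDisableWAL(true);
    private final WriteBatch batch = new WriteBatch();
    private int records;

    SizeAwareWriteBatchWrapper(RocksDB db) {
        this.db = db;
    }

    void put(ColumnFamilyHandle handle, byte[] key, byte[] value) throws RocksDBException {
        batch.put(handle, key, value);
        // Flush on record count OR buffered bytes, so a run of very large
        // keys cannot pile up gigabytes of unmanaged memory before a flush.
        if (++records >= MAX_RECORDS || batch.getDataSize() >= MAX_BATCH_BYTES) {
            flush();
        }
    }

    void flush() throws RocksDBException {
        db.write(options, batch);
        batch.clear();
        records = 0;
    }

    @Override
    public void close() throws RocksDBException {
        if (records > 0) {
            flush();
        }
        batch.close();
        options.close();
    }
}
{code}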





I want to contribute to Apache Flink

2019-06-09 Thread ruiyin chen
Hi, I want to contribute to Apache Flink. Would you please give me
contributor permissions? My JIRA ID is nanoleak.


Re: [DISCUSS] FLIP-33: Standardize connector metrics

2019-06-09 Thread Becket Qin
Hi Piotr,

Thanks for the comments. Yes, you are right. Users will have to look at
other metrics to decide whether the pipeline is healthy or not in the first
place before they can use the time-based metric to fix the bottleneck.

I agree that once we have FLIP-27 ready, some of the metrics can just be
reported by the abstract implementation.

I've updated the FLIP-33 wiki page to add the pendingBytes and
pendingRecords metrics. Please let me know if you have any concerns over the
updated metric convention proposal.
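
To make the convention concrete, here is a minimal sketch of how a source
implementation might expose the two metrics through Flink's metric API (the
abstract class and the backlog accessors below are illustrative assumptions,
not part of the FLIP):

import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Gauge;
import org.apache.flink.metrics.MetricGroup;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;

public abstract class BacklogReportingSource<T> extends RichSourceFunction<T> {

    @Override
    public void open(Configuration parameters) {
        MetricGroup group = getRuntimeContext().getMetricGroup();
        // pendingRecords: records available in the external system that the
        // source has not yet emitted.
        group.gauge("pendingRecords", (Gauge<Long>) this::getPendingRecords);
        // pendingBytes: the same backlog in bytes, where the external system
        // can report it.
        group.gauge("pendingBytes", (Gauge<Long>) this::getPendingBytes);
    }

    protected abstract long getPendingRecords();

    protected abstract long getPendingBytes();
}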

@Chesnay Schepler @Stephan Ewen, will you also have time to take a look at
the proposed metric convention? If there is no further concern, I'll start a
voting thread for this FLIP in two days.

Thanks,

Jiangjie (Becket) Qin



On Wed, Jun 5, 2019 at 6:54 PM Piotr Nowojski  wrote:

> Hi Becket,
>
> Thanks for the answer :)
>
> > By time-based metric, I meant the portion of time spent on producing the
> > record to downstream. For example, a source connector can report that
> > it's spending 80% of its time emitting records to the downstream
> > processing pipeline. In another case, a sink connector may report that
> > it's spending 30% of its time producing records to the external system.
> >
> > This is in some sense equivalent to the buffer usage metric:
>
> >   - 80% of time spent on emitting records to downstream ---> downstream
> > node is bottleneck ---> output buffer is probably full.
> >   - 30% of time spent on emitting records to downstream ---> downstream
> > node is not bottleneck ---> output buffer is probably not full.
>
> If by “time spent on emitting records to downstream” you understand
> “waiting on back pressure”, then I see your point. And I agree that some
> kind of ratio/time-based metric gives you more information. However, the
> following (extreme) situation might be hidden under “time spent on
> emitting records to downstream”:
>
> 1. The job is barely able to handle the influx of records: there is 99%
> CPU/resource usage in the cluster, but nobody is
> bottlenecked/backpressured, all output buffers are empty, and everybody
> spends 1% of its time waiting for more records to process.
> 2. 80% of the time can still be spent on “downstream operators”, because
> they are the CPU-intensive operations, but this doesn’t mean that
> increasing the parallelism down the stream will help with anything there.
> On the contrary, increasing the parallelism of the source operator might
> help to increase resource utilisation up to 100%.
>
> However, this “time-based/ratio” approach can be extended to input/output
> buffer usage. Besides collecting the information that the input/output
> buffer is full/empty, we can profile how often buffers are empty/full. If
> the output buffer is full 1% of the time, there is almost no back
> pressure. If it’s full 80% of the time, there is some back pressure, and
> if it’s full 99.9% of the time, there is huge back pressure.
>
> Now for autoscaling you could compare the input & output buffer fill
> ratios:
>
> 1. Both are high: the source of the bottleneck is down the stream.
> 2. Output is low, input is high: this operator/task is the bottleneck, and
> the higher the difference, the bigger a bottleneck it is.
> 3. Output is high, input is low: there was some load spike that we are
> currently finishing processing.
>
>
>
> But long story short, we are probably diverging from the topic of this
> discussion, and we can discuss this at some later point.
>
> For now, for sources:
>
> as I wrote before, +1 for:
>  - pending.bytes, Gauge
>  - pending.messages, Gauge
>
> When we are developing/discussing the SourceReader from FLIP-27, we might
> then add:
>
> - in-memory.buffer.usage (0 - 100%)
>
> which would be estimated automatically by Flink, while the user would be
> able to override it or provide a better estimate.
>
> Piotrek
>
> > On 5 Jun 2019, at 05:42, Becket Qin  wrote:
> >
> > Hi Piotr,
> >
> > Thanks for the explanation. Please see some clarifications below.
> >
> > By time-based metric, I meant the portion of time spent on producing the
> > record to downstream. For example, a source connector can report that
> > it's spending 80% of its time emitting records to the downstream
> > processing pipeline. In another case, a sink connector may report that
> > it's spending 30% of its time producing records to the external system.
> >
> > This is in some sense equivalent to the buffer usage metric:
> >   - 80% of time spent on emitting records to downstream ---> downstream
> > node is bottleneck ---> output buffer is probably full.
> >   - 30% of time spent on emitting records to downstream ---> downstream
> > node is not bottleneck ---> output buffer is probably not full.
> >
> > However, the time-based metric has a few advantages that the buffer usage
> > metric may not have.
> >
> > 1. The buffer usage metric may not be applicable to all connector
> > implementations, while reporting a time-based metric is always doable.
> > Some source connectors may not have any input buffer, or they may use
> > some third 

Re: [DISCUSS] Releasing Flink 1.8.1

2019-06-09 Thread jincheng sun
Hi all,
I am here to quickly update the progress of the issues that need to be
tracked (going well):

[Blocker]
- FLINK-12297: work by @Aitozi, being reviewed by @Aljoscha Krettek!
- FLINK-11107: work by @Myasuka, being reviewed by @Tzu-Li (Gordon) Tai

[Critical]
- FLINK-11059: work by @shuai-xu, being reviewed by @Till Rohrmann!

[Nice to have]
- FLINK-10455: work by @becketqin, needs a volunteer to review the PR

The detail can be found here:
https://docs.google.com/document/d/1858C7HdyDPIxxm2Rvnu4bYahq0Tr9NaY1mj7BzQi_0w/edit?usp=sharing

Great thanks to all of you for the help(fix or review) in promoting the
1.8.1 release. Thank you!!!

I hope to prepare the first RC of release 1.8.1 on Thursday, and
FLINK-12297, FLINK-11107, and FLINK-11059 should be merged before RC1.
If the relevant PRs can't be merged, please let me know, and we will put
more energy into solving them! :)

Best,
Jincheng

jincheng sun wrote on Wed, Jun 5, 2019 at 5:33 PM:

> I am here to quickly update the progress of the issues that need to be
> tracked (going well):
>
> [Blocker]
> - FLINK-12296 [done]
> - FLINK-11987 [done]
> - FLINK-12297: being reviewed by @Aljoscha Krettek!
> - FLINK-11107: being reviewed by @Tzu-Li (Gordon) Tai
>
> [Critical]
> - FLINK-10455: will open the PR soon, great job @Jiangjie Qin!
> - FLINK-11059: being reviewed by @Till Rohrmann!
>
> [Nice to have]
> - FLINK-12544 [done]
>
> The detail can be found here:
>
> https://docs.google.com/document/d/1858C7HdyDPIxxm2Rvnu4bYahq0Tr9NaY1mj7BzQi_0w/edit?usp=sharing
>
> Great thanks to all of you for the help(fix or review) in promoting the
> 1.8.1 release. Thank you!!!
>
> BTW: That's great if we can fix all of those issues, and prepare the first
> RC of release 1.8.1 next week. :)
>
> Best,
> Jincheng
>
> jincheng sun wrote on Mon, Jun 3, 2019 at 1:22 PM:
>
>> I am here to quickly update the progress of the issues that need to be
>> tracked:
>>
>> [Blocker]
>> - FLINK-12296 [done]
>> - FLINK-11987 [done]
>> - FLINK-12297: @Aljoscha Krettek will review the PR!
>> - FLINK-11107: @Tzu-Li (Gordon) Tai will help to review! (This is a new
>> capture)
>>
>> [Critical]
>> - FLINK-10455: @Jiangjie Qin will help to take the ticket!
>> - FLINK-11059: being reviewed by @Till Rohrmann!
>>
>> [Nice to have]
>> - FLINK-12544: being reviewed by @Piotr Nowojski
>>
>> The detail can be found here:
>>
>> https://docs.google.com/document/d/1858C7HdyDPIxxm2Rvnu4bYahq0Tr9NaY1mj7BzQi_0w/edit?usp=sharing
>>
>> Great thanks to all of you for the help(fix or review) in promoting the
>> 1.8.1 release. Thank you!!!
>>
>> BTW: That's great if we can fix all of those issues, and prepare the
>> first RC of release 1.8.1 next week. :)
>>
>> Best,
>> Jincheng
>>
>>
>> Jark Wu wrote on Thu, May 30, 2019 at 9:49 PM:
>>
>>> Hi Jincheng,
>>>
>>> Thanks for coordinating the work to release 1.8.1.
>>>
>>> +1 for 1.8.1
>>>
>>> On Wed, 29 May 2019 at 19:48, Hequn Cheng  wrote:
>>>
>>> > Hi Jincheng,
>>> >
>>> > Thanks for putting these together with a nice document.
>>> > +1 to release 1.8.1. I think it would be nice if we can have a new
>>> > release with so many fixes.
>>> >
>>> > Best, Hequn
>>> >
>>> > On Wed, May 29, 2019 at 5:25 PM jincheng sun wrote:
>>> >
>>> > > Hi all,
>>> > > Thank you for your support of the release of 1.8.1.
>>> > >
>>> > > @Till Rohrmann Thank you very much for your help reviewing
>>> > > FLINK-11059!
>>> > > @Zhijiang Thank you for reporting the very important bug fix. I'll
>>> > > add it to the tracking list!
>>> > > @Tzu-Li (Gordon) Tai Great thanks for kindly helping with the
>>> > > final stage of the Flink 1.8.1 release!
>>> > > 

Join to Flink Community

2019-06-09 Thread Jasper Yue
Hi Guys,

I want to contribute to Apache Flink.

Would you please give me contributor permissions?

My JIRA ID is yuetongshu


Regards,
Jasper Yue


[jira] [Created] (FLINK-12784) Support retention policy for InfluxDB metrics reporter

2019-06-09 Thread Mans Singh (JIRA)
Mans Singh created FLINK-12784:
--

 Summary: Support retention policy for InfluxDB metrics reporter
 Key: FLINK-12784
 URL: https://issues.apache.org/jira/browse/FLINK-12784
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Affects Versions: 1.8.0
Reporter: Mans Singh
Assignee: Mans Singh


The InfluxDB metrics reporter uses the default retention policy when saving 
metrics to InfluxDB. This enhancement will allow the user to specify a 
retention policy.
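
For reference, a minimal sketch of where the retention policy would plug in, 
assuming the influxdb-java client that the reporter builds on; the database 
name, policy name, and measurement below are placeholders:

{code:java}
import java.util.concurrent.TimeUnit;

import org.influxdb.InfluxDB;
import org.influxdb.InfluxDBFactory;
import org.influxdb.dto.Point;

public class RetentionPolicyWrite {
    public static void main(String[] args) {
        InfluxDB influxDB = InfluxDBFactory.connect("http://localhost:8086", "admin", "admin");
        Point point = Point.measurement("taskmanager_Status_JVM_Memory_Heap_Used")
                .time(System.currentTimeMillis(), TimeUnit.MILLISECONDS)
                .addField("value", 123456789L)
                .build();
        // Today the reporter effectively writes with the database's default
        // retention policy; the enhancement would let the user configure the
        // policy name passed here (the policy must already exist in InfluxDB).
        influxDB.write("flink", "one_week_policy", point);
        influxDB.close();
    }
}
{code}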


