Re: Using FlinkKinesisConsumer through a proxy

2018-12-01 Thread Tzu-Li (Gordon) Tai
Good to hear that it's working, thanks for the update!

On Sat, Dec 1, 2018, 4:29 AM Vijay Balakrishnan wrote:

> Hi Gordon,
> Finally figured out my issue. I do not need to add http:// in the proxyHost name.
> String proxyHost = "proxy-chain...com"; // not http://proxy-chain...com
> kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX +
> "proxyHost", proxyHost); // <== no http:// in the proxyHost name
>
> TIA,
> Vijay
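
For reference, a minimal sketch of the working proxy configuration (hostname,
port, and credentials are placeholders; AWSUtil.AWS_CLIENT_CONFIG_PREFIX is the
flink-connector-kinesis constant used above):

```
Properties kinesisConsumerConfig = new Properties();
// A bare hostname works; prefixing it with "http://" does not.
kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX + "proxyHost", "proxy-chain.example.com");
kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX + "proxyPort", "911");
kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX + "proxyUsername", "user");
kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX + "proxyPassword", "secret");
```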
>
>
> On Wed, Nov 14, 2018 at 12:50 AM Tzu-Li (Gordon) Tai 
> wrote:
>
>> Hi Vijay,
>>
>> I’m pretty sure that this should work with the properties that you
>> provided, unless the AWS Kinesis SDK isn’t working as expected.
>>
>> What I’ve tested is that with those properties, the ClientConfiguration
>> used to build the Kinesis client has the proxy domain / host / ports etc.
>> properly set.
>> And according to [1], this should be enough to configure the constructed
>> Kinesis client to connect via the proxy.
>>
>> Cheers,
>> Gordon
>>
>> [1]
>> https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/section-client-configuration.html
>>
>>
>> On 7 November 2018 at 1:19:02 AM, Vijay Balakrishnan (bvija...@gmail.com)
>> wrote:
>>
>> Hi Gordon,
>> This still didn't work :(
>>
>> Tried a few combinations with:
>> kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX +
>> "proxyDomain", "...");
>>
>> kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX +
>> "proxyHost", "http://.com");
>>
>> kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX +
>> "proxyPort", "911");
>>
>> kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX +
>> "proxyUsername", "...");
>>
>> kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX +
>> "proxyPassword", "..");
>>
>> kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX +
>> "nonProxyHosts", "...");
>>
>>
>> How does the FlinkKinesisProducer work so seamlessly through a proxy?
>> TIA,
>> Vijay
>>
>> On Thu, Oct 4, 2018 at 6:41 AM Tzu-Li (Gordon) Tai 
>> wrote:
>>
>>> Hi,
>>>
>>> Since Flink 1.5, you should be able to set all available configurations
>>> on the ClientConfiguration through the consumer Properties (see FLINK-9188
>>> [1]).
>>>
>>> The way to do that would be to prefix the configuration you want to set
>>> with "aws.clientconfig" and add that to the properties, as such:
>>>
>>> ```
>>> Properties kinesisConsumerProps = new Properties();
>>> kinesisConsumerProps.setProperty("aws.clientconfig.proxyHost", ...);
>>> kinesisConsumerProps.setProperty("aws.clientconfig.proxyPort", ...);
>>> kinesisConsumerProps.setProperty("aws.clientconfig.proxyUsername", ...);
>>> ...
>>> ```
>>>
>>> Could you try that out and see if it works for you?
>>>
>>> I've also realized that this feature isn't documented very well, and
>>> have opened a ticket for that [2].
>>>
>>> Cheers,
>>> Gordon
>>>
>>> [1] https://issues.apache.org/jira/browse/FLINK-9188
>>> [2] https://issues.apache.org/jira/browse/FLINK-10492
>>>
>>> On Thu, Oct 4, 2018, 7:57 PM Aljoscha Krettek 
>>> wrote:
>>>
 Hi,

 I'm looping in Gordon and Thomas, they might have some idea about how
 to resolve this.

 Best,
 Aljoscha

 On 3. Oct 2018, at 17:29, Vijay Balakrishnan 
 wrote:

 I have been trying, to no avail, all variations of: java
 -Dhttp.nonProxyHosts=.. -Dhttps.proxyHost=http://... -Dhttps.proxyPort=911
 -Dhttps.proxyUser= -Dhttps.proxyPassword=.. -Dhttp.proxyHost=http://..
 -Dhttp.proxyPort=911 -Dhttp.proxyUser=... -Dhttp.proxyPassword=... -jar ..,
 after looking at the code in com.amazonaws.ClientConfiguration.
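
As an aside: in the AWS SDK v1, ClientConfiguration appears to fall back to the
standard JVM proxy system properties when no proxy host is set explicitly, and
those values must likewise be bare hostnames. A sketch of the programmatic
equivalent of the flags above (hostname and port are placeholders):

```
// Equivalent to -Dhttps.proxyHost=... etc.; note the bare hostname, no "http://" scheme.
System.setProperty("https.proxyHost", "proxy-chain.example.com");
System.setProperty("https.proxyPort", "911");
System.setProperty("http.proxyHost", "proxy-chain.example.com");
System.setProperty("http.proxyPort", "911");
```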

 On Tue, Oct 2, 2018 at 3:49 PM Vijay Balakrishnan 
 wrote:

> Hi,
> How do I use the FlinkKinesisConsumer with Properties through a proxy?
> I am getting a connection issue through the proxy.
> It works outside the proxy.
>
> Properties kinesisConsumerConfig = new Properties();
> kinesisConsumerConfig.setProperty(AWSConfigConstants.AWS_REGION, region);
>
> if (local) {
>     kinesisConsumerConfig.setProperty(AWSConfigConstants.AWS_ACCESS_KEY_ID, accessKey);
>     kinesisConsumerConfig.setProperty(AWSConfigConstants.AWS_SECRET_ACCESS_KEY, secretKey);
> } else {
>     kinesisConsumerConfig.setProperty(AWSConfigConstants.AWS_CREDENTIALS_PROVIDER, "AUTO");
> }
>
> // only for Consumer
> kinesisConsumerConfig.setProperty(ConsumerConfigConstants.SHARD_GETRECORDS_MAX, "1");
> kinesisConsumerConfig.setProperty(ConsumerConfigConstants.SHARD_GETRECORDS_INTERVAL_MILLIS, "2000");
>
> FlinkKinesisConsumer<Tuple2<String, String>> kinesisConsumer = new FlinkKinesisConsumer<>(
>     "kinesisTopicRead", new Tuple2KinesisSchema(), kinesisConsumerConfig);
> TIA
>




[ANNOUNCE] Call for Presentations Extended for Flink Forward San Francisco 2019

2018-12-01 Thread Fabian Hueske
Hi everyone,

We have great news for you!
We are *extending* the deadline for the Call for Presentations to *December
7, 11:59 PT*

This will give you extra time to prepare and get a chance to showcase your
work, give back to the Flink community and receive valuable feedback!
Previous events have featured speakers from AirBnb, Netflix, Uber, and
others — why not be among them?

http://flink-forward.org/sf-2019/call-for-presentations-submit-talk/

Tell your coworkers and friends so they can also share their Flink story!

Looking forward to hearing your ideas,
Fabian

Program Committee Chair
Flink Forward San Francisco

On Tue., Nov. 20, 2018 at 17:57, Fabian Hueske wrote:

> Hi Everyone,
>
> Flink Forward San Francisco will *take place on April 1st and 2nd 2019*.
> Flink Forward is a community conference organized by data Artisans and
> gathers many members of the Flink community, including users, contributors,
> and committers. It is the perfect event to get in touch and connect with
> other stream processing enthusiasts and Flink users to exchange experiences
> and ideas.
>
> The Call for Presentations is still open but will *close soon on November
> 30th (next week Friday)*.
>
> Please submit a talk proposal, if you have an interesting Flink use case
> or experience running Flink applications in production that you would like
> to share.
>
>  You can submit a talk proposal here
> --> https://flink-forward.org/sf-2019/call-for-presentations-submit-talk/
>
> Best regards,
> Fabian
>


Re: Memory is not released after job cancellation

2018-12-01 Thread Nastaran Motavali
Thanks for your attention,

I have streaming jobs and use the RocksDB state backend. Do you mean that I don't 
need to worry about memory management even if the allocated memory is not 
released after cancellation?



Kind regards,

Nastaran Motavalli




From: Kostas Kloudas 
Sent: Thursday, November 29, 2018 1:22:12 PM
To: Nastaran Motavali
Cc: user
Subject: Re: Memory is not released after job cancellation

Hi Nastaran,

Can you specify what more information you need?

From the discussion that you posted:
1) If you have batch jobs, then Flink does its own memory management (outside 
the heap, so it is not subject to the JVM's GC). Although you do not see the 
memory being de-allocated when you cancel the job, it is available to other 
jobs, and you do not have to de-allocate it manually.
2) If you use streaming, then you should use one of the provided state backends 
and they will do the memory management for you (see [1] and [2]).

Cheers,
Kostas

[1] 
https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/state/state_backends.html
[2] 
https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/state/large_state_tuning.html
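
For illustration, a minimal sketch of enabling the RocksDB state backend in a
streaming job (Flink 1.6 API; the checkpoint URI is a placeholder):

```
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// RocksDB keeps working state off-heap and on disk; only checkpoints go to the URI below.
// (The RocksDBStateBackend constructor declares IOException.)
env.setStateBackend(new RocksDBStateBackend("hdfs://namenode:40010/flink/checkpoints"));
```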

On Wed, Nov 28, 2018 at 7:11 AM Nastaran Motavali <n.motav...@son.ir> wrote:

Hi,
I have a simple Java application that uses Flink 1.6.2.
When I run the jar file, I can see that the job consumes a part of the host's 
main memory. If I cancel the job, the consumed memory is not released 
until I stop the whole cluster. How can I release the memory after cancellation?
I have followed the conversation around this issue at the mailing list 
archive[1] but still need more explanations.
[1] 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Need-help-to-understand-memory-consumption-td23821.html#a23926



Kind regards,

Nastaran Motavalli




Re: Duplicated messages in kafka sink topic using flink cancel-with-savepoint operation

2018-12-01 Thread Nastaran Motavali
Thanks for your helpful response,
Setting the consumer's 'isolation.level' property to 'read_committed' solved 
the problem!
In fact, there are still some duplicated messages in the sink topic, but they are 
uncommitted; a Kafka consumer reading from this sink in read_committed mode will 
not see them, so everything is OK.
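
For reference, a minimal sketch of the consumer-side fix described above (broker
address, group id, and topic name are placeholders):

```
Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");
props.setProperty("group.id", "sink-verifier");
// Skip records that belong to uncommitted or aborted Kafka transactions.
props.setProperty("isolation.level", "read_committed");

FlinkKafkaConsumer011<String> consumer =
        new FlinkKafkaConsumer011<>("sink-topic", new SimpleStringSchema(), props);
```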




Kind regards,

Nastaran Motavalli




From: Piotr Nowojski 
Sent: Thursday, November 29, 2018 3:38:38 PM
To: Nastaran Motavali
Cc: user@flink.apache.org
Subject: Re: Duplicated messages in kafka sink topic using flink 
cancel-with-savepoint operation

Hi Nastaran,

When you are checking for duplicated messages, are you reading from Kafka using 
`read_committed` mode? (It is not the default value.)

https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-producer-partitioning-scheme

> Semantic.EXACTLY_ONCE: uses Kafka transactions to provide exactly-once 
> semantic. Whenever you write to Kafka using
> transactions, do not forget about setting desired isolation.level 
> (read_committed or read_uncommitted - the latter one is the
> default value) for any application consuming records from Kafka.

Does the problem always happen?

Piotrek
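
For context, a minimal sketch of the producer side under discussion (topic name
and broker address are placeholders; the Semantic argument is what enables Kafka
transactions):

```
Properties producerProps = new Properties();
producerProps.setProperty("bootstrap.servers", "localhost:9092");

FlinkKafkaProducer011<String> producer = new FlinkKafkaProducer011<>(
        "sink-topic",
        new KeyedSerializationSchemaWrapper<>(new SimpleStringSchema()),
        producerProps,
        FlinkKafkaProducer011.Semantic.EXACTLY_ONCE); // downstream consumers need read_committed
```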

On 28 Nov 2018, at 08:56, Nastaran Motavali <n.motav...@son.ir> wrote:


Hi,
I have a Flink streaming job implemented in Java which reads messages from a 
Kafka topic, transforms them, and finally sends them to another Kafka topic.
The Flink version is 1.6.2 and the Kafka version is 0.11. I pass the 
Semantic.EXACTLY_ONCE parameter to the producer. The problem is that when I 
cancel the job with a savepoint and then restart it from that savepoint, I 
see duplicated messages in the sink.
Am I missing some Kafka/Flink configuration to avoid duplication?


Kind regards,
Nastaran Motavalli