Path Style Access for S3 compliant object stores

2019-08-13 Thread Achyuth Narayan Samudrala
Hi user group, I am trying to use Flink to write to an S3 object store. I am using the flink-s3-fs-hadoop as a filesystem implementation to interact with this store. How do I provide the s3 property to enable path style access instead of virtual host addressing? I tried looking around in the docu

Kafka producer failed with InvalidTxnStateException when performing commit transaction

2019-08-13 Thread Tony Wei
Hi, Currently, I was trying to update our kafka cluster with larger ` transaction.max.timeout.ms`. The original setting is kafka's default value (i.e. 15 minutes) and I tried to set as 3 hours. When I was doing rolling-restart for my brokers, this exception came to me on the next checkpoint after

Fwd: Flink program,Full GC (System.gc())

2019-08-13 Thread Andrew Lin
Hi Xintong, Thanks for your answer! I also think that is not a big problem because it’ takes less than 0.5 second。I only want to find what was caused. "JVM also does that automatically, as long as there are continuous activities of creating / destroying objects in heap” I also find some an

Re: Implementing a low level join

2019-08-13 Thread Hequn Cheng
Hi Felipe, > I want to implement a join operator which can use different strategies for joining tuples. Not all kinds of join strategies can be applied to streaming jobs. Take sort-merge join as an example, it's impossible to sort an unbounded data. However, you can perform a window join and use t

Re: Making broadcast state queryable?

2019-08-13 Thread Oytun Tez
Thank you for the honest response, Yu! There is so much that comes to mind when we look at Flink as a "application framework" (my talk in Flink Forward in Berlin will be about this

Re: Making broadcast state queryable?

2019-08-13 Thread Yu Li
Hi Oytun, Sorry but TBH such support will probably not be added in the foreseeable future due to lack of committer bandwidth (not only support queryable broadcast state but all about QueryableState module) as pointed out in other threads [1] [2]. However, I think you could open a JIRA for this so

Making broadcast state queryable?

2019-08-13 Thread Oytun Tez
Hi there, Can we set a broadcast state as queryable? I've looked around, not much to find about it. I am receiving UnknownKvStateLocation when I try to query with the descriptor/state name I give to the broadcast state. If it doesn't work, what could be the alternative? My mind goes around ctx.ge

Re: Flink 1.8: Using the RocksDB state backend causes "NoSuchMethodError" when trying to stop a pipeline

2019-08-13 Thread Yun Tang
Hi Tobias First of all, I think you would not need to ADD the flink-statebackend-rocksdb jar package into your docker image's lib folder, as the flink-dist jar package within lib folder already include all classes of flink-statebackend-rocksdb. I think the root cause is that you might assemble

Implementing a low level join

2019-08-13 Thread Felipe Gutierrez
Hi all, I want to implement a join operator which can use different strategies for joining tuples. I saw that with CoProcessFunction I am able to implement low-level joins [1]. However, I do know how to decide between different algorithms to join my tuples. On the other hand, to do a broadcast jo

Re: Scylla connector

2019-08-13 Thread Elias Levy
Scylla is protocol compatible with Cassandra, so you can just use the Cassandra connector. Scylla has extended the Go gocql package to make it shard aware, but such an extension does not exist for the Cassandra Java driver. That just means that the driver will sent requests to any shard on a node,

Re: Flink program,Full GC (System.gc())

2019-08-13 Thread Xintong Song
Hi Andrew, The behavior you described doesn't looks like a problem to me. I mean what are the bad consequences for having a full GC (which takes less than 0.5 second) per hour? The full GC is not necessarily triggered by explicitly calling "System.gc()" in Flink. JVM also does that automatically,

Re: Apache flink 1.7.2 security issues

2019-08-13 Thread Timothy Victor
The flink job manager UI isn't meant to be accessed from outside a firewall I think. Plus I dont think it was designed with security in mind and honestly it doesn't need to in my opinion. If you need security then address your network setup. And if it is still a problem the just turn off the U

Flink 1.8: Using the RocksDB state backend causes "NoSuchMethodError" when trying to stop a pipeline

2019-08-13 Thread Kaymak, Tobias
Hi, I am using Apache Beam 2.14.0 with Flink 1.8.0 and I have included the RocksDb dependency in my projects pom.xml as well as baked it into the Dockerfile like this: FROM flink:1.8.0-scala_2.11 ADD --chown=flink:flink http://central.maven.org/maven2/org/apache/flink/flink-statebackend-rocksdb_

Re: Apache flink 1.7.2 security issues

2019-08-13 Thread Fabian Hueske
Thanks for reporting this issue. It is already discussed on Flink's dev mailing list in this thread: -> https://lists.apache.org/thread.html/10f0f3aefd51444d1198c65f44ffdf2d78ca3359423dbc1c168c9731@%3Cdev.flink.apache.org%3E Please continue the discussion there. Thanks, Fabian Am Di., 13. Aug.

Re: I'm not able to make a stream-stream Time windows JOIN in Flink SQL

2019-08-13 Thread Zhenghua Gao
I wrote a demo example for time windowed join which you can pick up [1] [1] https://gist.github.com/docete/8e78ff8b5d0df69f60dda547780101f1 *Best Regards,* *Zhenghua Gao* On Tue, Aug 13, 2019 at 4:13 PM Zhenghua Gao wrote: > You can check the plan after optimize to verify it's a regular join

Apache flink 1.7.2 security issues

2019-08-13 Thread V N, Suchithra (Nokia - IN/Bangalore)
Hello, We are using Apache Flink 1.7.2 version. During our security scans following issues are reported by our scan tool. Please let us know your comments on these issues. [1] 150085 Slow HTTP POST vulnerability Severity Potential Vulnerability - Level 3 Group Information Disclosure Threat The

Flink program,Full GC (System.gc())

2019-08-13 Thread Andrew Lin
Flink Version: 1.8.1 deploy:standalone state.backend.fs.memory-threshold=128k A very very simple flink program and without other jar dependended; But trigger full gc every hour by Full GC (System.gc() in jobmanager Jobmanager I only find this where called System.gc(),but not sure when w

Flink program,Full GC (System.gc())

2019-08-13 Thread Andrew Lin
Flink Version: 1.8.1 deploy:standalone state.backend.fs.memory-threshold=128k A very very simple flink program and without other jar dependended; But trigger full gc every hour by Full GC (System.gc() in jobmanager Jobmanager I only find this where called System.gc(),but not sure when w

Re: Apache flink 1.7.2 security issues

2019-08-13 Thread Stephan Ewen
Hi! Thank you for reporting this! At the moment, the Flink REST endpoint is not secure in the way that you can expose it publicly. After all, you can submit Flink jobs to it which by definition support executing arbitrary code. Given that access to the REST endpoint allows by design arbitrary cod

Re: Testing with ProcessingTime and FromElementsFunction with TestEnvironment

2019-08-13 Thread Till Rohrmann
Hi Michal, you need to implement a source which does not terminate. Take a look at the InifiteSource [1] which does exactly this. That way there won't be a Long.MAX_VALUE being sent when closing the source operator. [1] https://github.com/apache/flink/blob/master/flink-streaming-java/src/test/jav

Re: [External] Re: From Kafka Stream to Flink

2019-08-13 Thread Casado Tejedor , Rubén
Hi Do you have an expected version of Flink to include the capability to ingest an upsert stream as a dynamic table? We have such need in our current project. What we have done is to emulate such behavior working at low level with states (e.g. update existing value if key exists, create a new v

Re: I'm not able to make a stream-stream Time windows JOIN in Flink SQL

2019-08-13 Thread Zhenghua Gao
You can check the plan after optimize to verify it's a regular join or time-bounded join(Should have a WindowJoin). The most direct way is breakpoint at optimizing phase [1][2]. And you can use your TestData and create an ITCase for debugging [3] [1] https://github.com/apache/flink/blob/master/fl