Re: Rich and incrementally aggregating window functions

2019-05-08 Thread Hequn Cheng
Hi, There was a discussion about this before; you can take a look at it [1]. Best, Hequn [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Regarding-implementation-of-aggregate-function-using-a-ProcessFunction-td23473.html#a23531 On Thu, May 9, 2019 at 5:14 AM an0 wrote: >

Re: Approach to Auto Scaling Flink Job

2019-05-08 Thread Rong Rong
Hi Anil, We have a presentation [1] from Flink Forward 2018 that briefly discusses the higher-level approach (via a watchdog). We are also restructuring the approach of our open-source AthenaX: right now our internal implementation has diverged from the open source for too long; it has been a

Re: Read data from HDFS on Hadoop3

2019-05-08 Thread Soheil Pourbafrani
UPDATE: I noticed that it runs from IntelliJ IDEA, but packaging the fat jar and deploying it on the cluster causes the so-called hdfs scheme error! On Thu, May 9, 2019 at 2:43 AM Soheil Pourbafrani wrote: > Hi, > > I used to read data from HDFS on Hadoop2 by adding the following >

Re: Reconstruct object through partial select query

2019-05-08 Thread shkob1
Just to be clearer on my goal: I'm trying to enrich the incoming stream with some meaningful tags based on conditions from the event itself. So the input stream could be an event that looks like: class Car { int year; String modelName; } and I will have a config that defines tags as:
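Whatever the exact tag config ends up looking like, here is a minimal sketch of one way to express "original fields plus computed tags" in Flink SQL and get a stream back out. It assumes a StreamTableEnvironment named tableEnv and a DataStream<Car> named carStream; the tag conditions are invented for illustration, not taken from the original poster's setup.

    // Assumptions: tableEnv is a StreamTableEnvironment, carStream is a
    // DataStream<Car> with POJO fields `year` and `modelName`, and the tag
    // conditions below are made up.
    tableEnv.registerDataStream("Cars", carStream, "year, modelName");

    Table tagged = tableEnv.sqlQuery(
        "SELECT `year`, modelName, " +
        "  CASE WHEN `year` < 2000 THEN 'vintage' ELSE 'modern' END AS ageTag " +
        "FROM Cars");

    // Each Row still carries the original fields next to the tag columns, so a
    // plain map() afterwards can re-assemble a { origin, tags } object.
    DataStream<Tuple2<Boolean, Row>> result = tableEnv.toRetractStream(tagged, Row.class);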

Reconstruct object through partial select query

2019-05-08 Thread shkob1
Hey, I'm trying to create a SQL query which, given input from a stream with a generic class type T, will create a new stream in the structure of { origin : T, resultOfSomeSQLCalc : Array[String] }. It seems that just by doing "SELECT *" I can convert the resulting table back to a

Read data from HDFS on Hadoop3

2019-05-08 Thread Soheil Pourbafrani
Hi, I used to read data from HDFS on Hadoop2 by adding the following dependencies: org.apache.flink:flink-java:1.4.0 and org.apache.flink:flink-streaming-java_2.11:1.4.0
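For concreteness, a minimal sketch of the kind of read being described; the namenode address, path and job name are placeholders, and the Hadoop file system classes are assumed to be available at runtime (for example via the dependencies above or an exported HADOOP_CLASSPATH).

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ReadFromHdfs {
        public static void main(String[] args) throws Exception {
            // Placeholder HDFS address and path.
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<String> lines = env.readTextFile("hdfs://namenode:9000/data/input.txt");
            lines.print();
            env.execute("read-from-hdfs");
        }
    }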

Rich and incrementally aggregating window functions

2019-05-08 Thread an0
I want to use ProcessWindowFunction.Context#globalState in my window function. But I don't want to apply ProcessWindowFunction directly to my WindowedStream because I don't want to buffer all the elements of each window. Currently I'm using WindowedStream#aggregate(AggregateFunction,
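The aggregate(AggregateFunction, ProcessWindowFunction) overload already combines the two: elements are pre-aggregated incrementally, and the ProcessWindowFunction only receives the aggregated value per window while still getting a Context with globalState. A minimal sketch, with invented class and state names and assuming some count-style AggregateFunction upstream:

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
    import org.apache.flink.util.Collector;

    public class RunningTotalWindowFunction
            extends ProcessWindowFunction<Long, String, String, TimeWindow> {

        @Override
        public void process(String key, Context ctx, Iterable<Long> preAggregated,
                            Collector<String> out) throws Exception {
            // Only the AggregateFunction's result arrives here, not the raw elements.
            long windowCount = preAggregated.iterator().next();

            // Per-key state that outlives the individual window.
            ValueState<Long> total = ctx.globalState().getState(
                    new ValueStateDescriptor<>("runningTotal", Long.class));
            long newTotal = (total.value() == null ? 0L : total.value()) + windowCount;
            total.update(newTotal);

            out.collect(key + ": window=" + windowCount + ", total=" + newTotal);
        }
    }

    // Usage, where CountAggregate stands for some AggregateFunction<T, ?, Long>:
    // stream.keyBy(...).window(...).aggregate(new CountAggregate(), new RunningTotalWindowFunction());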

Re: Migration from flink 1.7.2 to 1.8.0

2019-05-08 Thread Farouk
Hi Till Thanks. I'll check it out. Farouk

Re: Flink on YARN: TaskManager heap auto-sizing?

2019-05-08 Thread Dylan Adams
Till, Thanks for the pointer to the code. Regards, Dylan On Wed, May 8, 2019 at 11:18 AM Till Rohrmann wrote: > Hi Dylan, > > the container's memory will be calculated here [1]. In the case of Yarn, > the user specifies the container memory size and based on this Flink > calculates with how

Re: I want to use MapState on an unkeyed stream

2019-05-08 Thread an0
I switched to using operator list state. It is clearer. It is also supported by RocksDBKeyedStateBackend, isn't it? On 2019/05/08 14:42:36, Till Rohrmann wrote: > Hi, > > if you want to increase the parallelism you could also pick a key randomly > from a set of keys. The price you would
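For reference, a minimal sketch of operator list state via CheckpointedFunction; names are invented and it mirrors the usual buffering-sink pattern rather than the poster's actual job.

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.flink.api.common.state.ListState;
    import org.apache.flink.api.common.state.ListStateDescriptor;
    import org.apache.flink.runtime.state.FunctionInitializationContext;
    import org.apache.flink.runtime.state.FunctionSnapshotContext;
    import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
    import org.apache.flink.streaming.api.functions.sink.SinkFunction;

    public class BufferingSink implements SinkFunction<String>, CheckpointedFunction {

        private transient ListState<String> checkpointedBuffer;
        private final List<String> buffer = new ArrayList<>();

        @Override
        public void invoke(String value, Context context) {
            buffer.add(value);
            // ... flush the buffer somewhere once it is big enough ...
        }

        @Override
        public void snapshotState(FunctionSnapshotContext ctx) throws Exception {
            // Copy the in-memory buffer into the operator list state on checkpoint.
            checkpointedBuffer.clear();
            checkpointedBuffer.addAll(buffer);
        }

        @Override
        public void initializeState(FunctionInitializationContext ctx) throws Exception {
            checkpointedBuffer = ctx.getOperatorStateStore().getListState(
                    new ListStateDescriptor<>("buffer", String.class));
            // Restore after a failure: redistribute list entries into the local buffer.
            for (String element : checkpointedBuffer.get()) {
                buffer.add(element);
            }
        }
    }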

Re: Approach to Auto Scaling Flink Job

2019-05-08 Thread Anil
Thanks for the reply, Rong. Can you please let me know the design for the auto-scaling part, if possible? Or guide me in the right direction so that I could create this feature myself. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Testing with ProcessingTime and FromElementsFunction with TestEnvironment

2019-05-08 Thread Steven Nelson
That’s what I figured was happening :( Your explanation is a lot better than what I gave to my team, so that will help a lot, thank you! Is there a testing source already created that does this sort of thing? The Flink-testing library seems a bit sparse. -Steve > On May

Re: Inconsistent documentation of Table Conditional functions

2019-05-08 Thread Rong Rong
Hi Flavio, I believe the documentation meant "X" as a placeholder, where you can convert "X" into the numeric values (1, 2, ...) depending on how many "CASE WHEN" conditions you have. "resultZ" is the default result in the "ELSE" statement, and thus it is a literal. Thanks, Rong On Wed, May 8,
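For example, in the made-up query below (table and column names invented), 'low' plays the role of result1, 'medium' of result2, and 'high' of resultZ, the ELSE default:

    -- Concrete reading of the documented placeholder pattern.
    SELECT
      CASE units
        WHEN 1 THEN 'low'       -- result1
        WHEN 2 THEN 'medium'    -- result2
        ELSE 'high'             -- resultZ
      END AS demand_level
    FROM Orders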

Re: Inconsistent documentation of Table Conditional functions

2019-05-08 Thread Xingcan Cui
Hi Flavio, In the description, resultX is just an identifier for the result of the first condition that is met. Best, Xingcan > On May 8, 2019, at 12:02 PM, Flavio Pompermaier wrote: > > Hi to all, > in the documentation of the Table Conditional functions [1] the example is > inconsistent with

Inconsistent documentation of Table Conditional functions

2019-05-08 Thread Flavio Pompermaier
Hi to all, in the documentation of the Table Conditional functions [1] the example is inconsistent with the related description (there's no resultX for example). Or am I wrong? [1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#conditional-functions

Re: Approach to Auto Scaling Flink Job

2019-05-08 Thread Rong Rong
Hi Anil, Thanks for reporting the issue. I went through the code and I believe the auto-scaling functionality is still in our internal branch and has not been merged to the open-source branch yet. I will change the documentation accordingly. Thanks, Rong On Mon, May 6, 2019 at 9:54 PM Anil

Re: taskmanager.tmp.dirs vs io.tmp.dirs

2019-05-08 Thread Flavio Pompermaier
Great, thanks Till! On Wed, May 8, 2019 at 4:20 PM Till Rohrmann wrote: > Hi Flavio, > > taskmanager.tmp.dirs is the deprecated configuration key which has been > superseded by the io.tmp.dirs configuration option. In the future, you > should use io.tmp.dirs. > > Cheers, > Till > > On Wed, May

Re: Flink on YARN: TaskManager heap auto-sizing?

2019-05-08 Thread Till Rohrmann
Hi Dylan, the container's memory will be calculated here [1]. In the case of Yarn, the user specifies the container memory size, and based on this Flink calculates how much heap memory the JVM is started with (container memory size - off-heap memory - cutoff memory). [1]
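As a rough back-of-the-envelope illustration of that calculation (the option names and default values below are assumptions based on the containerized.heap-cutoff-* settings, not copied from the linked code):

    public class HeapSizeSketch {
        public static void main(String[] args) {
            long containerMemoryMB = 4096;  // what the user requests from YARN
            double cutoffRatio = 0.25;      // assumed containerized.heap-cutoff-ratio default
            long cutoffMinMB = 600;         // assumed containerized.heap-cutoff-min default

            long cutoffMB = Math.max((long) (containerMemoryMB * cutoffRatio), cutoffMinMB);
            long heapMB = containerMemoryMB - cutoffMB; // further reduced by any off-heap/network memory

            System.out.println("-Xmx would be roughly " + heapMB + "m"); // ~3072m for a 4 GiB container
        }
    }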

Re: flink 1.7 HA production setup going down completely

2019-05-08 Thread Till Rohrmann
Hi Manju, I guess this exception Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1651346363-10.20.1.81-1525354906737:blk_1083182315_9441494 file=/flink/checkpoints/submittedJobGraph480ddf9572ed at

Re: flink 1.7 HA production setup going down completely

2019-05-08 Thread Manjusha Vuyyuru
Hi Till, Thanks for the response. Please see the attached log file. HA config is: high-availability: zookeeper high-availability.storageDir: hdfs://flink-hdfs:9000/flink/checkpoints From the logs I can see block missing exceptions from HDFS, but I can see that the jobgraph is still present

Re: Class loader premature closure - NoClassDefFoundError: org/apache/kafka/clients/NetworkClient

2019-05-08 Thread Till Rohrmann
Thanks for reporting this issue, Chris. It looks indeed as if FLINK-10455 has not been fully fixed. I've reopened it and linked this mailing list thread. If you want, you could write to the JIRA thread as well. What would be super helpful is if you manage to create a reproducing example for

Re: Migration from flink 1.7.2 to 1.8.0

2019-05-08 Thread Till Rohrmann
Hi Farouk, from the stack trace alone I cannot say much. Would it be possible to share a minimal example which reproduces the problem? My suspicion is that OperatorChain.java:294 produces a null value. Differently said, somehow there is no StreamConfig registered for the given outputId.

Re: I want to use MapState on an unkeyed stream

2019-05-08 Thread Till Rohrmann
Hi, if you want to increase the parallelism you could also pick a key randomly from a set of keys. The price you would pay is a shuffle operation (network I/O) which would not be needed if you were using the unkeyed stream and used the operator list state. However, with keyed state you could
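A minimal sketch of that idea, with invented names and an arbitrary key-space size; the random key is attached to each record in a map(), so the KeySelector itself stays deterministic.

    // Assumed: a DataStream<Event> named events, where Event is a placeholder type.
    // Imports: org.apache.flink.api.common.typeinfo.{TypeInformation, Types},
    // org.apache.flink.api.java.tuple.{Tuple, Tuple2},
    // org.apache.flink.streaming.api.datastream.KeyedStream,
    // java.util.concurrent.ThreadLocalRandom.
    final int keySpace = 16; // upper bound on the parallelism this keying can make use of

    KeyedStream<Tuple2<Integer, Event>, Tuple> byRandomKey = events
        .map(e -> Tuple2.of(ThreadLocalRandom.current().nextInt(keySpace), e))
        .returns(Types.TUPLE(Types.INT, TypeInformation.of(Event.class)))
        .keyBy(0); // this is the shuffle mentioned above

    // Keyed state (MapState, ValueState, ... with whichever state backend is
    // configured) is now available in operators applied to byRandomKey.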

Re: Getting async function call terminated with an exception

2019-05-08 Thread Till Rohrmann
Hi Avi, you need to complete the given resultFuture and not return a future. You can do this via resultFuture.complete(r). Cheers, Till On Tue, May 7, 2019 at 8:30 PM Avi Levi wrote: > Hi, > We are using flink 1.8.0 (but the following also happens in 1.7.2). I tried > a very simple unordered async
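A minimal sketch of that pattern; the lookup method is a made-up stand-in for whatever async client is actually used, and the ResultFuture Flink passes in is completed from the callback instead of returning a future of your own.

    import java.util.Collections;
    import java.util.concurrent.CompletableFuture;
    import org.apache.flink.streaming.api.functions.async.ResultFuture;
    import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

    public class LookupFunction extends RichAsyncFunction<String, String> {

        @Override
        public void asyncInvoke(String key, ResultFuture<String> resultFuture) {
            CompletableFuture
                .supplyAsync(() -> lookup(key))        // hypothetical lookup call
                .whenComplete((value, error) -> {
                    if (error != null) {
                        resultFuture.completeExceptionally(error);
                    } else {
                        // Complete the future Flink handed us with the result(s).
                        resultFuture.complete(Collections.singleton(value));
                    }
                });
        }

        private String lookup(String key) {
            return key; // placeholder for the real client call
        }
    }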

Re: Testing with ProcessingTime and FromElementsFunction with TestEnvironment

2019-05-08 Thread Till Rohrmann
Hi Steve, I think the reason for the different behaviour is the way event time and processing time are implemented. When you are using event time, watermarks need to travel through the topology denoting the current event time. When your source terminates, the system will send a watermark
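As far as I know there is no ready-made source in flink-test-utils for this, but a rough sketch of a test source that emits its elements and then stays alive for a while, so processing-time timers downstream get a chance to fire, could look like the following; the names and the hold-open mechanism are invented.

    import java.io.Serializable;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.flink.streaming.api.functions.source.SourceFunction;

    public class HoldOpenSource<T extends Serializable> implements SourceFunction<T> {

        private final ArrayList<T> elements;
        private final long holdOpenMillis;
        private volatile boolean running = true;

        public HoldOpenSource(List<T> elements, long holdOpenMillis) {
            this.elements = new ArrayList<>(elements);
            this.holdOpenMillis = holdOpenMillis;
        }

        @Override
        public void run(SourceContext<T> ctx) throws Exception {
            for (T element : elements) {
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(element);
                }
            }
            // Keep the source (and therefore the job) alive long enough for
            // processing-time triggers downstream to fire before shutdown.
            long deadline = System.currentTimeMillis() + holdOpenMillis;
            while (running && System.currentTimeMillis() < deadline) {
                Thread.sleep(50);
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }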

Re: flink 1.7 HA production setup going down completely

2019-05-08 Thread Till Rohrmann
Hi Manju, could you share the full logs or at least the full stack trace of the exception with us? I suspect that after a failover Flink tries to restore the JobGraph from persistent storage (the directory which you have configured via `high-availability.storageDir`) but is not able to do so.

Re: taskmanager.tmp.dirs vs io.tmp.dirs

2019-05-08 Thread Till Rohrmann
Hi Flavio, taskmanager.tmp.dirs is the deprecated configuration key which has been superseded by the io.tmp.dirs configuration option. In the future, you should use io.tmp.dirs. Cheers, Till On Wed, May 8, 2019 at 3:32 PM Flavio Pompermaier wrote: > Hi to all, > looking at >
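So going forward only the new key is needed in flink-conf.yaml; a minimal sketch with placeholder paths:

    # Placeholder paths; several directories can be listed, separated by commas.
    io.tmp.dirs: /mnt/disk1/flink-tmp,/mnt/disk2/flink-tmp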

taskmanager.tmp.dirs vs io.tmp.dirs

2019-05-08 Thread Flavio Pompermaier
Hi to all, looking at https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html the difference between these two settings is not very clear to me (actually I have always used the same value for both). My understanding is that taskmanager.tmp.dirs is used to spill memory when there's

Re: flink 1.7 HA production setup going down completely

2019-05-08 Thread Manjusha Vuyyuru
Any update on this from the community side? On Tue, May 7, 2019 at 6:43 PM Manjusha Vuyyuru wrote: > I'm using 1.7.2. > > > On Tue, May 7, 2019 at 5:50 PM miki haiat wrote: > >> Which flink version are you using? >> I had similar issues with 1.5.x >> >> On Tue, May 7, 2019 at 2:49 PM Manjusha

Re: Unable to build flink from source

2019-05-08 Thread Chesnay Schepler
You're likely using Java 9+, but 1.3.3 only supports Java 8 (and maybe still 7). On 06/05/2019 03:20, syed wrote: Hi, I am trying to build flink 1.3.3 from source using the IntelliJ IDEA Ultimate 2019.1 IDE. When I build the project, I am receiving the following error: java: package sun.misc does