Re: How to write value only using flink's SequenceFileWriter?

2019-07-26 Thread Liu Bo
The file header says key is NullWritable: SEQ^F!org.apache.hadoop.io.NullWritable^Yorg.apache.hadoop.io.Text^A^A)org.apache.hadoop.io.compress.SnappyCodec Might be a hadoop -text problem? On Sat, 27 Jul 2019 at 11:07, Liu Bo wrote: > Dear flink users, > > We're trying to switch from

How to write value only using flink's SequenceFileWriter?

2019-07-26 Thread Liu Bo
Dear flink users, We're trying to switch from StringWriter to SequenceFileWriter to turn on compression. StringWriter writes value only and we want to keep that way. AFAIK, you can use NullWritable in Hadoop writers to escape key so you only write the values. So I tried with NullWritable as

Savepoint process recovery in Jobmanager HA setup

2019-07-26 Thread Bajaj, Abhinav
Hi, I am trying to test a scenario that triggers a savepoint on a Flink 1.7.1 Job deployed with jobmanager HA mode. The purpose is to check if savepoint process recovers if the leader jobmanager fails during the savepoint. During my testing, I found that the new leader jobmanager returns the

Re: Is it possible to decide the order of where conditions in Flink SQL

2019-07-26 Thread sri hari kali charan Tummala
try cte common table expressions if it supports or sql subquery. On Fri, Jul 26, 2019 at 1:00 PM Fanbin Bu wrote: > how about move query db filter to the outer select. > > On Fri, Jul 26, 2019 at 9:31 AM Tony Wei wrote: > >> Hi, >> >> If I have multiple where conditions in my SQL, is it

async and checkpointing

2019-07-26 Thread anurag
Hi , Thanks in advance for your help. I am trying to write a flink function which reads from kafka using kafka-flinkconsumer and sends messages to an indexer. I am not clear on how async and checkpointing will work in this case. My flow is like this: a) Messages are ingested into kafka. b)The

Re: Is it possible to decide the order of where conditions in Flink SQL

2019-07-26 Thread Fanbin Bu
how about move query db filter to the outer select. On Fri, Jul 26, 2019 at 9:31 AM Tony Wei wrote: > Hi, > > If I have multiple where conditions in my SQL, is it possible to specify > its order, so that the query > can be executed more efficiently? > > For example, if I have the following SQL,

請問在 Flink SQL 上能不能指定 WHERE 裡的判斷式的執行順序?

2019-07-26 Thread Tony Wei
Hi, 想請問是否有辦法在 Flink SQL 上指明 WHERE 裡的判斷式的執行順序,來做到一些特定情況下的 查詢優化? 舉例來說,在下面的 SQL,假如有個很耗時的 UDF 需要每次都去查詢資料庫。在這樣的狀況下, 如果可以確保優先執行 `!user.is_robot` 的判斷,再去執行後面的 UDF 的話,就能減少許多的資料 庫查詢。因為那些在 `user.is_robot` 裡得到 `true` 的數據就可以提早被丟棄,而不用去執行後面 較為花費時間的 UDF 了。 select * from users where !user.is_robot and

Is it possible to decide the order of where conditions in Flink SQL

2019-07-26 Thread Tony Wei
Hi, If I have multiple where conditions in my SQL, is it possible to specify its order, so that the query can be executed more efficiently? For example, if I have the following SQL, it used a heavy UDF that needs to access database. However, if I can specify the order of conditions is executing

Re: question for handling db data

2019-07-26 Thread Oytun Tez
imagine an operator, ProcessFunction, it has 2 incoming data: geofences via broadcast, user location via normal data stream geofence updates and user location updates will come separately into this single operator. 1) when geofence update comes via broadcast, the operator will update its state

Re: Extending REST API with new endpoints

2019-07-26 Thread Oytun Tez
Scary! :) I would heartily hate to maintain our own fork. Should I make a feature request to discuss further and then send a PR for this? Is this the normal way to push for a feature? --- Oytun Tez *M O T A W O R D* The World's Fastest Human Translation Platform. oy...@motaword.com —

Pramaters in eclipse with Flink

2019-07-26 Thread alaa
Hallo I run this example form GitHub https://github.com/ScaleUnlimited/flink-streaming-kmeans but I am not familiar with eclipse and i got this error I dont know how

Re: Extending REST API with new endpoints

2019-07-26 Thread Chesnay Schepler
There's no built-in way to extend the REST API. You will have to create a fork and either extend the DIspatcherRestEndpoint (or parent classes), or implement another WebMonitorExtension and modify the DispatcherRestEndpoint to load that one as well. On 23/07/2019 15:51, Oytun Tez wrote:

Re: Job Manager becomes irresponsive if the size of the session cluster grows

2019-07-26 Thread Richard Deurwaarder
Hello, We run into the same problem. We've done most of the same steps/observations: - increase memory - increase cpu - No noticable increase in GC activity - Little network io Our current setup has the liveliness probe disabled and we've increased (akka)timeouts, this seems to help

Re: Job Manager becomes irresponsive if the size of the session cluster grows

2019-07-26 Thread Biao Liu
Hi Prakhar, Sorry I don't have much experience on k8s. Maybe some other guys could help. On Fri, Jul 26, 2019 at 6:20 PM Prakhar Mathur wrote: > Hi, > > So we were deploying our flink clusters on YARN earlier but then we moved > to kubernetes, but then our clusters were not this big. Have you

Re: Job Manager becomes irresponsive if the size of the session cluster grows

2019-07-26 Thread Prakhar Mathur
Hi, So we were deploying our flink clusters on YARN earlier but then we moved to kubernetes, but then our clusters were not this big. Have you guys seen issues with job manager rest server becoming irresponsive on kubernetes before? On Fri, Jul 26, 2019, 14:28 Biao Liu wrote: > Hi Prakhar, > >

Re: Help with the correct Event Pattern

2019-07-26 Thread Dawid Wysakowicz
Have you tried pattern like: /Pattern.begin[Event]("b", //AfterMatchSkipStrategy.skipPastLast//).where(...).followedBy("c").where(...).followedBy("e").where(...)/ The method followedBy(Pattern) constructs a Pattern with a subGroup pattern. The skip strategy there does not have any effect. Best,

Re: Job Manager becomes irresponsive if the size of the session cluster grows

2019-07-26 Thread Biao Liu
Hi Prakhar, Sorry I could not find any abnormal message from your GC log and stack trace. Have you ever tried deploying the cluster in other ways? Not on Kubernetes. Like on YARN or standalone. Just for narrowing down the scope. On Tue, Jul 23, 2019 at 12:34 PM Prakhar Mathur wrote: > > On

?????? Re: flink kafka???????????????? ???? taskmanager ????

2019-07-26 Thread ????
jarflinkflink -- -- ??: "rockey...@163.com"; : 2019??7??26??(??) 10:22 ??: "user-zh"; : Re: Re: flink kafka taskmanager

RestClusterClient

2019-07-26 Thread somnussuy
您好,flink集群关闭的情况下,运行任务会有报错信息 Could not retrieve the execution result,但是在 flink 集群正常运行的情况下,偶然会报 Could not retrieve the execution result,通过查询了解到,flink 通过 RestClusterClient类 将任务提交至 jobmanager,如果 detached 为 false,会采用 CompletableFuture 的 thenCompose 方法,在获取结果时,会有异常的捕获,如下: final CompletableFuture

Re: Execution environments for testing: local vs collection vs mini cluster

2019-07-26 Thread Biao Liu
Hi Juan, Sorry for the late reply. 1. the environments of data stream and data set are not same. An obvious difference is there always be a "stream" prefix of environment for data stream. For example, StreamExecutionEnvironment is for data stream, ExecutionEnvironment and CollectionEnvironment

Re: MapSate within Aggregate function

2019-07-26 Thread Ahmad Hassan
Hi Congzian, My understanding is that if I use AggregateFunction and have Million of unique elements coming in for the duration of 24hour, then the state of AggregateFunction will grow huge with those million entries and the checkpointing would take longer and longer. I thought if i could use

Re: Assign a Row.of(ListsElements) exception

2019-07-26 Thread Caizhi Weng
Hi Andres, This exception is often caused by other exceptions. Please post your full stack trace here so we can diagnose the problem. Thanks. Andres Angel 于2019年7月26日周五 上午11:14写道: > Hello everyone, > > I have a list with bunch of elements and I need create a Row.of() based on > the whole

Event time window eviction

2019-07-26 Thread Navneeth Krishnan
Hi All, I'm working on a very short tumbling window for 1 second per key. What I want to achieve is if the event time per key doesn't progress after a second I want to evict the window, basically a combination of event time and processing time. I'm currently achieving it by registering a