Re: Converting String/boxed-primitive Array columns back to DataStream

2020-04-28 Thread Gyula Fóra
k for you? The other methods take > TypeInformation and might cause this problem. It is definitely a bug. > > Feel free to open an issue under: > https://issues.apache.org/jira/browse/FLINK-12251 > > Regards, > Timo > > On 28.04.20 18:44, Gyula Fóra wrote: > > Hi Timo,

Re: Converting String/boxed-primitive Array columns back to DataStream

2020-04-28 Thread Gyula Fóra
ng between old TypeInformation and new DataType > system. A back and forth conversion should work between those types. > > Regards, > Timo > > On 28.04.20 15:36, Gyula Fóra wrote: > > Hi All! > > > > I have a Table with columns of ARRAY<String> and ARRAY<Integer>, is there

Converting String/boxed-primitive Array columns back to DataStream

2020-04-28 Thread Gyula Fóra
Hi All! I have a Table with columns of ARRAY<String> and ARRAY<Integer>, is there any way to convert it back to the respective Java arrays, String[] and Integer[]? It only seems to work for primitive (non-null) types, date, time and decimal. For String, for instance, I get the following error: Query schema: [f0:

Re: Cannot map nested Tuple fields to table columns

2020-04-27 Thread Gyula Fóra
> f).getFunctionDefinition() == BuiltInFunctionDefinitions.AS && > f.getChildren().get(0) instanceof > UnresolvedReferenceExpression) { > return false; > } > > if (f instanceof UnresolvedReferenceExpression) { >

Re: Cannot map nested Tuple fields to table columns

2020-04-27 Thread Gyula Fóra
Hi Leonard, The tuple fields can also be referenced by their POJO names (f0, f1) and reordered similarly to POJO fields; however, you cannot alias them. (The link I sent shows how it is supposed to work, but it throws an exception when I try it.) Also what I am trying

Cannot map nested Tuple fields to table columns

2020-04-27 Thread Gyula Fóra
Hi All! I was trying to flatten a nested tuple into named columns with the fromDataStream method and I hit some problems with mapping tuple fields to column names. It seems that expressions of the `f0 as ColumnName` kind are not parsed correctly. It is very easy to reproduce:

Re: Joining table with row attribute against an enrichment table

2020-04-20 Thread Gyula Fóra
by. And while checking the time attributes we would need to know > which table is bounded and what kind of changes are coming into the > streaming table. > > There is still a lot of work in the future to make the concepts smoother. > > Regards, > Timo > > > [0] https://is

Re: Joining table with row attribute against an enrichment table

2020-04-20 Thread Gyula Fóra
and EnvironmentSettings's batchMode or streamingMode (newer versions). > > But we should admit that Flink hasn't finish the unification work. Your > case will also be considered in the > future when we want to further unify and simplify these concepts and > usages.

Re: Joining table with row attribute against an enrichment table

2020-04-20 Thread Gyula Fóra
them, Flink will assume both table will be changing with > time. > > Best, > Kurt > > > On Mon, Apr 20, 2020 at 9:48 PM Gyula Fóra wrote: > >> Hi! >> >> The problem here is that I dont have a temporal table. >> >> I have a regular stream f

Re: Joining table with row attribute against an enrichment table

2020-04-20 Thread Gyula Fóra
/streaming/joins.html#join-with-a-temporal-table > > Best, > Godfrey > > Gyula Fóra wrote on Mon, Apr 20, 2020 at 4:46 PM: > >> Hi All! >> >> We hit the following problem with SQL and are trying to understand if there >> is a valid workaround. >> >> We have

Joining table with row attribute against an enrichment table

2020-04-20 Thread Gyula Fóra
Hi All! We hit the following problem with SQL and are trying to understand if there is a valid workaround. We have 2 tables: *Kafka* timestamp (ROWTIME) item quantity *Hive* item price So we basically have incoming (ts, id, quantity) and we want to join it with the Hive table to get the total

Re: Inserting nullable data into NOT NULL columns

2020-04-10 Thread Gyula Fóra
y, `NULLIF()` should do the trick in the query but unfortunately > the current Calcite behavior is not what one would expect. > > Thanks, > Timo > > > On 09.04.20 15:53, Gyula Fóra wrote: > > Hi All! > > > > We ran into a problem while trying to insert data read fro

Inserting nullable data into NOT NULL columns

2020-04-09 Thread Gyula Fóra
Hi All! We ran into a problem while trying to insert data read from Kafka into a table sink where some of the columns are not nullable. The problem is that from Kafka we can only read nullable columns in JSON format, otherwise you get the following error:

Re: Kerberos authentication for SQL CLI

2020-03-24 Thread Gyula Fóra
r. Therefore if you > kerberize the cluster the queries will use that configuration. > > On a different note. Out of curiosity. What would you expect the SQL CLI > to use the Kerberos authentication for? > > Best, > > Dawid > > On 24/03/2020 11:11, Gyula Fóra wrote

Kerberos authentication for SQL CLI

2020-03-24 Thread Gyula Fóra
Hi! Does the SQL CLI support Kerberos Authentication? I am struggling to find any use of the SecurityContext in the SQL CLI logic but maybe I am looking in the wrong place. Thank you! Gyula

Re: Writing retract streams to Kafka

2020-03-06 Thread Gyula Fóra
Thanks Kurt, I came to the same conclusions after trying what Jark provided. I can get similar behaviour if I reduce the grouping window to 1 sec but still keep the join window large. Gyula On Fri, Mar 6, 2020 at 3:09 PM Kurt Young wrote: > @Gyula Fóra I think your query is right, we sho

Re: How to override flink-conf parameters for SQL Client

2020-03-06 Thread Gyula Fóra
other >> job to this cluster, then all the configurations >> relates to process parameters like TM memory, slot number etc are not be >> able to modify. >> >> Best, >> Kurt >> >> >> On Thu, Mar 5, 2020 at 11:08 PM Gyula Fóra wrote: >> >>> Kurt

Re: How to override flink-conf parameters for SQL Client

2020-03-05 Thread Gyula Fóra
like visualization. >> >> If you are interested, you could try the master branch of Zeppelin + this >> improvement PR >> >> https://github.com/apache/zeppelin >> https://github.com/apache/zeppelin/pull/3676 >> https://github.com/apache/zeppelin/blob/master/doc

Re: Writing retract streams to Kafka

2020-03-05 Thread Gyula Fóra
e > groups like (itemId, eventtime, queryId) have complete data or not. > As a comparison, if you change the grouping key to a window which based > only on q.event_time, then the query would emit insert only results. > > Best, > Kurt > > > On Thu, Mar 5, 2020 at 10:29

Re: Writing retract streams to Kafka

2020-03-05 Thread Gyula Fóra
is calculated and fired. >> But with some other arbitrary aggregations, there is not enough >> information for Flink to determine whether >> the data is complete or not, so the framework will keep calculating >> results when receiving new records and >> retract earlier resu

Re: Writing retract streams to Kafka

2020-03-05 Thread Gyula Fóra
cs-release-1.10/api/java/org/apache/flink/streaming/connectors/kafka/KafkaTableSink.html > > On Thu, Mar 5, 2020 at 2:50 PM Gyula Fóra wrote: > >> Hi Roman, >> >> This is the core logic: >> >> CREATE TABLE QueryResult ( >> queryId BIGINT, >> i

Re: Writing retract streams to Kafka

2020-03-05 Thread Gyula Fóra
Khachatryan Roman < khachatryan.ro...@gmail.com> wrote: > Hi Gyula, > > Could you provide the code of your Flink program, the error with > stacktrace and the Flink version? > > Thanks., > Roman > > > On Thu, Mar 5, 2020 at 2:17 PM Gyula Fóra wrote: > >>

Writing retract streams to Kafka

2020-03-05 Thread Gyula Fóra
Hi All! Excuse my stupid question, I am pretty new to the Table/SQL API and I am trying to play around with it implementing and running a few use-cases. I have a simple window join + aggregation, grouped on some id that I want to write to Kafka but I am hitting the following error:

Re: How to override flink-conf parameters for SQL Client

2020-03-05 Thread Gyula Fóra
LI only allows to set Table >> specific configs. >> I will think it as a bug/improvement of SQL CLI which should be fixed in >> 1.10.1. >> >> Best, >> Jark >> >> On Thu, 5 Mar 2020 at 18:12, Gyula Fóra wrote: >> >>> Thanks Caizhi, >

Re: How to override flink-conf parameters for SQL Client

2020-03-05 Thread Gyula Fóra
client yaml file can only override some of the Flink configurations. > > Configuration entries indeed can only set Table specific configs, while > deployment entries are used to set the result fetching address and port. > There is currently no way to change the execution target from the

How to override flink-conf parameters for SQL Client

2020-03-05 Thread Gyula Fóra
Hi All! I am trying to understand if there is any way to override flink configuration parameters when starting the SQL Client. It seems that the only way to pass any parameters is through the environment yaml. There I found 2 possible routes: configuration: this doesn't work as it only sets

Re: CREATE TABLE with Schema derived from format

2020-03-04 Thread Gyula Fóra
it, and the CREATE TABLE statement can leave out the schema part, e.g. > > CREATE TABLE user_behavior WITH ("connector"="kafka", > "topic"="user_behavior", "schema.registry.url"="localhost:8081") > > Which way are you looking for?

Re: CREATE TABLE with Schema derived from format

2020-03-04 Thread Gyula Fóra
20 to track this effort. But not sure we have enough > time to support it before 1.11. > > Best, > Jark > > [1]: https://issues.apache.org/jira/browse/FLINK-16420 > > > On Wed, 4 Mar 2020 at 18:21, Gyula Fóra wrote: > >> Hi All! >> >> I am wondering if it

CREATE TABLE with Schema derived from format

2020-03-04 Thread Gyula Fóra
Hi All! I am wondering if it would be possible to change the CREATE TABLE statement so that it would also work without specifying any columns. The format generally defines the available columns so maybe we could simply use them as is if we want. This would be very helpful when exploring

Re: SHOW CREATE TABLE in Flink SQL

2020-03-02 Thread Gyula Fóra
ded table" in the future but is > an orthogonal requirement. > > Best, > Jark > > > [1]: https://issues.apache.org/jira/browse/FLINK-16384 > > On Mon, 2 Mar 2020 at 22:09, Jeff Zhang wrote: > >> +1 for this, maybe we can add 'describe extended table' like hive

Re: [DISCUSSION] Using `DataStreamUtils.reinterpretAsKeyedStream` on a pre-partitioned Kafka topic

2019-12-01 Thread Gyula Fóra
Hi! As far as I know, even if you prepartition the data exactly the same way in kafka using the key groups, you have no guarantee that the kafka consumer source would pick up the right partitions. Maybe if you have exactly as many kafka partitions as keygroups/max parallelism, partitioned
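The key-group mechanics behind this answer can be sketched outside Flink. The Python below models the key-group range formula I believe Flink's KeyGroupRangeAssignment uses (treat it as a sketch, not a quote of the sources); the per-key hash is deliberately omitted, since in Flink it is a murmur hash over the key's hashCode, which an external Kafka producer is very unlikely to reproduce:

```python
def key_group_range(max_parallelism, parallelism, operator_index):
    # Key groups owned by one subtask (modeled after Flink's
    # KeyGroupRangeAssignment.computeKeyGroupRangeForOperatorIndex).
    start = (operator_index * max_parallelism + parallelism - 1) // parallelism
    end = ((operator_index + 1) * max_parallelism - 1) // parallelism
    return (start, end)

def operator_index_for_key_group(max_parallelism, parallelism, key_group):
    # Inverse mapping: which subtask owns a given key group.
    return key_group * parallelism // max_parallelism

# With max parallelism 128 and 4 subtasks, each subtask owns a
# contiguous block of 32 key groups:
ranges = [key_group_range(128, 4, i) for i in range(4)]
print(ranges)  # [(0, 31), (32, 63), (64, 95), (96, 127)]
```

A Kafka source, by contrast, hands out whole topic partitions to subtasks with no knowledge of these ranges, which is why a topic pre-partitioned by key generally cannot be reinterpreted as a keyed stream safely.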

Re: ArrayIndexOutOfBoundException on checkpoint creation

2019-11-29 Thread Gyula Fóra
Hi Theo! I have not seen this error before however I have encountered many strange things when using Kryo for serialization. From the stack trace it seems that this might indeed be a Kryo related issue. I am not sure what it is but what I would try is to change the state serializers to a non

Re: What happens to a Source's Operator State if it stops being initialized and snapshotted? Accidentally exponential?

2019-11-27 Thread Gyula Fóra
You are right, Aaron. I would say this is by design, as Flink doesn't require you to initialize state in the open method, so it has no safe way to delete the non-referenced ones. What you can do is restore the state, clear it on all operators, and not reference it again. I know this

Re: PreAggregate operator with timeout trigger

2019-11-05 Thread Gyula Fóra
You might have to introduce some dummy keys for a more robust solution that integrates with the fault-tolerance mechanism. Gyula On Tue, Nov 5, 2019 at 9:57 AM Felipe Gutierrez < felipe.o.gutier...@gmail.com> wrote: > Thanks Piotr, > > the thing is that I am on Stream data and not on keyed

Re: [DISCUSS] Semantic and implementation of per-job mode

2019-10-31 Thread Gyula Fóra
Hi all, Regarding the compilation part: I think there are upsides and downsides to building the Flink job (running the main method) on the client side; however, since this is the current way of doing it, we should have a very powerful reason to change the default behaviour. While there is a possible

Re: Comparing Storm and Flink resource requirements

2019-10-22 Thread Gyula Fóra
ke state. > Our state was small at the time, and the main business was real-time ETL. > If it is a different type of business, the problem may be more complicated > and may require a specific analysis of the specific problem. > > Best, > Vino > > Gyula Fóra wrote on Mon, Oct 21, 2019 at 8 PM

Comparing Storm and Flink resource requirements

2019-10-21 Thread Gyula Fóra
Hi All! I would like to ask the community for any experience regarding migration from Storm to Flink production applications. Specifically I am interested in your experience related to the resource requirements for the same pipeline as implemented in Flink vs in Storm. The design of the

Re: BucketingSink - Could not invoke truncate while recovering from state

2019-08-27 Thread Gyula Fóra
ice from the same task and the lease is not > "reentrant"? > > Cheers, > Kostas > > On Tue, Aug 27, 2019 at 4:53 PM Gyula Fóra wrote: > > > > Hi all! > > > > I am gonna try to resurrect this thread as I think I have hit the same > issue with the StreamingFileS

Re: BucketingSink - Could not invoke truncate while recovering from state

2019-08-27 Thread Gyula Fóra
Hi all! I am gonna try to resurrect this thread as I think I have hit the same issue with the StreamingFileSink: https://issues.apache.org/jira/browse/FLINK-13874 I don't have a good answer but it seems that we try to truncate before we get the lease (even though there is logic both in

Re: Custom log appender for YARN

2019-08-01 Thread Gyula Fóra
ssLoader.html > 2. > https://ci.apache.org/projects/flink/flink-docs-master/monitoring/debugging_classloading.html > > > Thanks, > Biao /'bɪ.aʊ/ > > > > On Wed, Jul 31, 2019 at 9:21 PM Gyula Fóra wrote: > >> Hi All! >> >> We are trying to configure

Custom log appender for YARN

2019-07-31 Thread Gyula Fóra
Hi All! We are trying to configure a custom Kafka log appender for our YARN application and we hit the following problem. We included the log appender dependency in the fatjar of the application because in YARN that should be part of the system class path. However when the YARN cluster

Re: RocksDB native checkpoint time

2019-05-14 Thread Gyula Fóra
Hey, I have collected some RocksDB logs for the snapshot itself but I can't really wrap my head around where exactly the time is spent: https://gist.github.com/gyfora/9a37aa349f63c35cd6abe2da2cf19d5b The general pattern where the time is spent is this: 2019/05/14-09:15:49.486455 7fbe6a8ee700

Re: RocksDB native checkpoint time

2019-05-03 Thread Gyula Fóra
ources how to speedup calls to > `org.rocksdb.Checkpoint#create`. > > Piotrek > > On 3 May 2019, at 10:30, Gyula Fóra wrote: > > Hi! > > Does anyone know what parameters might affect the RocksDB native > checkpoint time? (basically the sync part of the rocksdb increme

RocksDB native checkpoint time

2019-05-03 Thread Gyula Fóra
Hi! Does anyone know what parameters might affect the RocksDB native checkpoint time? (Basically the sync part of the RocksDB incremental snapshots.) It seems to take 60-70 secs in some cases for larger state sizes, and I wonder if there is anything we could tune to reduce this. Maybe it's only a

Re: What are savepoint state manipulation support plans

2019-03-28 Thread Gyula Fóra
Hi! I don't think there is any ongoing effort in core Flink other than this library we created. You are probably right that it is pretty hacky at the moment. I would say this is one way we could do it that seemed convenient to me at the time I wrote the code. If you have ideas how to

Re: Data loss when restoring from savepoint

2019-02-13 Thread Gyula Fóra
he problem has been > quite well narrowed down, considering that data can be found in savepoint, > savepoint is successfully restored, and after restoring the data doesn't go > to "user code" (like the reducer) any more. > > On Wed, Feb 13, 2019 at 3:47 PM Gyula Fóra wrote:

Re: Bug in RocksDB timer service

2019-01-15 Thread Gyula Fóra
ething like that? > > Best, > Stefan > > On 15. Jan 2019, at 09:48, Gyula Fóra wrote: > > Hi! > > Lately I seem to be hitting a bug in the rocksdb timer service. This > happens mostly at checkpoints but sometimes even at watermark: > > ja

Bug in RocksDB timer service

2019-01-15 Thread Gyula Fóra
Hi! Lately I seem to be hitting a bug in the rocksdb timer service. This happens mostly at checkpoints but sometimes even at watermark: java.lang.RuntimeException: Exception occurred while processing valve output watermark: at

Re: Using port ranges to connect with the Flink Client

2019-01-04 Thread Gyula Fóra
Hi, Thanks Chesnay, my problem is fixed; it turned out to be related to enabling port ranges for the REST client. Gyula On Fri, 4 Jan 2019 at 10:26, Chesnay Schepler wrote: > @Gyula: From what I can tell your custom client is still relying on > akka, and should be using the RestClusterClient

Re: Serious stability issues when running on YARN (Flink 1.7.0)

2018-12-21 Thread Gyula Fóra
>> >> [1] https://issues.apache.org/jira/browse/FLINK-10941 >> >> >> On Dec 20, 2018, at 9:33 PM, Gyula Fóra wrote: >> >> Hi! >> >> Since we have moved to the new execution mode with Flink 1.7.0 we have >> observed some pretty bad stability issues with th

Re: Serious stability issues when running on YARN (Flink 1.7.0)

2018-12-20 Thread Gyula Fóra
. Gyula qi luo wrote (on Fri, Dec 21, 2018, 3:35): > Hi Gyula, > > Your issue is possibly related to [1], where slots are prematurely released. > I’ve raised a PR which is still pending review. > > [1] https://issues.apache.org/jira/browse/FLINK-10941 > > > On Dec 20, 2

Serious stability issues when running on YARN (Flink 1.7.0)

2018-12-20 Thread Gyula Fóra
Hi! Since we have moved to the new execution mode with Flink 1.7.0 we have observed some pretty bad stability issues with the Yarn execution. It's pretty hard to understand what's going on so sorry for the vague description but here is what seems to happen: In some cases when a bigger job fails

Re: Assigning a port range to rest.port

2018-12-05 Thread Gyula Fóra
/browse/FLINK-11081 > > Cheers, > Till > > On Wed, Dec 5, 2018 at 1:21 PM Gyula Fóra wrote: > >> Maybe the problem is here? cc Till >> >> >> https://github.com/apache/flink/blob/44ed5ef0fc1c221f3916ab5126f1bc8ee5dfb45d/flink-yarn/src/main/java/org/apache/fli

Re: Assigning a port range to rest.port

2018-12-05 Thread Gyula Fóra
t 2 flink scala-shell in local mode, but fails due to port conflict. > > > > Gyula Fóra wrote on Wed, Dec 5, 2018 at 8:04 PM: > >> Hi! >> Is there any way currently to set a port range for the rest client? >> rest.port only takes a single number and it is anyways overwritten t

Assigning a port range to rest.port

2018-12-05 Thread Gyula Fóra
Hi! Is there any way currently to set a port range for the REST client? rest.port only takes a single number and it is anyway overwritten to 0. This seems to be necessary when running the Flink client from behind a firewall where only a predefined port range is accessible from the outside. I

Re: Using port ranges to connect with the Flink Client

2018-12-05 Thread Gyula Fóra
Ah, it seems to be something with the custom Flink client build that we run... Still don't know why, but if I use the normal client once the job is started it works. Gyula Gyula Fóra wrote (on Wed, Dec 5, 2018, 9:50): > I get the following error when trying to savepoint a

Using port ranges to connect with the Flink Client

2018-12-04 Thread Gyula Fóra
Hi! We have been running Flink on Yarn for quite some time and historically we specified port ranges so that the client can access the cluster: yarn.application-master.port: 100-200 Now we updated to flink 1.7 and try to migrate away from the legacy execution mode but we run into a problem that

State processing and testing utilities (Bravo)

2018-11-07 Thread Gyula Fóra
Hey all! I just wanted to give you a quick update on the bravo project. Bravo contains a bunch of useful utilities for processing the checkpoint/savepoint state of a streaming job as Flink Datasets (batch). The end goal of the project is to be contributed to Flink once we are happy with it but

Re: Take RocksDB state dump

2018-10-17 Thread Gyula Fóra
Hi, If you don't mind trying out stuff a little, I have some nice tooling for exactly this: https://github.com/king/bravo Let me know if it works :) Gyula Harshvardhan Agrawal wrote (on Wed, Oct 17, 2018, 21:50): > Hello, > > We are currently using a RocksDBStateBackend for our

Re: Ship compiled code with broadcast stream ?

2018-10-09 Thread Gyula Fóra
You should not try sending the compiled code anywhere, but you can use it from within the processor. You can do the same thing with the jar: you compile your jar and store it on HDFS. Send the jar path to the processor, which can download the jar and instantiate the rule. No need to resubmit the job.

Re: Ship compiled code with broadcast stream ?

2018-10-09 Thread Gyula Fóra
Hi, This is certainly possible. What you can do is use a BroadcastProcessFunction where you receive the rule code on the broadcast side. You probably cannot send newly compiled objects this way but what you can do is either send a reference to some compiled jars and load them with the
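The jar-reference pattern described in these replies (broadcast only a path, then load and instantiate the rule inside the process function) can be illustrated with a small, Flink-free sketch. Python's importlib stands in for Java's URLClassLoader, and the Rule class and file names are made up for the example:

```python
import importlib.util
import os
import tempfile

# Stand-in for a compiled rule artifact stored on HDFS; in the mail's
# scenario this would be a jar the operator downloads.
RULE_SOURCE = "class Rule:\n    def apply(self, value):\n        return value * 2\n"

def load_rule(path):
    # Broadcast side: given only a path, load the code and instantiate
    # the rule (the URLClassLoader analogue).
    spec = importlib.util.spec_from_file_location("dynamic_rule", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.Rule()

with tempfile.TemporaryDirectory() as d:
    rule_path = os.path.join(d, "rule.py")
    with open(rule_path, "w") as f:
        f.write(RULE_SOURCE)
    rule = load_rule(rule_path)   # the path arrives via the broadcast stream
    result = rule.apply(21)       # element side applies the current rule
    print(result)                 # 42
```

The key point, as the answer says, is that only a reference crosses the wire; the running job never has to be resubmitted when a rule changes.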

Re: 1.5 Checkpoint metadata location

2018-09-25 Thread Gyula Fóra
Yes, the only workaround I found in the end was to restore the previous behavior where metadata files are written separately. But for this you need a custom Flink build with changes to the checkpointing logic. Gyula On Tue, 25 Sep 2018 at 16:45, Till Rohrmann wrote: > Hi Bryant, > > I

Re: Some questions about the StreamingFileSink

2018-08-22 Thread Gyula Fóra
Hi Kostas, Sorry for jumping in on this discussion :) What you suggest for finite sources and waiting for checkpoints is pretty ugly in many cases. Especially if you would otherwise read from a finite source (a file for instance) and want to end the job asap. Would it make sense to not discard

Re: Events can overtake watermarks

2018-07-23 Thread Gyula Fóra
Yea, now that I think about it, that's probably the case. Sorry to bother :) Gyula Gyula Fóra wrote (on Mon, Jul 23, 2018, 11:04): > Hm, I wonder if it could be because the downstream operator is a 2 input > operator and I do some filtering on the source elements to direct

Re: Events can overtake watermarks

2018-07-23 Thread Gyula Fóra
even though the other one should also have an element. Gyula Gyula Fóra wrote (on Mon, Jul 23, 2018, 10:44): > Hi guys, > > Let me clarify. There is a single source with parallelism 1 and a single > downstream operator with parallelism > 1. > So the watermark is s

Re: Events can overtake watermarks

2018-07-23 Thread Gyula Fóra
ike a „wrong“ behaviour, only > watermarks overtaking events would be bad. Do you think this only started > from Flink 1.5? To me this does not sound like a problem, but not sure if > it is intended. Looping in Aljoscha, just in case. > > Best, > Stefan > > > On 22.07.2018 at

Events can overtake watermarks

2018-07-22 Thread Gyula Fóra
Hi, In 1.5.1 I have noticed some strange behaviour that happens quite frequently and I just want to double check with you that this is intended. If I have a non-parallel source that takes the following actions: emit: event1 emit: watermark1 emit: event2 it can happen that a downstream operators
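The follow-ups above trace this to the downstream operator having two inputs: Flink advances a multi-input operator's event-time clock to the minimum of the per-input watermarks, so an event routed to one input can be processed while the watermark emitted just before it is still held back by the other input. A minimal sketch of that min rule (semantics only, not Flink code):

```python
class TwoInputWatermarkTracker:
    # Combined watermark of a two-input operator: it only advances
    # when the minimum over both inputs advances.
    def __init__(self):
        self.input_watermarks = [float("-inf"), float("-inf")]
        self.combined = float("-inf")

    def on_watermark(self, input_index, watermark):
        self.input_watermarks[input_index] = max(
            self.input_watermarks[input_index], watermark)
        new_combined = min(self.input_watermarks)
        advanced = new_combined > self.combined
        self.combined = new_combined
        return advanced

tracker = TwoInputWatermarkTracker()
# watermark1 arrives on input 0 only (the filter sent nothing, and hence
# no watermark yet, to input 1):
advanced = tracker.on_watermark(0, 1)
print(advanced, tracker.combined)  # False -inf
# event2 on input 0 is therefore processed "before" watermark1 from the
# operator's point of view; the combined watermark only moves once
# input 1 catches up:
advanced = tracker.on_watermark(1, 1)
print(advanced, tracker.combined)  # True 1
```

So events overtaking watermarks at a two-input operator is consistent with these semantics; only the reverse (watermarks overtaking the events they cover) would be a correctness problem.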

Re: External checkpoint metadata in Flink 1.5.x

2018-07-13 Thread Gyula Fóra
g if checkpoints are in a central location? > > Best, > Aljoscha > > > On 12. Jul 2018, at 17:55, Gyula Fóra wrote: > > Hi! > > Well it depends on how we look at it FLINK-5627 > <https://issues.apache.org/jira/browse/FLINK-5627> is not necessarily > the c

Re: External checkpoint metadata in Flink 1.5.x

2018-07-12 Thread Gyula Fóra
h aims at implementing a solution: > https://issues.apache.org/jira/browse/FLINK-9114. > > I quickly talked to Stephan, it seems to be that the meta info about > externalized checkpoints is also written to the HA storage directory, maybe > that's helpful for you. > > Best, > Al

External checkpoint metadata in Flink 1.5.x

2018-07-12 Thread Gyula Fóra
Hi, It seems that the behaviour to store the checkpoint metadata files for externalized checkpoints changed from 1.4 to 1.5 and the docs seem to be incorrectly saying that: "state.checkpoints.dir: The target directory for meta data of externalized checkpoints

Re: Custom metrics reporter classloading problem

2018-07-11 Thread Gyula Fóra
JM/TM starts, i.e. before any user-code is > even accessible. > > My recommendation would be to either put the kafka dependencies in the > /lib folder or try to relocate the kafka code in the reporter. > > On 11.07.2018 14:59, Gyula Fóra wrote: > > Hi all, > > > > I

Custom metrics reporter classloading problem

2018-07-11 Thread Gyula Fóra
Hi all, I have run into the following problem and I want to double check whether this is intended behaviour. I have a custom metrics reporter that pushes things to Kafka (so it creates a KafkaProducer in the open method, etc.) for my streaming job. Naturally, as my Flink job consumes from

Re: PartitionNotFoundException after deployment

2018-05-04 Thread Gyula Fóra
> messages like "Retriggering partition request {}:{}." > > > > You can also check the SingleInputGate code which has the logic for > > retriggering requests. > > > > – Ufuk > > > > > > On Fri, May 4, 2018 at 10:27 AM, Gyula Fóra <gyula.f...@g

PartitionNotFoundException after deployment

2018-05-04 Thread Gyula Fóra
Hi Ufuk, Do you have any quick idea what could cause this problems in flink 1.4.2? Seems like one operator takes too long to deploy and downstream tasks error out on partition not found. This only seems to happen when the job is restored from state and in fact that operator has some keyed and

Re: Beam quickstart

2018-04-26 Thread Gyula Fóra
For some reason it only seems to work if I put my jars in the Flink lib folder. I am not sure why though... Gyula Jörn Franke <jornfra...@gmail.com> wrote (on Wed, Apr 25, 2018, 16:50): > Tried with a fat jar to see if it works in general ? > > > On 25. Apr 2018, at

Beam quickstart

2018-04-25 Thread Gyula Fóra
Hey, Is there an end-to-end guide somewhere on how to run a simple beam-on-flink application (preferably using Gradle)? I want to run it using the standard per-job yarn cluster setup but I can't seem to get it to work. I always end up with strange NoSuchMethod errors from protobuf and have spent

Strange Kafka consumer behaviour

2018-02-21 Thread Gyula Fóra
Hi, I have observed a weird behaviour when changing kafka topics when restoring from a checkpoint. It seems that the job started consuming both the topics from the state, and the new topic that I assigned. This happened while changing from kafka 08 to kafka 10. Is this expected? Thanks, Gyula

Re: Flink 1.3 -> 1.4 Kafka state migration issue

2018-01-08 Thread Gyula Fóra
for details. > > Best, > Gordon > > [1] http://flink.apache.org/news/2017/08/05/release-1.3.2.html > > > > On Jan 8, 2018 6:57 AM, "Gyula Fóra" <gyula.f...@gmail.com> wrote: > > Migrating the jobs by setting the sources to parallelism = 1 and then > scale b

Re: Flink 1.3 -> 1.4 Kafka state migration issue

2018-01-08 Thread Gyula Fóra
Migrating the jobs by setting the sources to parallelism = 1 and then scaling back up after migration seems to be a good workaround, but I am wondering whether something I did made this happen or whether this is a bug. Gyula Fóra <gyula.f...@gmail.com> wrote (on Mon, Jan 8, 2018, 14:46):

Flink 1.3 -> 1.4 Kafka state migration issue

2018-01-08 Thread Gyula Fóra
Hi, Is it possible that the Kafka partition assignment logic has changed between Flink 1.3 and 1.4? I am trying to migrate some jobs using Kafka 0.8 sources and about half my jobs lost offset state for some partitions (but not all partitions). Jobs with parallelism 1 don't seem to be affected...
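For anyone hitting the same migration, the assignment logic did change around 1.4: older connectors effectively placed the i-th partition of a sorted list on subtask i mod parallelism, while 1.4 added a start index derived from the topic name so multiple topics spread more evenly across subtasks. The sketch below is a reconstruction of that scheme, with Java's String.hashCode reimplemented so the overflow arithmetic matches; treat the exact formulas as assumptions, not a quote of the Flink sources:

```python
def java_string_hash(s):
    # Java's String.hashCode(), with 32-bit signed overflow semantics.
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h

def assign_pre_1_4(partition, num_subtasks):
    # Simplified pre-1.4 behavior: position in the sorted partition
    # list, modulo the number of subtasks.
    return partition % num_subtasks

def assign_1_4(topic, partition, num_subtasks):
    # 1.4-style: round-robin starting from a topic-derived index.
    start = ((java_string_hash(topic) * 31) & 0x7FFFFFFF) % num_subtasks
    return (start + partition) % num_subtasks

old = [assign_pre_1_4(p, 4) for p in range(8)]
new = [assign_1_4("events", p, 4) for p in range(8)]
print(old)  # [0, 1, 2, 3, 0, 1, 2, 3]
print(new)
```

If the assignment changed between versions, a restored subtask can find itself owning partitions whose offsets were snapshotted by a different subtask, which matches the partial offset loss described above; running the sources at parallelism 1 sidesteps it because there is only one possible assignment.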

Re: JVM crash - SIGSEGV in ZIP_GetEntry

2017-12-17 Thread Gyula Fóra
Hi, I have seen similar errors when trying to serialize Kryo-typeserializers with Flink type infos accidentally. Maybe that helps :) Gyula On Sun, Dec 17, 2017, 15:52 Dawid Wysakowicz wrote: > Just as a follow-up I tried disabling mmap with >

Re: Flink with Ceph as the persistent storage

2017-12-05 Thread Gyula Fóra
h happens asynchronously, does it still have > any impact on the stream processing? > > Jayant Ameta > > On Tue, Dec 5, 2017 at 4:34 PM, Gyula Fóra <gyula.f...@gmail.com> wrote: > >> Hi, >> >> To my understanding Ceph as in http://ceph.com/ceph-storage/ is

Re: Flink with Ceph as the persistent storage

2017-12-05 Thread Gyula Fóra
Hi, To my understanding Ceph as in http://ceph.com/ceph-storage/ is a block based object storage system. You can use it mounted on your server and it will behave as a local file system to a large extent, but it will be shared in the cluster. The performance might not be as good as with HDFS to our

Re: Empty state restore seems to be broken for Kafka source (1.3.2)

2017-10-12 Thread Gyula Fóra
The issue in Kafka is about new topics/partitions not being discovered or > something else? That would be the expected behaviour in Flink < 1.4.0. > > Best, > Aljoscha > > On 12. Oct 2017, at 16:40, Gyula Fóra <gyf...@apache.org> wrote: > > Hey, > > I know

Re: Empty state restore seems to be broken for Kafka source (1.3.2)

2017-10-12 Thread Gyula Fóra
Hey, I know it's an old discussion but there also seems to be a problem with the logic in the Kafka source alone regarding new topics added after a checkpoint. Maybe there is a ticket for this already and I just missed it. Cheers, Gyula Gyula Fóra <gyf...@apache.org> wrote (on

Re: Empty state restore seems to be broken for Kafka source (1.3.2)

2017-09-14 Thread Gyula Fóra
it will know if it > discovers a new partition whether it can take ownership of that partition. > > I'm sure Gordon (cc'ed) could explain it better than I did. > > On 6. Sep 2017, at 14:36, Gyula Fóra <gyf...@apache.org> wrote: > > Wouldnt it be enough that Kafka source

Re: Empty state restore seems to be broken for Kafka source (1.3.2)

2017-09-06 Thread Gyula Fóra
> > Thanks for the report, I will take a look. > > On 06.09.2017 at 11:48, Gyula Fóra <gyf...@apache.org> wrote: > > Hi all, > > We are running into some problems with the kafka source after changing the > uid and restoring from the savepoint. > What we are exp

Empty state restore seems to be broken for Kafka source (1.3.2)

2017-09-06 Thread Gyula Fóra
Hi all, We are running into some problems with the Kafka source after changing the uid and restoring from the savepoint. What we are expecting is to clear the partition state and set it up all over again, but what seems to happen is that the consumer thinks that it doesn't have any partitions

Re: Using latency markers

2017-08-11 Thread Gyula Fóra
> Hi, > > I must admit that I've never used this but I'll try and look into it. > > Best, > Aljoscha > > On 10. Aug 2017, at 11:10, Gyula Fóra <gyula.f...@gmail.com> wrote: > > Hi all! > > Does anyone have a working example of using the latency markers

Re: Delay between job manager recovery and job recovery (1.3.2)

2017-08-10 Thread Gyula Fóra
Best, > Aljoscha > > On 10. Aug 2017, at 15:13, Gyula Fóra <gyula.f...@gmail.com> wrote: > > Here is actually the whole log for the relevant parts at least: > https://gist.github.com/gyfora/b70dd18c048b862751b194f613514300 > > Sorry for not pasting it earlier. > >

Re: Delay between job manager recovery and job recovery (1.3.2)

2017-08-10 Thread Gyula Fóra
Here is actually the whole log, for the relevant parts at least: https://gist.github.com/gyfora/b70dd18c048b862751b194f613514300 Sorry for not pasting it earlier. Gyula Gyula Fóra <gyula.f...@gmail.com> wrote (on Thu, Aug 10, 2017, 15:04): > Oh, I found this in the log t

Re: Delay between job manager recovery and job recovery (1.3.2)

2017-08-10 Thread Gyula Fóra
> Hi, > > Let me also investigate that? Did you observe this in 1.3.2 and not in > 1.3.0 and/or 1.3.1 or did you directly go from 1.2.x to 1.3.2? > > Best, > Aljoscha > > On 10. Aug 2017, at 13:31, Gyula Fóra <gyula.f...@gmail.com> wrote: > > Hi! > In som

Delay between job manager recovery and job recovery (1.3.2)

2017-08-10 Thread Gyula Fóra
Hi! In some cases it seems to take a long time for the job to start the zookeeper based job recovery after recovering from a JM failure. Looking at the logs there is a 2 minute gap between the last recovered TM was started successfully and the job recovery: 2017-08-10 13:14:06,369 INFO

Using latency markers

2017-08-10 Thread Gyula Fóra
Hi all! Does anyone have a working example of using the latency markers to test for the topology latency? We are using Flink 1.3.2 and it seems that however we tune it, and whatever job we use, all we get is NaN in the metrics. Maybe we are completely missing something... Thanks! Gyula

Re: High back-pressure after recovering from a save point

2017-07-14 Thread Gyula Fóra
It will work if you assign a new uid to the Kafka source. Gyula On Fri, Jul 14, 2017, 18:42 Tzu-Li (Gordon) Tai wrote: > One thing: do note that `FlinkKafkaConsumer#setStartFromLatest()` does not > have any effect when starting from savepoints. > i.e., the consumer will

Re: Why would a kafka source checkpoint take so long?

2017-07-14 Thread Gyula Fóra
<se...@apache.org> wrote (on Wed, Jul 12, 2017, 15:27): > Can it be that the checkpoint thread is waiting to grab the lock, which is > held by the chain under backpressure? > > On Wed, Jul 12, 2017 at 12:23 PM, Gyula Fóra <gyula.f...@gmail.com> wrote: > >

Re: sanity check in production

2017-07-12 Thread Gyula Fóra
Hi! Assuming you have some spare compute resources on your cluster (which you should have in a production setting to be safe), I think 2) would be the best option, ideally started from a savepoint of the production job so that you verify your state logic as well. You could also run the test job on a

Re: [POLL] Who still uses Java 7 with Flink ?

2017-07-12 Thread Gyula Fóra
+1 for dropping Java 1.7 from me as well. Gyula On Wed, Jul 12, 2017, 17:53 Ted Yu wrote: > +1 on dropping support for Java 1.7 > > Original message > From: Robert Metzger > Date: 7/12/17 8:36 AM (GMT-08:00) > To: d...@flink.apache.org >

Re: Why would a kafka source checkpoint take so long?

2017-07-12 Thread Gyula Fóra
> Can it be that the checkpoint thread is waiting to grab the lock, which is > held by the chain under backpressure? > > On Wed, Jul 12, 2017 at 12:23 PM, Gyula Fóra <gyula.f...@gmail.com> wrote: > >> Yes thats definitely what I am about to do next but just thought maybe >> someone has

Re: Why would a kafka source checkpoint take so long?

2017-07-12 Thread Gyula Fóra
com> wrote: > Hi, > > could you introduce some logging to figure out from which method call the > delay is introduced? > > Best, > Stefan > > On 12.07.2017 at 11:37, Gyula Fóra <gyula.f...@gmail.com> wrote: > > Hi, > > We are using the latest 1.3.
