Re: Flink Statefun and Feature computation

2022-03-10 Thread Federico D'Ambrosio
w.youtube.com/channel/UCY8_lgiZLZErZPF47a2hXMA/videos > > On Sun, Feb 20, 2022 at 1:54 PM Federico D'Ambrosio > wrote: > >> Hello everyone, >> >> It's been quite a while since I wrote to the Flink ML, because in my >> current job never actually arose the need for a

Flink Statefun and Feature computation

2022-02-20 Thread Federico D'Ambrosio
use case, or we're better off with a custom flink job. Thank you for your time, -- Federico D'Ambrosio

Help with the correct Event Pattern

2019-07-25 Thread Federico D'Ambrosio
functionalities of Flink CEP. What's the best way to achieve what I want? Is it possible? Should I even use any AfterMatchSkipStrategy? Thank you, Federico D'Ambrosio

Re: Date Handling in Table Source and transformation to DataStream of case class

2019-07-24 Thread Federico D'Ambrosio
> Hi Federico, >>> >>> 1) As far as I know, you can't set a format for timestamp parsing >>> currently (see `SqlTimestampParser`, it just feeds your string to >>> `SqlTimestamp.valueOf`, so your timestamp format must be compatible with >>> SqlTimestam

Re: Date Handling in Table Source and transformation to DataStream of case class

2019-07-24 Thread Federico D'Ambrosio
> SqlTimestamp). > > 2) How do you define your case class? You have to define its parameter > list and nothing in its body to make it work. For example: case class > Event(a: String, b: String, time: Timestamp) > > Federico D'Ambrosio 于2019年7月24日周三 下午4:10写道: > &g

Date Handling in Table Source and transformation to DataStream of case class

2019-07-24 Thread Federico D'Ambrosio
;. What's the correct way to handle case classes? I changed to using a class (which I believe uses the POJO serializer) and it works ok, but I'm still wondering how to make it work with Case Classes, which come quite useful sometimes. Thank you very much, Federico -- Federico D'Ambrosio

Re: Async Source Function in Flink

2018-05-17 Thread Federico D'Ambrosio
-docs-master/dev/stream/operators/asyncio.html#the-need-for-asynchronous-io-operations > > > Am 14.05.18 um 14:51 schrieb Federico D'Ambrosio: > > Hello everyone, > > just wanted to ask a quick question: I have to retrieve data from 2 web > services via REST calls, use th

Async Source Function in Flink

2018-05-14 Thread Federico D'Ambrosio
, for each REST call, Await.result(). Do I need to use Flink's AsyncFunction instead? What are the best practices when it comes to AsyncSources? Thank you, -- Federico D'Ambrosio

Re: Timestamp from Kafka record and watermark generation

2018-02-23 Thread Federico D'Ambrosio
t by copying the code of AscendingTimestampExtractor. > > Sorry for the inconvenience. > > -- > Aljoscha > > On 22. Feb 2018, at 12:05, Federico D'Ambrosio <fedex...@gmail.com> wrote: > > Hello everyone, > > I'm consuming from a Kafka topic, on which I'm writing with a > F

Timestamp from Kafka record and watermark generation

2018-02-22 Thread Federico D'Ambrosio
Hello everyone, I'm consuming from a Kafka topic, on which I'm writing with a FlinkKafkaProducer, with the timestamp relative flag set to true. >From what I gather from the documentation [1], Flink is aware of Kafka Record's timestamp and only the watermark should be set with an appropriate

Re: How to deal with java.lang.IllegalStateException: Could not initialize keyed state backend after a code update

2017-11-28 Thread Federico D'Ambrosio
use Java serialization for user state; ideally we would > want that to be completed removed in the future. > > Cheers, > Gordon > > > On 28 November 2017 at 10:02:19 PM, Federico D'Ambrosio ( > federico.dambro...@smartlab.ws) wrote: > > Hi, > > I recently had to d

How to deal with java.lang.IllegalStateException: Could not initialize keyed state backend after a code update

2017-11-28 Thread Federico D'Ambrosio
of the job, because of restarting it from the very beginning. -- Federico D'Ambrosio

Re: Flink session on yarn

2017-11-20 Thread Federico D'Ambrosio
Cluster(HDP3.6). According to the documentation, > set HADOOP_CONF_DIR and YARN_CONF_DIR as well. > > Any inputs will be really helpful. Thanks! > > -- > Thanks & Regards, > Nishu Tayal > -- Federico D'Ambrosio

Re: FlinkCEP behaviour with time constraints not as expected

2017-11-08 Thread Federico D'Ambrosio
rable[Event]], timestamp: Long) => > TimeoutEvent() > } { > pattern: Map[String, Iterable[Event]] => ComplexEvent() > } > > This syntax is only available in 1.4 though, in previous versions > timeouted events were not returned via sideOutput. > > > > > On 8 Nov 2017, at 1

Re: FlinkCEP behaviour with time constraints not as expected

2017-11-08 Thread Federico D'Ambrosio
vent with value <100 (in fact there was no event at all to be > checked). > > Hope this "study" helps you understand the behaviour. If you feel I missed > something, please provide some example I could reproduce. > > Regards, > Dawid > > 2017-11-07 11:29 GMT+

FlinkCEP behaviour with time constraints not as expected

2017-11-06 Thread Federico D'Ambrosio
.000}]} Now, shouldn't they be in the same List, as they belong to the same iterative pattern, defined with the oneOrMore clause? Thank you for your insight, Federico D'Ambrosio

Re: FlinkCEP, circular references and checkpointing failures

2017-11-03 Thread Federico D'Ambrosio
ease is going to be next week, and I would > expect 1 or 2 more weeks testing. > So I would say in 2.5 weeks. But this is of course subject to potential > issues we may find during testing. > > Cheers, > Kostas > > On Nov 3, 2017, at 4:22 PM, Federico D'Ambrosio < > feder

Re: FlinkCEP, circular references and checkpointing failures

2017-11-03 Thread Federico D'Ambrosio
ug that seems similar to your case: > https://issues.apache.org/jira/browse/FLINK-7756 > > Could you try the current master to see if it fixes your problem? > > Thanks, > Kostas > > On Nov 3, 2017, at 3:12 PM, Federico D'Ambrosio < > federico.dambro...@s

Re: FlinkCEP, circular references and checkpointing failures

2017-11-03 Thread Federico D'Ambrosio
utStream$BlockDataInputStream.skipBlockData(ObjectInputStream.java:2455) at java.io.ObjectInputStream.skipCustomData(ObjectInputStream.java:1951) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1621) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518) at java.io

FlinkCEP, circular references and checkpointing failures

2017-11-03 Thread Federico D'Ambrosio
tStream.readObject0(ObjectInputStream.java:1351) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:20 00) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1 801) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) at org.apache.flink.cep.nfa.NFA$NFASerializer.deserializeCondition(NFA.j ava:1211) at org.apache.flink.cep.nfa.NFA$NFASerializer.deserializeStates(NFA.java :1169) at org.apache.flink.cep.nfa.NFA$NFASerializer.deserialize(NFA.java:957) at org.apache.flink.cep.nfa.NFA$NFASerializer.deserialize(NFA.java:852) at org.apache.flink.runtime.state.heap.StateTableByKeyGroupReaders$State TableByKeyGroupReaderV2V3.readMappingsInKeyGroup(StateTableByKeyGroupReaders.jav a:132) at org.apache.flink.runtime.state.heap.HeapKeyedStateBackend.restorePart itionedState(HeapKeyedStateBackend.java:518) at org.apache.flink.runtime.state.heap.HeapKeyedStateBackend.restore(Hea pKeyedStateBackend.java:397) at org.apache.flink.streaming.runtime.tasks.StreamTask.createKeyedStateB ackend(StreamTask.java:772) at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initK eyedState(AbstractStreamOperator.java:311) ... 6 more What is happening here? Am I doing something wrong? Is there some sort of conflict between within clauses deadlines and checkpoint deadlines? I found the following similar JIRA pages, but none of those mention circular references: https://issues.apache.org/jira/browse/FLINK-6321 https://issues.apache.org/jira/browse/FLINK-7484 https://issues.apache.org/jira/browse/FLINK-7756 Kind Regards, Federico D'Ambrosio

Could not initialize keyed state backend on restart from checkpoint

2017-10-24 Thread Federico D'Ambrosio
o stress that the serializer has always been in the classpath, inside the uber-jar and no change of implementation was made in between executions. I reproduced this behaviour by commenting in and out this sink, rebuilding and restarting the job both from a savepoint and an externalized checkpoint. Do you have any insight on this? Cheers, Federico D'Ambrosio

FlinkCEP: pattern application on a KeyedStream

2017-10-19 Thread Federico D'Ambrosio
EP.pattern(pattern, keyedStream) val anotherKeyedStream = patternKeyedStream.select(...) should only check the pattern on each single partition value. Am I correct in assuming this, or I have misunderstood CEP functioning? -- Federico D'Ambrosio

Re: ArrayIndexOutOfBoundExceptions while processing valve output watermark and while applying ReduceFunction in reducing state

2017-10-12 Thread Federico D'Ambrosio
watermarks? I'm > very eager to figure out what is actually going wrong with asynchronous > checkpoints. > > Best, > Aljoscha > > > On 2. Oct 2017, at 11:57, Federico D'Ambrosio < > federico.dambro...@smartlab.ws> wrote: > > As a followup: > &

Re: How to deal with blocking calls inside a Sink?

2017-10-02 Thread Federico D'Ambrosio
would first try the AsyncIO approach, because actually this is a use > case it was made for. > > Regards, > Timo > > > Am 10/2/17 um 11:53 AM schrieb Federico D'Ambrosio: > > Hi, I've implemented a sink for Hive as a RichSinkFunction, but once I've > integrated

Re: ArrayIndexOutOfBoundExceptions while processing valve output watermark and while applying ReduceFunction in reducing state

2017-10-02 Thread Federico D'Ambrosio
As a followup: the flink job has currently an uptime of almost 24 hours, with no checkpoint failed or restart whereas, with async snapshots, it would have already crashed 50 or so times. Regards, Federico 2017-09-30 19:01 GMT+02:00 Federico D'Ambrosio < federico.dambro...@smartlab.ws>: &

How to deal with blocking calls inside a Sink?

2017-10-02 Thread Federico D'Ambrosio
to be made in the asyncInvoke method of the AsyncFunction? Any suggestion is appreciated. Kind regards, Federico D'Ambrosio

ArrayIndexOutOfBoundExceptions while processing valve output watermark and while applying ReduceFunction in reducing state

2017-09-29 Thread Federico D'Ambrosio
"latest_time") What could be the problem here in your opinion? It's not emitting watermarks correctly? I'm not even how I could reproduce this exceptions, since it looks like they happen pretty much randomly. Thank you all, Federico D'Ambrosio

Question about checkpointing with stateful operators and state recovery

2017-09-28 Thread Federico D'Ambrosio
Hi, I've got a couple of questions concerning the topics in the subject: 1. If an operator is getting applied on a keyed stream, do I still have to implement the CheckpointedFunction trait and define the snapshotState and initializeState methods, in order to successfully recover the state

Re: Unable to run flink job after adding a new dependency (cannot find Main-class)

2017-09-25 Thread Federico D'Ambrosio
discard //Other MergeStrategies } 2017-09-25 11:48 GMT+02:00 Federico D'Ambrosio < federico.dambro...@smartlab.ws>: > Hi Urs, > > Thank you very much for your advice, I will look into excluding those > files directly during the assembly. > > 2017-09-25 10:58 GMT+02

Re: Unable to run flink job after adding a new dependency (cannot find Main-class)

2017-09-25 Thread Federico D'Ambrosio
ys, or simply > one that you don't need anyways. > > Best, > Urs > > On 25.09.2017 10:09, Federico D'Ambrosio wrote: > > Hi Urs, > > > > Yes the main class is set, just like you said. > > > > Still, I might have managed to get it working: during the a

Re: Unable to run flink job after adding a new dependency (cannot find Main-class)

2017-09-25 Thread Federico D'Ambrosio
this weird issue. Using an appropriate pattern for discarding the files during the assembly or removing them via zip -d should be enough (I sure hope so, since this is some of the worst issues I've come across). Federico D'Ambrosio Il 25 set 2017 9:51 AM, "Urs Schoenenberger" <ur

Unable to run flink job after adding a new dependency (cannot find Main-class)

2017-09-23 Thread Federico D'Ambrosio
Hello everyone, I'd like to submit to you this weird issue I'm having, hoping you could help me. Premise: I'm using sbt 0.13.6 for building, scala 2.11.8 and flink 1.3.2 compiled from sources against hadoop 2.7.3.2.6.1.0-129 (HDP 2.6) So, I'm trying to implement an sink for Hive so I added the

Using HiveBolt from storm-hive with Flink-Storm compatibility wrapper

2017-09-22 Thread Federico D'Ambrosio
Hello everyone, I'd like to use the HiveBolt from storm-hive inside a flink job using the Flink-Storm compatibility layer but I'm not sure how to integrate it. Let me explain, I would have the following: val mapper = ... val hiveOptions = ... streamByID .transform[OUT]("hive-sink", new

Re: [DISCUSS] Dropping Scala 2.10

2017-09-21 Thread Federico D'Ambrosio
Ok, thank you all for the clarification. @Stephan: I'm using Kafka 0.10, I guess the problem I had then was actually unrelated to specific Kafka version Federico D'Ambrosio Il 21 set 2017 16:30, "Stephan Ewen" <se...@apache.org> ha scritto: > +1 > > @ Frederico:

Re: [DISCUSS] Dropping Scala 2.10

2017-09-20 Thread Federico D'Ambrosio
r Akka versions only support Scala 2.11/2.12 > and we need a backported feature. > > > > Are there any concerns about this? > > > > Best, > > Aljoscha > > -- Federico D'Ambrosio

Re: Dot notation not working for accessing case classes nested fields

2017-09-15 Thread Federico D'Ambrosio
Great, thanks! The fact that it's actually written on the documentation is really misleading. Thank you very much for your response Federico D'Ambrosio Il 15 set 2017 13:26, "Gábor Gévay" <gga...@gmail.com> ha scritto: > Hi Federico, > > Sorry, nested field expr

Dot notation not working for accessing case classes nested fields

2017-09-14 Thread Federico D'Ambrosio
Hi, I have the following case classes: case class Event(instantValues: InstantValues) case class InstantValues(speed: Int, altitude: Int, time: DateTime) in a DataStream[Event] I'd like to perform a maxBy operation on the field time of instantValue for each event and according to the docs here

Re: Is State access synchronized?

2017-09-11 Thread Federico D'Ambrosio
y the same thread (and never > concurrently) the ValueState (or any state for that matter) will always > return the latest values. > > > On 10.09.2017 14:39, Federico D'Ambrosio wrote: > > Hi, > > as per the mail subject I wanted to ask you if a State access (read and > write)

Is State access synchronized?

2017-09-10 Thread Federico D'Ambrosio
Hi, as per the mail subject I wanted to ask you if a State access (read and write) is synchronized. I have the following stream: val airtrafficEvents = stream .keyBy(_.flightInfo.flight) .map(new UpdateIdFunction()) where UpdateIdFunction is a RichMapFunction with a ValueState and a MapState,

Re: BlobCache and its functioning

2017-08-31 Thread Federico D'Ambrosio
Nico > > On Thursday, 31 August 2017 11:51:27 CEST Federico D'Ambrosio wrote: > > Hi, > > > > 1) I'm using Flink 1.3.2 > > > > 2) Th JobManager log is pretty much the same concerning those lines: > > > > 2017-08-30 14:16:53,343 INFO > > org.apache.zook

Re: BlobCache and its functioning

2017-08-31 Thread Federico D'Ambrosio
ou? > > > Nico > > > On Thursday, 31 August 2017 09:45:59 CEST Fabian Hueske wrote: > > Hi Federico, > > > > Not sure what's going on there but Nico (in CC) is more familiar with the > > blob cache and might be able to help. > > > > Best, Fa

BlobCache and its functioning

2017-08-30 Thread Federico D'Ambrosio
do i speed up this behaviour? Thank you very much for your attention, Federico D'Ambrosio

Re: Flink session on Yarn - ClassNotFoundException

2017-08-30 Thread Federico D'Ambrosio
his: https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/building.html#vendor-specific-versions Federico D'Ambrosio Il 30 ago 2017 13:33, "albert" <alb...@datacamp.com> ha scritto: > Hi Chesnay, > > Thanks for your reply. I did download the binaries matching my H

Re: The implementation of the RichSinkFunction is not serializable.

2017-08-28 Thread Federico D'Ambrosio
and connection as transient wouldn't make any difference. Regards 2017-08-27 14:22 GMT+02:00 Federico D'Ambrosio < federico.dambro...@smartlab.ws>: > Hi, > > could you elaborate, please? Marking conf, connection and table as > transient wouldn't help because of the presence of th

Re: The implementation of the RichSinkFunction is not serializable.

2017-08-27 Thread Federico D'Ambrosio
ializable. An > alternative would be to mark certain non-serializable things as transient, > but as far as I see this is not possible in your case. > > On 27. Aug 2017, at 11:02, Federico D'Ambrosio < > federico.dambro...@smartlab.ws> wrote: > > Hi, > > I'm trying to wr

The implementation of the RichSinkFunction is not serializable.

2017-08-27 Thread Federico D'Ambrosio
Hi, I'm trying to write on HBase using writeOutputFormat using a custom HBase format inspired from this example in flink-hbase (mind you, I'm

Re: Possible conflict between in Flink connectors

2017-08-25 Thread Federico D'Ambrosio
() >> [ERROR] location: variable replicationEndpoint of type >> ReplicationEndpoint >> >> Maybe you can shade the guava dependency in hbase 1.3 >> >> In the upcoming hbase 2.0 release, third party dependencies such as guava >> and netty are shaded. Meaning t

Possible conflict between in Flink connectors

2017-08-25 Thread Federico D'Ambrosio
+- org.apache.hbase:hbase-prefix-tree:1.3.0 (depends on 4.12) [warn] +- org.apache.hbase:hbase-procedure:1.3.0 (depends on 4.12) [warn] +- org.apache.hbase:hbase-client:1.3.0 (depends on 4.12) [warn] +- org.apache.hbase:hbase-common:1.3.0 (depends on 4.12) [warn] +- org.apache.hbase:hbase-server:1.3.0 (depends on 4.12) [warn] +- jline:jline:0.9.94 (depends on 3.8.1) And here I am asking for help here. Thank you very much for your attention, Kind regards, Federico D'Ambrosio