Re: Can not run scala-shell in yarn mode in flink 1.5

2018-06-06 Thread Till Rohrmann
Hi Jeff, thanks for reporting this issue. The corresponding JIRA issue is here [1]. Until the issue is fixed, I would recommend switching to the legacy mode via setting `mode: legacy` in your flink-conf.yaml. [1] https://issues.apache.org/jira/browse/FLINK-8795 Cheers, Till On Wed, Jun 6, 2018

Re: why does flink release package preferred uber jar than small jar?

2018-06-06 Thread Fabian Hueske
One reason is that we shade away several of dependencies to avoid version conflicts with user dependencies or dependencies of internal dependencies. Best, Fabian 2018-06-05 4:07 GMT+02:00 makeyang : > thanks rongrong, but it seems unrelevant. > > > > -- > Sent from: http://apache-flink-user-mail

Re: TaskManager use more memory than Xmx

2018-06-06 Thread Fabian Hueske
Hi, Flink uses a few libraries that allocate direct (off-heap) memory (Netty, RocksDB). Flink can also allocate direct memory by itself (only relevant for batch setups though). Therefore, Xmx controls only one part of Flink's memory footprint. Best, Fabian 2018-06-04 16:48 GMT+02:00 aitozi : >

Stack Overflow Question

2018-06-06 Thread 孙森
[apache-flink]An input of GenericTypeInfo cannot be converted to Table. Please specify the type of the input with a RowTypeInfo https://stackoverflow.com/q/50718451/6059691?sem=2

Re: File does not exist prevent from Job manager to start .

2018-06-06 Thread miki haiat
I had some zookeeper errors that crashed the cluster ERROR org.apache.flink.shaded.org.apache.curator.ConnectionState - Authentication failed What happen to Flink checkpoint and state if zookeeper cluster is crashed ? Is it possible that the checkpoint/state is written in zookeeper but n

Re: Implementing a “join” between a DataStream and a “set of rules”

2018-06-06 Thread Fabian Hueske
Hi Turar, Managed state is a general concept in Flink's DataStream API and not specifically designed for windows (although they use internally). I'd recommend the broadcast state that Aljoscha proposed. It was specifically designed for these use cases. It is true that the state is currently maint

Datastream[Row] covert to table exception

2018-06-06 Thread 孙森
Hi , I've tried to to specify such a schema, when I read from kafka, and covert inputstream to table . But I got the exception: Exception in thread "main" org.apache.flink.table.api.TableException: An input of GenericTypeInfo cannot be converted to Table. Please specify the type of th

Re: How to read from Cassandra using Apache Flink?

2018-06-06 Thread HarshithBolar
I figured out a way to solve this by writing my own code, but would love to know if there are better - more efficient solutions. Here's the answer - https://stackoverflow.com/questions/50697296/how-to-read-from-cassandra-using-apache-flink/50721953#50721953 @Chesnay I've been wondering about this.

Extending stream events with a an aggregate value

2018-06-06 Thread Nicholas Walton
I’m sure I’m being a complete idiot, since this seems so trivial but if someone could point me in the right direction I’d be very grateful. I have a simple data stream [(Int, Double)] keyed on the Int. I can calculate the running max of the stream no problem using “.max(2)”. But I want to output

Conceptual question

2018-06-06 Thread TechnoMage
We are still pretty new to Flink and I have a conceptual / DevOps question. When a job is modified and we want to deploy the new version, what is the preferred method? Our jobs have a lot of keyed state. If we use snapshots we have old state that may no longer apply to the new pipeline. If we

Re: HA stand alone cluster error

2018-06-06 Thread Gary Yao
Hi Miki, Sorry for the late reply. If you are able to reproduce the first problem, it would be good to see the complete JobManager logs. The second exception indicates that you have not removed all data from ZooKeeper. On recovery, Flink looks up the locations of the submitted JobGraphs in ZooKee

FINAL REMINDER: Apache EU Roadshow 2018 in Berlin next week!

2018-06-06 Thread sharan
Hello Apache Supporters and Enthusiasts This is a final reminder that our Apache EU Roadshow will be held in Berlin next week on 13th and 14th June 2018. We will have 28 different sessions running over 2 days that cover some great topics. So if you are interested in Microservices, Internet of

Checkpointing Large State to S3

2018-06-06 Thread Gregory Fee
Hello Everyone! I am running some streaming Flink jobs using SQL and the table api. I enabled incremental checkpointing to S3 via the RocksDBStateBackend. Even giving it an hour to checkpoint, the checkpoints all fail by timing out. Does anyone have an tips on how to configure the RocksDBStateBack

Re: Datastream[Row] covert to table exception

2018-06-06 Thread 孙森
I’m sorry, the whole code is: class WormholeDeserializationSchema(schema: String) extends KeyedDeserializationSchema[Row] { //var keyValueTopic:KeyValueTopic = _ override def deserialize(messageKey: Array[Byte], message: Array[Byte], topic: String, partition: Int, offset: Long) = {

Re: Datastream[Row] covert to table exception

2018-06-06 Thread Timo Walther
Hi, Row is a very special datatype where Flink cannot generate serializers based on the generics. By default DeserializationSchema uses reflection-based type analysis, you need to override the getResultType() method in WormholeDeserializationSchema. And specify the type information manually t

Re: Datastream[Row] covert to table exception

2018-06-06 Thread Timo Walther
Sorry, I didn't see you last mail. The code looks good actually. What is the result of `inputStream.getType` if you print it to the console? Timo Am 07.06.18 um 08:24 schrieb Timo Walther: Hi, Row is a very special datatype where Flink cannot generate serializers based on the generics. By de