flink read hdfs file error

2018-01-22 Thread ??????
Dear All, I have a question about Flink & Hadoop. I want to read files on HDFS with Flink, but I encountered the error below. Can you please advise how to solve this problem? It would be much appreciated: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF
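For reference, a minimal sketch of reading a file from HDFS with the DataSet API; the namenode address and path are placeholders, and it assumes the Hadoop dependencies and configuration are available on the classpath. Note that the SLF4J "StaticLoggerBinder" message itself is only a logging warning, so the real failure is usually in the lines that follow it.

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class ReadHdfsFile {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            // hdfs:// URIs are resolved through the Hadoop FileSystem; host, port and path are placeholders
            DataSet<String> lines = env.readTextFile("hdfs://namenode:8020/path/to/file");
            System.out.println("line count: " + lines.count());
        }
    }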

Re: Far too few watermarks getting generated with Kafka source

2018-01-22 Thread Fabian Hueske
Hi William, The TsExtractor looks good. This sounds like strange behavior and should not (or only indirectly) be related to the Kafka source, since the WMs are generated by a separate extractor. - Did you compare the first (and only) generated watermark to the timestamps of the records that are

java.lang.ClassCastException: oracle.sql.TIMESTAMP cannot be cast to java.sql.Timestamp

2018-01-22 Thread Puneet Kinra
Hi, I am getting the above error when deployed to the cluster. I am trying to set the system property, but it is not getting reflected inside the jobs. I also need to schedule the job on a periodic basis; I was thinking of calling the jar from the CLI, putting it into a script, and scheduling it with a cron job in Linux, bu

Re: java.lang.ClassCastException: oracle.sql.TIMESTAMP cannot be cast to java.sql.Timestamp

2018-01-22 Thread Timo Walther
Hi Puneet, Flink SQL only supports java.sql.Timestamp. You need to convert it accordingly in a user-defined function or map function. Regards, Timo On 1/22/18 at 11:38 AM, Puneet Kinra wrote: Hi I am getting the above error when deployed to the cluster, trying to set the System Prop
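A minimal sketch of such a conversion in a map function, assuming the source emits Rows and that field 2 holds the oracle.sql.TIMESTAMP value (the stream name and field index are placeholders):

    DataStream<Row> converted = oracleRows.map(new MapFunction<Row, Row>() {
        @Override
        public Row map(Row row) throws Exception {
            Object raw = row.getField(2); // placeholder index of the timestamp column
            if (raw instanceof oracle.sql.TIMESTAMP) {
                // timestampValue() returns a java.sql.Timestamp, which Flink SQL accepts
                row.setField(2, ((oracle.sql.TIMESTAMP) raw).timestampValue());
            }
            return row;
        }
    });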

Re: java.lang.ClassCastException: oracle.sql.TIMESTAMP cannot be cast to java.sql.Timestamp

2018-01-22 Thread Timo Walther
According to Stack Overflow (https://stackoverflow.com/questions/13269564/java-lang-classcastexception-oracle-sql-timestamp-cannot-be-cast-to-java-sql-ti), setting the system property accordingly should work. Maybe you can share how you do it? Regards, Timo On 1/22/18 at 11:45 AM, Punee
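For reference, the property discussed in that Stack Overflow thread is the Oracle driver's compliance flag; a sketch of two ways to set it so that it is visible where the driver is actually loaded, i.e. in the TaskManager JVMs rather than only in the client (the flink-conf.yaml route is an assumption about the deployment):

    // Option 1: set it programmatically before the Oracle driver class is loaded,
    // e.g. in a RichFunction's open() method so it runs inside the TaskManager JVM:
    System.setProperty("oracle.jdbc.J2EE13Compliant", "true");

    // Option 2: pass it to all Flink JVMs via flink-conf.yaml:
    //   env.java.opts: -Doracle.jdbc.J2EE13Compliant=true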

Re: BucketingSink broken in flink 1.4.0 ?

2018-01-22 Thread Stephan Ewen
Hi! Thanks for diagnosing this - the fix you suggested is correct. Can you still share some of the logs indicating where it fails? The reason is that the fallback code path (using "hdfs://localhost:12345") should not really try to connect to a local HDFS, but simply use this as a placeholder URI

akka.pattern.AskTimeoutException: Ask timed out (after upgrading to flink 1.4.0)

2018-01-22 Thread Bart Kastermans
I have upgraded to flink-1.4.0, with just a local task manager and job manager (flink/bin/start-cluster.sh). After solving the dependency issues, I now get the error below consistently on a specific job. As this means absolutely nothing to me (other than that I realise Flink uses Akka), I have no idea w

ElasticsearchSink in Flink 1.4.0 with Elasticsearch 5.2+

2018-01-22 Thread Adrian Vasiliu
Hello, With a local run of Flink 1.4.0, ElasticsearchSink fails for me with a local run of Elasticsearch 5.6.4 and 5.2.1, while the same code (with adjusted versions of dependencies) works fine with Elasticsearch 2.x (tried 2.4.6). I get: java.lang.NoSuchMethodError: org.elasticsearch.action.bulk.

GetExecutionPlan fails with IllegalArgumentException in Comparator

2018-01-22 Thread Bauss, Julian
Hello everybody, we're currently encountering an exception while generating an ExecutionGraph JSON in Flink v1.3.2. Actually executing the job does not cause an exception and everything works as intended. This has happened since we started adding side outputs to many of our operators. Is this alread
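For context, a minimal sketch of the kind of setup described, with a side output and the plan-JSON call that triggers the exception (env, input, and the tag name are placeholders):

    final OutputTag<String> sideTag = new OutputTag<String>("side-output") {};

    SingleOutputStreamOperator<String> mainStream = input
        .process(new ProcessFunction<String, String>() {
            @Override
            public void processElement(String value, Context ctx, Collector<String> out) {
                ctx.output(sideTag, value); // route a copy to the side output
                out.collect(value);
            }
        });

    DataStream<String> sideStream = mainStream.getSideOutput(sideTag);

    // generating the execution plan JSON is where the comparator exception appears
    String planJson = env.getExecutionPlan();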

Re: NoClassDefFoundError of a Avro class after cancel then resubmit the same job

2018-01-22 Thread Edward
Yes, we've seen this issue as well, though it usually takes many more resubmits before the error pops up. Interestingly, of the 7 jobs we run (all of which use different Avro schemas), we only see this issue on 1 of them. Once the NoClassDefFoundError crops up though, it is necessary to recreate th

Unable to query MapState

2018-01-22 Thread Velu Mitwa
Hi, I am trying to query Flink's MapState from the Flink client (1.3.2). I was able to query ValueState, but when I try to query MapState I get an exception: java.io.IOException: Unconsumed bytes in the deserialized value. This indicates a mismatch in the value serializers used by the KvState

Error with Avro/Kryo after upgrade to Flink 1.4

2018-01-22 Thread Edward
(resubmission of a previous post, since the stack trace didn't show up last time) We're attempting to upgrade our 1.3.2 cluster and jobs to 1.4.0. When submitting jobs to the 1.4.0 Flink cluster, they fail with a Kryo registration error. My jobs are consuming from Kafka topics with messages in

Re: GetExecutionPlan fails with IllegalArgumentException in Comparator

2018-01-22 Thread Fabian Hueske
Hi Julian, I searched for the issue in JIRA [1] but did not find a corresponding issue. Could you open an issue for this bug? Thank you, Fabian [1] https://issues.apache.org/jira/projects/FLINK/summary 2018-01-22 14:11 GMT+01:00 Bauss, Julian: > Hello everybody, we're currently encoun

Re: Error with Avro/Kryo after upgrade to Flink 1.4

2018-01-22 Thread Edward
Also, I'm not sure if this would cause the uninitialized error, but I did notice that in the Maven dependency tree there are two different versions of Kryo listed as Flink dependencies: flink-java 1.4 requires Kryo 2.24, but flink-streaming-java_2.11 requires Kryo 2.21: [INFO] +- org.apache.flink:f

Re: ElasticsearchSink in Flink 1.4.0 with Elasticsearch 5.2+

2018-01-22 Thread Fabian Hueske
Hi Adrian, thanks for raising this issue again. I agree, we should add support for newer ES versions. I've added 1.5.0 as target release for FLINK-7386 and bumped the priority up. In the meantime, you can try Flavio's approach (he responded to the mail thread you linked) and fork and fix the conn

Re: ElasticsearchSink in Flink 1.4.0 with Elasticsearch 5.2+

2018-01-22 Thread Adrian Vasiliu
OK, thanks a lot Fabian. Adrian - Original message - From: Fabian Hueske To: Adrian Vasiliu Cc: user Subject: Re: ElasticsearchSink in Flink 1.4.0 with Elasticsearch 5.2+ Date: Mon, Jan 22, 2018 2:54 PM Hi Adrian, thanks for raising this issue again. I agree, we should add support for newer

Trying to understand why a job is spilling despite of huge memory provided

2018-01-22 Thread Konstantin Gregor
Hello everyone, I have a question about the spilling behavior of a Flink batch job. The relevant part is a standard map-reduce, aggregating 4 billion Tuple3 records via a groupBy(0,1).sum(2); not much else happens in the job. The problem is that I don't understand why this j
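For reference, a self-contained sketch of the aggregation pattern described above (the Tuple3 field types and sample values are assumptions):

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    DataSet<Tuple3<String, String, Long>> input = env.fromElements(
            Tuple3.of("a", "x", 1L),
            Tuple3.of("a", "x", 2L),
            Tuple3.of("b", "y", 3L));

    // group on the first two fields and sum the third, as in the job described above
    input.groupBy(0, 1).sum(2).print();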

Re: Unable to query MapState

2018-01-22 Thread Kostas Kloudas
Hi Velu, I would recommend switching to Flink 1.4, as queryable state has been refactored there to be compatible with all types of state. You can read more here: https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/queryable_state.html
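A minimal sketch of exposing MapState for queries on the job side in Flink 1.4 (the state name, query name, and key/value types are placeholders); the same descriptor is then passed to the QueryableStateClient on the query side:

    MapStateDescriptor<String, Long> descriptor =
            new MapStateDescriptor<>("my-map-state", String.class, Long.class);
    descriptor.setQueryable("query-name"); // registers the state under this name for external queries

    // inside a RichFunction's open():
    MapState<String, Long> state = getRuntimeContext().getMapState(descriptor);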

Flink Kinesis Consumer re-reading merged shards upon restart

2018-01-22 Thread Philip Luppens
Hi everyone, For the past few weeks we've been struggling with Kinesis ingestion using the Flink Kinesis connector, but the seemingly complete lack of similar reports makes us wonder whether we perhaps misconfigured or misused the connector. We're using the connector to subscribe to streams varying from

Re: Flink Kinesis Consumer re-reading merged shards upon restart

2018-01-22 Thread Tzu-Li (Gordon) Tai
Hi Philip, Thanks a lot for reporting this and looking into it in detail. Your observation sounds accurate to me. The `endingSequenceNumber` would no longer be null once a shard is closed, so on restore that would mislead the consumer into thinking that it's a new shard and start consuming it fr

Re: Flink Kinesis Consumer re-reading merged shards upon restart

2018-01-22 Thread Philip Luppens
Hi Gordon, Yeah, I’d need to confirm with our devops guys that this is the case (by default, the Kinesis monitoring doesn’t show how many/which shards were re-ingested, all I remember is seeing the iterator age shooting up again to the retention horizon, but no clue if this was because of 1 shard,

Should multiple apache flink task managers have strong identity? Also, should I point their state.checkpoints.dir to the same HDFS?

2018-01-22 Thread Felipe Cavalcanti
Hi, I'm deploying Flink to Kubernetes and I have some doubts... The first one is whether the task managers should have a strong identity (in which case I will use StatefulSets for deploying them). The second one is whether I should point the RocksDB state.checkpoints.dir in all task managers to the same HDFS path or if eac

state.checkpoints.dir

2018-01-22 Thread Biswajit Das
Hello, Is there any hack to supply *state.checkpoints.dir* as an argument or JVM parameter when running locally? I could change the source of *CheckpointCoordinator* and make it work; I'm just trying to find out whether there are any shortcuts. Thank you ~ Biswajit

Re: state.checkpoints.dir

2018-01-22 Thread Hao Sun
We generate flink-conf.yaml on the fly, so we can use different values based on the environment. On Mon, Jan 22, 2018 at 12:53 PM Biswajit Das wrote: > Hello, Is there any hack to supply *state.checkpoints.dir* as an argument or JVM > parameter when running locally? I can change the source > *Checkp
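Another option when running locally is to hand the setting to the embedded cluster through a Configuration (the path and parallelism are placeholders); a sketch:

    Configuration conf = new Configuration();
    conf.setString("state.checkpoints.dir", "file:///tmp/flink-checkpoints");

    // the configuration is passed to the local JobManager/TaskManager started by this environment
    StreamExecutionEnvironment env =
            StreamExecutionEnvironment.createLocalEnvironment(1, conf);
    env.enableCheckpointing(10_000);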

Re: Far too few watermarks getting generated with Kafka source

2018-01-22 Thread Eron Wright
I think there's a misconception about `setAutoWatermarkInterval`. It establishes the rate at which your periodic watermark generator is polled for the current watermark. Like most generators, `BoundedOutOfOrdernessTimestampExtractor` produces a watermark based solely on observed elements. The
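A minimal sketch to illustrate the distinction (the event type and its timestamp getter are placeholders): setAutoWatermarkInterval only controls how often the extractor below is asked for a watermark, while the watermark value itself is derived from the element timestamps the extractor has observed.

    env.getConfig().setAutoWatermarkInterval(1000L); // poll the periodic generator every second

    DataStream<MyEvent> withTimestamps = events.assignTimestampsAndWatermarks(
            new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(10)) {
                @Override
                public long extractTimestamp(MyEvent event) {
                    // the watermark becomes (max timestamp seen so far) - 10s, emitted at the interval above
                    return event.getEventTimeMillis();
                }
            });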

System.exit() vs throwing exception from the pipeline

2018-01-22 Thread Gordon Weakliem
What's the general advice on calling System.exit() inside an operator, versus throwing an exception and having the execution environment tear down the pipeline? Throwing the exception seems cleaner, but it does appear that Flink might do an orderly shutdown with System.exit(). Will the close() methods b
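To illustrate the "throw instead of exit" option, a sketch (isFatal, transform, and MyFatalDataException are made-up names): throwing fails only that task and lets Flink run its normal failure handling, including close() on operators and the configured restart strategy, whereas System.exit() terminates the whole TaskManager JVM, including any unrelated tasks it hosts.

    stream.map(value -> {
        if (isFatal(value)) {
            // Failing the task this way gives Flink a chance to call close() and apply the restart strategy;
            // System.exit() would kill the entire TaskManager process instead.
            throw new MyFatalDataException("cannot process record: " + value);
        }
        return transform(value);
    });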

Flink streaming (1.3.2) KafkaConsumer08 - Unable to retrieve any partitions

2018-01-22 Thread Tarandeep Singh
Hi, Our Flink streaming job that is reading from an old version of Kafka keeps failing (every 9 minutes or so) with this error: java.lang.RuntimeException: Unable to retrieve any partitions for the requested topics [extracted-dimensions]. Please check previous log entries at org.apache.flink
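For context, a sketch of the 0.8 consumer setup in question (broker and ZooKeeper hosts are placeholders); the 0.8 connector needs bootstrap.servers, zookeeper.connect, and group.id, and the "Unable to retrieve any partitions" error is typically thrown when the partition-metadata request to the brokers fails, so broker reachability from the TaskManagers is a reasonable first thing to check:

    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "broker1:9092"); // used to fetch partition metadata
    props.setProperty("zookeeper.connect", "zk1:2181");     // required by the 0.8 consumer
    props.setProperty("group.id", "my-flink-group");

    DataStream<String> dimensions = env.addSource(
            new FlinkKafkaConsumer08<>("extracted-dimensions", new SimpleStringSchema(), props));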