Re: BucketingSink to S3: Missing class com/amazonaws/AmazonClientException

2018-10-03 Thread Julio Biason
oints. > > Best, > Andrey > > [1] https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/aws.html#s3-simple-storage-service > > On 2 Oct 2018, at 15:21, Julio Biason wrote: > > Hey guys, > > I've set up a BucketingSink as a dead letter queu

Re: Cancelled job not showing its details

2018-10-02 Thread Julio Biason
Oh, another piece of information: Because the job was failing and restarting, I did a cancel via the CLI tool during one of the restarts. On Tue, Oct 2, 2018 at 4:03 PM, Julio Biason wrote: > Hello, > > I had a job that was failing -- a bug in our code -- so I decided to > cancel i

Cancelled job not showing its details

2018-10-02 Thread Julio Biason
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) -- *Julio Biason*, Software Engineer *AZION* | Deliver. Accelerate. Protect. Office: +55 51 3083 8101 | Mobile: +55 51 *99907 0554*

BucketingSink to S3: Missing class com/amazonaws/AmazonClientException

2018-10-02 Thread Julio Biason
er type of dead letter queue that won't work with BucketingSink and I was thinking about using it directly to create files inside that Ceph/S3.

JobManager in HA with a single node loses leadership

2018-09-27 Thread Julio Biason
at org.apache.flink.runtime.taskexecutor.TaskExecutor$JobManagerHeartbeatListener.lambda$notifyHeartbeatTimeout$0(TaskExecutor.java:1609) ... 15 more If there is a single JobManager in the cluster... who is taking the leadership? Is that even possible?

Re: Trying to figure out why a slot takes a long time to checkpoint

2018-09-24 Thread Julio Biason
oned in my previous mail. I think they might > help more than just logs. Would also be nice if you could create a single > zip file with all the things instead of a bunch of > 50MB logs. > > Best, > Stefan > > On 20.09.2018 at 21:57, Julio Biason wrote: > > Hey guys, >

Re: Trying to figure out why a slot takes a long time to checkpoint

2018-09-19 Thread Julio Biason
ZooKeeper, heartbeats and the S3 disconnecting from being idle. Is there anything else that I should change to DEBUG? Akka? Kafka? Hadoop? ZooKeeper? (Those are, by the default config, bumped to INFO) All of those? On Tue, Sep 18, 2018 at 12:34 PM, Julio Biason wrote: > Hey Till (and others), >

Re: Trying to figure out why a slot takes a long time to checkpoint

2018-09-18 Thread Julio Biason
t is linear with the time and size of the previous task >>> adjacent to it in the diagram. >>> I think your real application is concerned about why Flink accesses HDFS >>> so slowly. >>> You can enable the DEBUG log to see if you can find any clues, or post the >

Re: Trying to figure out why a slot takes a long time to checkpoint

2018-09-14 Thread Julio Biason
at 8:00 PM, Julio Biason wrote: > Hey guys, > > In our pipeline, we have a single slot that is taking longer to create > the checkpoint compared to other slots and we are wondering what could be > causing it. > > The operator in question is the window metric -- the only

Re: Why FlinkKafkaConsumer doesn't subscribe to topics?

2018-09-04 Thread Julio Biason
use flink kafka consumer and we can monitor it > correctly. > > On Tue, Sep 4, 2018 at 3:09 AM Julio Biason > wrote: > >> Hey guys, >> >> We are trying to add external monitoring to our system, but we can only >> get the lag in kafka topics while the Flink j

Why FlinkKafkaConsumer doesn't subscribe to topics?

2018-09-03 Thread Julio Biason
eason for not subscribing to topics that I may have missed?

Re: Counting elements that appear "behind" the watermark

2018-08-01 Thread Julio Biason
<https://github.com/apache/flink/blob/a0f4239faee533545c6d923a944f242b519759d1/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/TimerService.java#L39> > . > > On Tue, Jul 31, 2018 at 7:45 AM Julio Biason > wrote: > >> Thanks for the tips. Unfortunately, it seems `Context` o

Re: Counting elements that appear "behind" the watermark

2018-07-31 Thread Julio Biason
rocessElement method, the > watermark generated from that element won't be the current watermark. > > On Mon, Jul 30, 2018 at 10:33 AM Julio Biason > wrote: > >> Hello, >> >> Our current watermark model is "some time behind the most recent seen >> elemen

Re: flink_taskmanager_job_task_operator_records_lag_max == -Inf on Flink 1.4.2

2018-07-30 Thread Julio Biason
about 2 minutes to write 500 records. I opened the ticket https://issues.apache.org/jira/browse/FLINK-9998 with a bit more information about this ('cause I completely forgot to open a ticket a month ago about this). On Thu, Jun 14, 2018 at 11:31 AM, Julio Biason wrote: > Hey Gordon, >

Counting elements that appear "behind" the watermark

2018-07-30 Thread Julio Biason
iodicWatermarks 'cause it has no `getRuntime()` to attach the metric. Is there any way we can count those (a ProcessFunction before the .assignTimestampsAndWatermarks(), maybe)?

"Futures timed out" when trying to cancel a job with savepoint

2018-07-25 Thread Julio Biason
size taking too long to be saved.

Re: flink_taskmanager_job_task_operator_records_lag_max == -Inf on Flink 1.4.2

2018-06-13 Thread Julio Biason
2fee8c4fc7d55f2",subtask_index="8",task_attempt_id="f36fe63b0688a821f5abf685551c47fa",task_attempt_num="11",task_id="cbc357ccb763df2852fee8c4fc7d55f2",tm_id="9098e39a467aa6c255dcf2ec44544cb2"} 83842 What we have are 3 servers running with 4 slot

Re: flink_taskmanager_job_task_operator_records_lag_max == -Inf on Flink 1.4.2

2018-06-13 Thread Julio Biason
> The above should provide more insight into what may be wrong here. > > - Gordon > > [1] https://issues.apache.org/jira/browse/FLINK-8419 > [2] https://docs.confluent.io/current/kafka/monitoring.html#fetch-metrics > > On 12 June 2018 at 11:47:51 PM, Julio Biason (julio.bia...@a

flink_taskmanager_job_task_operator_records_lag_max == -Inf on Flink 1.4.2

2018-06-12 Thread Julio Biason
one, flink_taskmanager_job_task_operator_records_lag_max is now returning -Inf. Did I miss something?

Re: Cannot submit jobs on a HA Standalone JobManager

2018-05-03 Thread Julio Biason
me issue with the new ResourceManager... On Thu, May 3, 2018 at 11:00 AM, Julio Biason wrote: > Hey Gary, > > Yes, I was still running with the `-m` flag on my dev machine -- partially > configured like prod, but without the HA stuff. I never thought it could be > a problem, since ev

Re: Cannot submit jobs on a HA Standalone JobManager

2018-05-03 Thread Julio Biason
ogs: > Unfortunately, the error message in your previous email is different. If > the > above does not solve your problem, can you attach the logs of the client > and > JobManager? > > Lastly, what Flink version are you running? > > Best, > Gary > > On Wed, May 2,

Re: Cannot submit jobs on a HA Standalone JobManager

2018-05-02 Thread Julio Biason
Exception at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771) at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) at org.apache.flink.runtime.client.JobClient.submitJobDetached(JobClient.java:435) ... 15 more On Wed, May 2, 2018 at 9:52 AM, Julio Bia

Cannot submit jobs on a HA Standalone JobManager

2018-05-02 Thread Julio Biason
o far, I have two different machines running the JobManager and, looking at the logs, I can't see any problem whatsoever to explain why the flink command is refusing to run the pipeline... Any ideas where I should look?

Testing Metrics

2018-04-24 Thread Julio Biason
) but now I'm wondering if there is a way I can integrate this into the test itself. Possible? Not possible? Ideas?

Re: Trying to understand KafkaConsumer_records_lag_max

2018-04-16 Thread Julio Biason
e the consumers effectively jump directly > back to the head again. > > Cheers, > Gordon

Trying to understand KafkaConsumer_records_lag_max

2018-04-13 Thread Julio Biason
ing environment, not production, so we are using smaller machines with few cores [2] and low memory [8 GB]) -- attached Grafana graph for reference. I don't see any checkpoint errors or taskmanager failures, so I don't think it simply dropped everything and started over. Any ideas wha

Re: Side outputs never getting consumed

2018-04-04 Thread Julio Biason
ag, value) > ctx.output(outputTag2, new TestingClass) > ctx.output(outputTag2, new TestA) > } > }) > > counts.getSideOutput(outputTag).print() > counts.getSideOutput(outputTag2).print() > > // execute program env.execute("Stream

Side outputs never getting consumed

2018-04-02 Thread Julio Biason
problem: It seems .getSideOutput() is never actually getting the side output, because the logger in AccoutingSink.toRow() never fires and the data is not showing up in our database (toRow() converts the Metric to a Row and accountingSInk.output returns the JDBCOutputFormat). Any ideas what I need

Re: Out of the blue: "Cannot use split/select with side outputs"

2018-03-19 Thread Julio Biason
Update: Even weirder, I stopped Flink (jobmanager and taskmanager) to increase the number of slots and, upon restart, it crashed again and then processed everything just fine. On Mon, Mar 19, 2018 at 3:01 PM, Julio Biason wrote: > Hey guys, > > I got a weird problem with my pipeline

Out of the blue: "Cannot use split/select with side outputs"

2018-03-19 Thread Julio Biason
arm. We didn't update Flink (still running 1.4.0) so I'm really confused about what's going on here. Any ideas?

Ceph configuration for checkpoints?

2018-02-12 Thread Julio Biason
s and secret keys? I couldn't find anything remotely related to that in the docs...

Extending Flink Slots when running on Yarn

2018-02-01 Thread Julio Biason
start Flink on it. But say I want to extend this cluster to add more machines; in this case, will simply adding more machines to the Hadoop cluster work?

Re: cleaning yarn logs for long-running applications

2018-01-15 Thread Julio Biason
8:12 AM, Soheil Pourbafrani wrote: > Hi, I want to use Yarn as cluster manager for running Flink applications, > but I'm worried about how Flink or Yarn handle local logs in each machine. > Do they clean up old logs for a long-running application? If not, it's > possible the loca

Why did you pick Scala/Java for your project?

2017-12-15 Thread Julio Biason
about the people _building_ Flink stuff.)

Re: [Docs] Can't add metrics to RichFilterFunction

2017-12-14 Thread Julio Biason
Oh, obviously, code is Scala. Also we are using Flink 1.4.0 and flink-metrics-core-1.4-SNAPSHOT. On Thu, Dec 14, 2017 at 10:56 AM, Julio Biason wrote: > Hello, > > I'm trying to add a metric to a filter function, but following the example > in the docs is not working. > >

[Docs] Can't add metrics to RichFilterFunction

2017-12-14 Thread Julio Biason
r:Counter [warn]^ Any ideas? Are the docs wrong?