Relayr, a Berlin-based IoT company, is using Apache Flink to process
its sensor data. We are still in the learning phase, but are committed to
using Flink.
Check out our public job ads for Berlin and Munich:
https://relayr.io/jobs/
If you are interested, even if there is no clearly matching
The runtime error occurs because you have non-serializable code in your
'map' operator.
DataStream stockPrices =
    streamExecEnv.addSource(new LookupStockPrice(stockSymbol));
stockPrices.print();
The approach that you took will create infinite stock price sources inside
the map
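The serializability requirement behind this error can be illustrated with plain Java serialization, which is roughly what Flink uses to ship user functions to the task managers. This is a minimal, self-contained sketch; `PriceClient` and `BadMapper` are hypothetical names standing in for a lookup client and a map function that captures it:

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

public class SerializationSketch {

    // Hypothetical stand-in for a non-serializable dependency,
    // e.g. an HTTP client used to look up stock prices.
    static class PriceClient {}

    // A map-style function object that captures the client as a field.
    // The function itself implements Serializable, but the captured
    // field does not, so serializing the function fails.
    static class BadMapper implements Serializable {
        final PriceClient client = new PriceClient();
    }

    // Returns true if the object survives plain Java serialization.
    static boolean isJavaSerializable(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(OutputStream.nullOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            // NotSerializableException is an IOException subclass
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isJavaSerializable(new BadMapper()));   // false
        System.out.println(isJavaSerializable("a plain string")); // true
    }
}
```

The usual fixes are to mark such fields `transient` and create them in a `RichFunction`'s `open()` method, or to make the dependency itself serializable.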
I'm trying to build a sample application using Flink that does the following:
1. Reads a stream of stock symbols (e.g. 'CSCO', 'FB') from a Kafka queue
2. For each symbol performs a real-time lookup of current prices and streams
the values
The program compiles fine but I get the following run-time error
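One pattern that avoids creating a source per symbol is to keep a single Kafka source for all symbols and perform the lookup inside a RichMapFunction, creating the (typically non-serializable) lookup client in `open()`. This is a rough sketch against the 1.2-era APIs; `PriceClient` and `currentPrice()` are hypothetical placeholders for the actual lookup logic, and it assumes `flink-connector-kafka-0.10` on the classpath plus a reachable broker:

```java
import java.util.Properties;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class StockPriceJob {

    // Hypothetical lookup client; not serializable in practice.
    static class PriceClient {
        double currentPrice(String symbol) { return 0.0; }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "stock-prices");

        // One Kafka source for all symbols, not one source per symbol.
        DataStream<String> symbols = env.addSource(
            new FlinkKafkaConsumer010<>("stock-symbols",
                                        new SimpleStringSchema(), props));

        // The lookup lives in a RichMapFunction; the client is transient
        // and created in open() on the task manager, so it is never
        // serialized and shipped from the client program.
        DataStream<Tuple2<String, Double>> prices = symbols
            .map(new RichMapFunction<String, Tuple2<String, Double>>() {
                private transient PriceClient client;

                @Override
                public void open(Configuration parameters) {
                    client = new PriceClient();
                }

                @Override
                public Tuple2<String, Double> map(String symbol) {
                    return Tuple2.of(symbol, client.currentPrice(symbol));
                }
            });

        prices.print();
        env.execute("stock price lookup");
    }
}
```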
Hi,
I’ve noticed something peculiar about the relationship between state size and
cluster size and was wondering if anyone here knows of the reason. I am running
a job with 1 hour tumbling event time windows which have an allowed lateness of
7 days. When I run on a 20-node cluster with FsState
The team I work on at Twitter is hiring a Software Engineer:
https://careers.twitter.com/en/work-for-twitter/software-engineer-data-products.html
Dan
On Fri, Dec 16, 2016 at 5:46 AM Kostas Tzoumas wrote:
> Hi folks,
>
> As promised, here is the first thread for Flink-related job positions. If
>
Also, can you tell us what OS you are running on?
On Fri, Dec 16, 2016 at 6:23 PM, Stephan Ewen wrote:
> Hi!
>
> To diagnose this a little better, can you help us with the following info:
>
> - Are you using RocksDB?
> - What is your Flink configuration, especially around memory settings?
>
Hi!
To diagnose this a little better, can you help us with the following info:
- Are you using RocksDB?
- What is your Flink configuration, especially around memory settings?
- What do you use for TaskManager heap size? Any manual value, or do you
let Flink/YARN set it automatically based o
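For reference, the memory and state-backend settings asked about above usually live in flink-conf.yaml. The values below are illustrative 1.2-era examples, not recommendations:

```yaml
# flink-conf.yaml -- illustrative values only
taskmanager.heap.mb: 4096                  # TaskManager JVM heap size
state.backend: rocksdb                     # or: filesystem / jobmanager
state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
```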
The system metrics [1] are only available on a system level, i.e. not for
an individual job.
The reason is that multiple jobs might run concurrently on the same task
manager JVM process, so it would not be possible to separate their heap
usage.
The same would be true for the approach that monitors t
Hi. Does Apache Flink currently have support for zero downtime or the
ability to do rolling upgrades?
If so, what are concerns to watch for and what best practices might
exist? Are there version management and data inconsistency issues to
watch for?
Hi Folks,
I'm running Flink (1.2-SNAPSHOT nightly) on YARN (Hadoop 2.7.2). A few
hours after I start a streaming job (built using the Kafka connector 0.10_2.11)
it gets killed seemingly for no reason. After inspecting the logs my best
guess is that YARN is killing containers due to high virtual memory u
Hi Stephan,
Thanks for your answer. Is there a way to get metrics such as the latency
of each message in the stream? E.g., I have a Kafka source, a Cassandra
sink, and I do some processing in between. I would like to know how long
it takes for each message from the beginning (entering the Flink str
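One common workaround, if the metrics system does not expose this directly, is to stamp each record with an ingestion timestamp in a map right after the source and compute the difference right before the sink. A minimal pure-Java sketch of the idea (in the job, `stamp` and `elapsedMs` would live inside two MapFunctions; all names here are illustrative):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class LatencyStamp {

    // Attach the current wall-clock time to a record as it enters the job.
    static <T> Map.Entry<Long, T> stamp(T value) {
        return new SimpleEntry<>(System.currentTimeMillis(), value);
    }

    // Compute elapsed milliseconds just before the sink.
    static long elapsedMs(Map.Entry<Long, ?> stamped) {
        return System.currentTimeMillis() - stamped.getKey();
    }

    public static void main(String[] args) {
        Map.Entry<Long, String> rec = stamp("payload");
        // ... processing happens here ...
        System.out.println("elapsed ms: " + elapsedMs(rec));
    }
}
```

Note this measures wall-clock time within one pipeline and assumes reasonably synchronized clocks if source and sink run on different machines.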
Hi!
I am not sure there exists a recommended benchmarking tool. Performance
comparisons depend heavily on the scenarios you are looking at: Simple
event processing, shuffles (grouping aggregation), joins, small state,
large state, etc...
As far as I know, most people try to write a "mock" version
Hi Nick!
In general, checkpoints cannot overtake each other. It can happen (in the
presence of failure/recovery) that a checkpoint is "half complete" and
subsumed by a newer complete checkpoint.
The message "Checkpoint 17 expired before completing" might be correct -
you could check the start time
Hi all,
I am currently playing around with checkpoints to better understand how they
work. I have some questions I hope you can answer.
I am running a simple topology with a source, a map and a sink that writes the
events it receives to a HBase table. The parallelism of the environment is set
t
Looks like this is not specific to the continuous file monitoring, I'm
having the same issue (files in nested directories are not read) when using:
env.readFile(fileInputFormat, "hdfs:///shared/mydir",
             FileProcessingMode.PROCESS_ONCE, -1L)
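One thing worth double-checking in this situation is whether nested enumeration is enabled on the input format; `setNestedFileEnumeration` exists on `FileInputFormat` and controls whether subdirectories are traversed. A sketch (assumes `env` is the StreamExecutionEnvironment from the surrounding code):

```java
// Explicitly enable nested enumeration so files under subdirectories
// of mydir are picked up as well.
TextInputFormat format = new TextInputFormat(new Path("hdfs:///shared/mydir"));
format.setNestedFileEnumeration(true);

env.readFile(format, "hdfs:///shared/mydir",
             FileProcessingMode.PROCESS_ONCE, -1L);
```

If the flag is already set and nested files are still skipped, that would point to a bug rather than a configuration issue.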
2016-12-16 11:12 GMT+01:00 Yassine MARZOUGUI :
> Hi all,
Hi folks,
As promised, here is the first thread for Flink-related job positions. If
your organization is hiring people on Flink-related positions do reply to
this thread with a link for applications.
data Artisans is hiring for multiple technical positions. Help us build
Flink, and help our custom
I have reduced the problem to a simple image [1].
The image shows the streams I have, and the problem now is how
to create a custom window assigner such that objects in B that *don't share*
elements in A are put together in the same window.
Why? Because in order to create elements i
If it is not what Stephan said, you can check requests directly against the
REST API and see if you experience the same slowdown there.
https://ci.apache.org/projects/flink/flink-docs-release-1.2/monitoring/rest_api.html
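For example, timing a couple of the documented endpoints directly (the host is a placeholder; 8081 is the default web/REST port, and `<job-id>` must be substituted):

```shell
# Overview of running and finished jobs
time curl http://jobmanager-host:8081/joboverview

# Details for a specific job
time curl http://jobmanager-host:8081/jobs/<job-id>
```

If these respond quickly while the web UI is slow, the problem is in the UI layer rather than the JobManager itself.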
– Ufuk
On 16 December 2016 at 10:52:18, Yury Ruchin (yuri.ruc...@gmail.co
Hi all,
I'm using the following code to continuously process files from a directory
"mydir".
final StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment();
FileInputFormat fileInputFormat = new TextInputFormat(new
Path("hdfs:///shared/mydir"));
fileInputFormat.setNe
Thanks for the clue, Stephan! I will check.
2016-12-16 12:44 GMT+03:00 Stephan Ewen :
> Not sure if that is the problem, but upon first access to a resource, the
> web server extracts the resource from the JAR file and puts it into the
> temp directory. Maybe that is slow for whatever reason?
>
>
Not sure if that is the problem, but upon first access to a resource, the
web server extracts the resource from the JAR file and puts it into the
temp directory. Maybe that is slow for whatever reason?
On Thu, Dec 15, 2016 at 8:04 PM, Yury Ruchin wrote:
> Hi,
>
> I'm seeing an issue with the loa
Hey Fabian,
Thanks for the quick reply.
I was looking through the Flink metrics [1], but I couldn't find anything in
there about how to measure the pipeline from start to finish, only for
functions that extend RichMapFunction
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/m