Hi,
I am using the DataSet API and reading Avro files as a DataSet. I
am seeing this weird behavior: a record is read correctly from the file
(verified by printing all values), but when this record is passed on to
the Flink chain/DAG (e.g. a KeySelector), every field in the record has
the same value as the fi
Hi Rafal,
From your description, it seems like Flink is complaining because it cannot
access the Elasticsearch API-related dependencies either. You'd also have
to include the following in your Maven build:
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>2.3.2</version>
<type>jar</type>
<overWrite>false</overWrite>
<outputDirectory>${proj
Thanks a lot for the many answers :)
Last time I wrote that email because I didn't understand the
difference between a LOCAL cluster (one node) and IntelliJ IDEA. Now I know :P
https://ci.apache.org/projects/flink/flink-docs-master/apis/cluster_execution.html
If you read this *"Linking with modul
Hi,
Are you sure the elastic cluster is running correctly?
Open a browser and try 127.0.0.1:9200; that should give you an overview of
the cluster. If you don't get it, there is something wrong with the setup.
It's also a good way to double-check the cluster.name (I got that wrong more
than once)
I
Hej,
I have a windowed stream and I want to run a (generic) fold function on it.
The result should have the start and end timestamps of the window as
fields (so I can relate it to the original data). *Is there a simple way to
get the timestamps from within the fold function?*
I could find the
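For reference, one way to get at the window bounds is to use apply() with a
WindowFunction instead of fold(), since a WindowFunction is handed the
TimeWindow and thus window.getStart() and window.getEnd(). A minimal sketch,
with a simple sum standing in for the generic fold and a Tuple2<String, Long>
input type assumed:

import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

DataStream<Tuple3<Long, Long, Long>> result = input
    .keyBy(0)
    .timeWindow(Time.seconds(10))
    .apply(new WindowFunction<Tuple2<String, Long>, Tuple3<Long, Long, Long>, Tuple, TimeWindow>() {
        @Override
        public void apply(Tuple key, TimeWindow window,
                          Iterable<Tuple2<String, Long>> values,
                          Collector<Tuple3<Long, Long, Long>> out) {
            // the "fold" step: aggregate all elements of the window
            long sum = 0;
            for (Tuple2<String, Long> value : values) {
                sum += value.f1;
            }
            // attach the window bounds so the result can be related to the input
            out.collect(new Tuple3<>(window.getStart(), window.getEnd(), sum));
        }
    });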
Hi,
are you perchance using Kafka 0.9?
Cheers,
Aljoscha
On Tue, 10 May 2016 at 08:37 Balaji Rajagopalan <
balaji.rajagopa...@olacabs.com> wrote:
> Robert,
> Regarding the event QPS: 4500 events/sec may not be a large number, but I am
> seeing some issues in processing the events due to processing pow
Seeing how you put a loopback address into the transport addresses, are you
sure that an ElasticSearch node runs on every machine?
On Wed, May 11, 2016 at 7:41 PM, Stephan Ewen wrote:
> ElasticSearch is basically saying that it cannot connect.
>
> Is it possible that the configuration of elastic
ElasticSearch is basically saying that it cannot connect.
Is it possible that the configuration of elastic may be incorrect, or some
of the ports may be blocked?
On Mon, May 9, 2016 at 7:05 PM, rafal green wrote:
> Dear Sir or Madam,
>
> Can you tell me why I have a problem with elasticsearch
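For reference, the elasticsearch2 sink is typically wired up roughly as
below. Note that the transport addresses use the transport port 9300, not
the HTTP port 9200 you would check in a browser; the host name and the sink
function here are placeholders:

import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.flink.streaming.connectors.elasticsearch2.ElasticsearchSink;

Map<String, String> config = new HashMap<>();
// must match the cluster.name configured in elasticsearch.yml on the ES nodes
config.put("cluster.name", "my-cluster-name");

List<InetSocketAddress> transportAddresses = new ArrayList<>();
// use addresses of machines that actually run an ES node, not 127.0.0.1
transportAddresses.add(new InetSocketAddress(InetAddress.getByName("es-node-1"), 9300));

// stream is your DataStream; MyElasticsearchSinkFunction is a placeholder
// for your ElasticsearchSinkFunction implementation
stream.addSink(new ElasticsearchSink<>(config, transportAddresses, new MyElasticsearchSinkFunction()));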
Just to narrow down the problem:
The insertion into HBase actually works, but the job does not finish after
that?
And the same job (same source of data) that writes to a file, or prints,
finishes?
If that is the case, can you check what status each task is in, via the web
dashboard? Are all tasks
I can't help you with the choice of the DB storage; as always, the answer is
"it depends" on a lot of factors :)
From what I can tell, the problem could be that Flink supports HBase 0.98,
so it could be worth updating the Flink connectors to a more recent version
(that should be backward compatible, hope
Hello,
We're implementing a streaming outer join operator based on a
TwoInputStreamOperator with an internal buffer. In our use case, only the
items whose timestamps are within a several-second interval of each other
can join, so we need to synchronize the two input streams to ensure maximal
yield.
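As a much-simplified illustration of the buffering idea (not the actual
TwoInputStreamOperator described above, and ignoring watermarks, which the
real operator would rely on), here is a CoFlatMapFunction sketch with a
hypothetical Event type and a naive eviction policy:

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

// minimal stand-in for the real element type
public class Event {
    public long timestamp;
    public String payload;
}

public class IntervalBufferJoin
        implements CoFlatMapFunction<Event, Event, Tuple2<Event, Event>> {

    private static final long JOIN_INTERVAL_MS = 5000;

    private final List<Event> leftBuffer = new ArrayList<>();
    private final List<Event> rightBuffer = new ArrayList<>();

    @Override
    public void flatMap1(Event left, Collector<Tuple2<Event, Event>> out) {
        leftBuffer.add(left);
        for (Event right : rightBuffer) {
            if (Math.abs(left.timestamp - right.timestamp) <= JOIN_INTERVAL_MS) {
                out.collect(new Tuple2<>(left, right));
            }
        }
        evict(rightBuffer, left.timestamp - JOIN_INTERVAL_MS);
    }

    @Override
    public void flatMap2(Event right, Collector<Tuple2<Event, Event>> out) {
        rightBuffer.add(right);
        for (Event left : leftBuffer) {
            if (Math.abs(left.timestamp - right.timestamp) <= JOIN_INTERVAL_MS) {
                out.collect(new Tuple2<>(left, right));
            }
        }
        evict(leftBuffer, right.timestamp - JOIN_INTERVAL_MS);
    }

    // drop buffered items too old to ever join again (assumes roughly ordered input)
    private static void evict(List<Event> buffer, long minTimestamp) {
        Iterator<Event> it = buffer.iterator();
        while (it.hasNext()) {
            if (it.next().timestamp < minTimestamp) {
                it.remove();
            }
        }
    }
}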
Hadoop 2.7.2
HBase 1.2.1
I have this running from a Hadoop job, but just not from Flink.
I will look into your suggestions, but would I be better off choosing
another DB for storage? I can see that Cassandra gets some attention in
this mailing list. I need to store approx. 2 billion key-value pairs consi
And which version of HBase and Hadoop are you running?
Did you try to put the hbase-site.xml in the jar?
Moreover, I don't know how reliable the web client UI is at the moment;
my experience is that the command-line client is much more reliable.
You just need to run from the flink dir somethi
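Presumably something like the following, where the main class and jar path
are placeholders:

./bin/flink run -c com.example.MyHBaseJob /path/to/my-job.jar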
Actually, model portability and persistence are serious limitations to the
practical use of FlinkML in streaming. If you know what you're doing, you
can write a blunt serializer for your model, write it to a file, and rebuild
the model stream-side from the deserialized information.
I tried it for an SVM mo
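As a concrete (if blunt) sketch of that approach, assuming the model boils
down to a weight array, plain Java serialization and nothing FlinkML-specific:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// batch side: dump the trained model's weights to a file
static void saveModel(double[] weights, String path) throws IOException {
    try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(path))) {
        out.writeObject(weights);
    }
}

// stream side: rebuild the model, e.g. in a RichMapFunction's open()
static double[] loadModel(String path) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(path))) {
        return (double[]) in.readObject();
    }
}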
I run the job from the cluster. I run it through the web UI.
The jar file submitted does not contain the hbase-site.xml file.
- Original message -
> From: Flavio Pompermaier
> To: user
> Date: Wed, 11 May 2016 09:36
> Subject: Re: HBase write problem
>
> Do you run the job from your I
Currently I am not aware of support for streaming learners; you would need
to implement that yourself at this point.
As for streaming predictors for batch learners I have some preview code
that you might like. [1]
[1]
https://github.com/streamline-eu/ML-Pipelines/blob/314e3d940f1f1ac7b762ba96067e13d8
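Beyond that preview code, the usual workaround is to fit the model in a
batch job, extract its weights, and score the DataStream with an ordinary
MapFunction. A sketch with my own names, assuming the weights and intercept
have already been pulled out of the batch-trained model:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;

public class LinearModelScorer implements MapFunction<double[], Double> {
    private final double[] weights;
    private final double intercept;

    public LinearModelScorer(double[] weights, double intercept) {
        this.weights = weights;
        this.intercept = intercept;
    }

    @Override
    public Double map(double[] x) {
        double y = intercept; // linear model: y = w . x + b
        for (int i = 0; i < weights.length; i++) {
            y += weights[i] * x[i];
        }
        return y;
    }
}

// usage, with w and b extracted offline from the batch-trained model:
DataStream<Double> predictions = features.map(new LinearModelScorer(w, b));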
Hi Márton,
I want to train and get the residuals.
Thanks and Regards,
Piyush Shrivastava
http://webograffiti.com
On Wednesday, 11 May 2016 7:19 PM, Márton Balassi
wrote:
Hey Piyush,
Would you like to train or predict on the streaming data?
Best,
Marton
On Wed, May 11, 2016 at 3:44 PM,
Hey Piyush,
Would you like to train or predict on the streaming data?
Best,
Marton
On Wed, May 11, 2016 at 3:44 PM, Piyush Shrivastava
wrote:
> Hello all,
>
> I want to perform linear regression using FlinkML's
> MultipleLinearRegression() function on streaming data.
>
> This function takes a
Hello all,
I want to perform linear regression using FlinkML's MultipleLinearRegression()
function on streaming data.
This function takes a DataSet as an input and I cannot create a DataSet inside
the MapFunction of a DataStream. How can I use this function on my DataStream?
Thanks and Regards,
Hi everyone,
I observed the following behavior with Flink 1.0.2 on Hadoop 2.4.1 with
a yarn session in HA mode:
2016-05-10 18:39:14,546 INFO org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 52444ms for sessionid 0x2544821cf2f818a, clos
Hi Simone,
Fabian has pushed a fix for the streaming TableSources that removed the
Calcite Stream rules [1].
The reported error does not appear anymore with the current master. Could
you please also give it a try and verify that it works for you?
Thanks,
-Vasia.
[1]:
https://github.com/apache/fl
Hi Leon!
I agree with Aljoscha that the term "microbatches" is confusing in that
context. Flink's network layer is "buffer" oriented rather than "record
oriented". Buffering is a best-effort way to gather some elements in cases
where they arrive fast enough that it would not add much latency anyway
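The trade-off is also tunable: a buffer is flushed after a timeout even if
it is not full, so latency can be bounded explicitly. For example:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// default buffer timeout is 100 ms; lower values trade throughput for latency
env.setBufferTimeout(10);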
2016-05-09 14:56 GMT+02:00 Simone Robutti :
> >- You wrote you'd like to "instantiate a H2O's node in every task
> manager". This reads a bit like you want to start H2O in the TM's JVM , but
> I would assume that a H2O node runs as a separate process. So should it be
> started inside the TM JVM or
Hi Gyula,
I tried doing something like the following in the two flatMaps, but I am not
getting the desired results and am still confused about how the concept you
put forward would work:
public static final class MyCoFlatmap implements CoFlatMapFunction{
Centroid[] centroids;
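For comparison, here is one shape the connected-streams pattern could take.
This is my own simplification (double[] points and a double[][] centroid
model instead of the Centroid type above), not Gyula's actual code:
flatMap1 only stores the latest model, flatMap2 labels each point with the
index of the nearest centroid.

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

public static final class CentroidTagger
        implements CoFlatMapFunction<double[][], double[], Tuple2<Integer, double[]>> {

    private double[][] centroids; // latest model, one row per centroid

    @Override
    public void flatMap1(double[][] model, Collector<Tuple2<Integer, double[]>> out) {
        centroids = model; // model stream: just remember the newest centroids
    }

    @Override
    public void flatMap2(double[] point, Collector<Tuple2<Integer, double[]>> out) {
        if (centroids == null) {
            return; // no model received yet; alternatively buffer the point
        }
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < centroids.length; i++) {
            double dist = 0;
            for (int j = 0; j < point.length; j++) {
                double diff = centroids[i][j] - point[j];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        out.collect(new Tuple2<>(best, point));
    }
}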
Do you run the job from your IDE or from the cluster?
On Wed, May 11, 2016 at 9:22 AM, Palle wrote:
> Thanks for the response, but I don't think the problem is the classpath -
> hbase-site.xml should be added. This is what it looks like (hbase conf is
> added at the end):
>
> 2016-05-11 09:16:45
Hi,
latency for Flink and Storm is pretty similar. The only reason I could see
for Flink having the slight upper hand there is the fact that Storm tracks
the progress of every tuple throughout the topology and requires ACKs that
have to flow from the sinks back to the spouts.
As for throughput you are right that F
Thanks for the response, but I don't think the problem is the classpath -
hbase-site.xml should be added. This is what it looks like (hbase conf is
added at the end):
2016-05-11 09:16:45,831 INFO org.apache.zookeeper.ZooKeeper - Client
environment:java.class.path=C:\systems\packages\flink-1.0.2\li