Hello,
Another issue I have encountered is incorrect implicit resolution (I'm
using Scala 2.11.7). Here's the example (with a workaround):
val a = env.fromCollection(Seq(Thing("a", "b"), Thing("c", "d")))
val b = env.fromCollection(Seq(Thing("a", "x"), Thing("z", "m")))
a.coGroup(b)
.where(e
You may want to use the FlinkMLTools.persist() methods, which use
TypeSerializerFormat and don't enforce IOReadableWritable.
On Tue, Mar 29, 2016 at 2:12 PM, Sourigna Phetsarath <
gna.phetsar...@teamaol.com> wrote:
> Till,
>
> Thank you for your reply.
>
> Having this issue though, WeightVector does
Till,
Thank you for your reply.
Having this issue though, WeightVector does not extend IOReadableWritable:
public class SerializedOutputFormat<T extends IOReadableWritable>
case class WeightVector(weights: Vector, intercept: Double) extends Serializable {}
However, I will use the
There is some more detail to this question that I missed initially. It
turns out that my key is a case class of a form MyKey(k1: Option[String],
k2: Option[String]). Keys.SelectorFunctionKeys is performing a recursive
check whether every element of the MyKey class can be a key and fails when
Hi, I am new here...
I am trying to implement online k-means as here
https://databricks.com/blog/2015/01/28/introducing-streaming-k-means-in-spark-1-2.html
with flink.
I don't see anywhere a withBroadcastSet call to save intermediate results.
Is this currently supported?
Is intermediate results
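The streaming k-means update from the linked Databricks post can be sketched as a plain function, independent of Flink (the names and the decay parameterization here are illustrative assumptions, not Flink API):

```scala
// One streaming k-means update for a single centroid, following the
// forgetful update rule described in the linked post:
//   c' = (c * n * a + x * m) / (n * a + m)
// where c is the current center with weight n, x is the mean of the
// m batch points assigned to it, and a is a decay factor in [0, 1].
case class Centroid(center: Vector[Double], weight: Double)

def updateCentroid(c: Centroid, batchMean: Vector[Double],
                   batchCount: Long, decay: Double): Centroid = {
  val n = c.weight * decay
  val m = batchCount.toDouble
  val total = n + m
  val newCenter = c.center.zip(batchMean).map { case (ci, xi) =>
    (ci * n + xi * m) / total
  }
  Centroid(newCenter, total)
}
```

In a Flink job, the current centroids would be the intermediate result one would want to feed back into the next update, which is what the withBroadcastSet question above is getting at.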
Hello,
I'm evaluating Flink and one thing I noticed is Option[A] can't be used as
a key for coGroup (looking specifically here:
https://github.com/apache/flink/blob/master/flink-scala/src/main/scala/org/apache/flink/api/scala/typeutils/OptionTypeInfo.scala#L39).
I'm not clear about the reason of
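Since Option[A] fields are rejected as keys, one workaround (an illustrative assumption, not something stated in this thread) is to flatten the Option fields into plain comparable values inside the key selector:

```scala
// Key type from the earlier message in this digest.
case class MyKey(k1: Option[String], k2: Option[String])

// Map each Option field to a sentinel for None so the resulting key is a
// tuple of plain Strings. "\u0000" is an arbitrary sentinel and assumes
// real key values never contain it.
def flatKey(k: MyKey): (String, String) =
  (k.k1.getOrElse("\u0000"), k.k2.getOrElse("\u0000"))
```

A coGroup could then key on flatKey via a selector function instead of on the raw MyKey with its Option fields.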
Hi Gwenhael,
That is not possible right now. As a workaround, you could have three
DataSets that are constructed by reading recursively from each directory
and unify these later. Alternatively, moving/linking the directories in a
different location would also work.
I agree that it would be nice
Thanks
Copying jars into the lib directory works fine
On Tue, Mar 29, 2016, at 13:34, Balaji Rajagopalan wrote:
> You will have to include dependent jackson jar in flink server lib
> folder, or create a fat jar.
>
> balaji
>
> On Tue, Mar 29, 2016 at 4:47 PM, Bart van Deenen
>
Perhaps there is a misunderstanding on my side over the parallelism and
split management given a data source.
We started from the current JDBCInputFormat to make it multi-thread. Then,
given a space of keys, we create the splits based on a fetchsize set as a
parameter. In the open, we get a
Hi Zach,
For using upsert in ES2, I guess it looks like the following? However, I
cannot find which method in Requests returns an UpdateRequest, while
Requests.indexRequest() returns an IndexRequest. Could you tell me how you did it?
public static UpdateRequest updateIndexRequest(String element) {
That is exactly my point. I should have 32 threads running, but I have only
8. 32 tasks are created, but only 8 are run concurrently. Flavio and I
will try to make a simple program to produce the problem. If we solve our
issues on the way, we'll let you know.
thanks a lot anyway.
saluti,
Hi all
I've successfully built a Flink streaming job, and it runs beautifully in
my IntelliJ ide, with a Flink instance started on the fly. The job eats
Kafka events, and outputs to file. All the i/o is json encoded with
Jackson.
But I'm having trouble with deploying the jar on a Flink server
In fact, I don't use it. I just had to crawl back the runtime
implementation to get to the point where parallelism was switching from 32
to 8.
saluti,
Stefano
2016-03-29 12:24 GMT+02:00 Till Rohrmann :
> Hi,
>
> for what do you use the ExecutionContext? That should
Hi,
for what do you use the ExecutionContext? That should actually be something
which you shouldn’t be concerned with since it is only used internally by
the runtime.
Cheers,
Till
On Tue, Mar 29, 2016 at 12:09 PM, Stefano Bortoli
wrote:
> Well, in theory yes. Each task
Well, in theory yes. Each task has a thread, but only a certain number run
in parallel (the scheduler's job). Parallelism is set in the
environment. However, although the parallelism parameter is set and read
correctly, when it comes to actually starting the threads, the number is
fixed at 8. We
Hey Stefano,
this should work by setting the parallelism on the environment, e.g.
env.setParallelism(32)
Is this what you are doing?
The task threads are not part of a pool, but each submitted task
creates its own Thread.
– Ufuk
On Fri, Mar 25, 2016 at 9:10 PM, Flavio Pompermaier
Hi,
I think Ufuk is completely right. As far as I know, we don't support this
function and nobody's currently working on it. If you like, then you could
take the lead there.
Cheers,
Till
On Mon, Mar 28, 2016 at 10:50 PM, Ufuk Celebi wrote:
> Hey Gna! I think that it's not on
Hi Gna,
there are no utilities yet to do that but you can do it manually. In the
end, a model is simply a Flink DataSet which you can serialize to some
file. Upon reading this DataSet you simply have to give it to your
algorithm to be used as the model. The following code snippet illustrates
this
To my knowledge there is nothing like that. PMML is not supported in any
form and there's no custom saving format yet. If you really need a quick
and dirty solution, it's not that hard to serialize the model into a file.
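The "quick and dirty" route of serializing the model to a file can be sketched with plain Java serialization (SimpleWeightVector here is a simplified stand-in for the real WeightVector, which wraps a FlinkML Vector):

```scala
import java.io._

// Simplified stand-in for a trained model; a plain Array keeps this
// sketch self-contained and serializable out of the box.
case class SimpleWeightVector(weights: Array[Double], intercept: Double)
  extends Serializable

// Write the model to a file with standard Java object serialization.
def saveModel(model: SimpleWeightVector, path: String): Unit = {
  val out = new ObjectOutputStream(new FileOutputStream(path))
  try out.writeObject(model) finally out.close()
}

// Read the model back for use in a new job.
def loadModel(path: String): SimpleWeightVector = {
  val in = new ObjectInputStream(new FileInputStream(path))
  try in.readObject().asInstanceOf[SimpleWeightVector] finally in.close()
}
```

This trades robustness for simplicity: Java serialization is brittle across class changes, which is why the TypeSerializerFormat route mentioned elsewhere in this digest is preferable for anything long-lived.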
2016-03-28 17:59 GMT+02:00 Sourigna Phetsarath :
Hi,
which version of Flink are you using and do you have a custom timestamp
extractor/watermark extractor? The semantics of this changed between 0.10
and 1.0 and I just want to make sure that you get the correct behavior.
Cheers,
Aljoscha
On Tue, 29 Mar 2016 at 10:13 Bart van Deenen
Hi all
I'm doing a fold on a sliding window, using
TimeCharacteristic.EventTime. For output I'm picking the timestamp of
the most recent event in the window, and use that to name the output (to
a file).
My question is: will a second run of Flink on the same set of data (from
Kafka) put the same
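Picking the most recent event time in a window, as described above, can be sketched as a pure fold (Event is a hypothetical type standing in for the actual Kafka events):

```scala
// Hypothetical event type: an event-time timestamp plus a payload.
case class Event(timestamp: Long, payload: String)

// Fold over the window's events, keeping the largest timestamp seen;
// the result is what would be used to name the output file.
def latestTimestamp(events: Seq[Event]): Long =
  events.foldLeft(Long.MinValue)((acc, e) => math.max(acc, e.timestamp))
```

Whether a second run produces the same names then reduces to whether the same events are assigned to the same event-time windows on replay, which is exactly what the timestamp/watermark question above is probing.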