I'm sorry, I missed some important information. I use Spark version 2.0.2
with Scala 2.11.8.
2017-03-14 13:44 GMT+01:00 Julian Keppel <juliankeppel1...@gmail.com>:
Hi everybody,
I am running some experiments with the Spark k-means implementation of the
new DataFrame API. I compare the clustering results of different runs with
different parameters. I noticed that for the random initialization mode,
the seed value is the same every time. How is it calculated? In my
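For what it's worth, if I read the spark.ml source correctly, the `seed` param of an estimator defaults to a hash of the estimator's class name, so it is identical on every run unless you override it. A rough, untested sketch to get varying random initializations (the value of k is just an example):

```scala
import org.apache.spark.ml.clustering.KMeans

// The default seed is derived from the class name, so "random" init
// is actually reproducible across runs. Passing an explicit, changing
// seed gives a different initialization each time.
val kmeans = new KMeans()
  .setK(10)                   // example value
  .setInitMode("random")
  .setSeed(System.nanoTime()) // different seed per run
```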
Hi,
I use Spark 2.0.2 and want to do the following:
I extract features in a streaming job and then apply the records to a
k-means model. Some of the features are simple ones which are calculated
directly from the record. But I also have more complex features which
depend on records from a
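One possible shape for this (a sketch, not tested; `extractFeatures` and the model path are placeholders, not from the original mail) is to fit the k-means model offline, save it, and apply it per micro-batch:

```scala
import org.apache.spark.ml.clustering.KMeansModel

// Load a model that was fitted and saved in an offline job (path is hypothetical).
val model = KMeansModel.load("/models/kmeans")

stream.foreachRDD { rdd =>
  import spark.implicits._
  // extractFeatures is a placeholder for your own feature logic,
  // returning an org.apache.spark.ml.linalg.Vector per record.
  val df = rdd.map(extractFeatures).toDF("features")
  val clustered = model.transform(df) // adds a "prediction" column with the cluster id
  // ... write or forward `clustered` ...
}
```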
I do research in anomaly detection with machine learning methods at the
moment. Currently I also do k-means clustering, in an offline learning
setting. In further work we want to compare the two paradigms of offline
and online learning. I would like to share some thoughts on this
discussion.
Hello,
I use Spark 2.0.2 with the Kafka 0.8 integration. The Kafka version is 0.10.0.1
(Scala 2.11). I read data from Kafka with the direct approach. The complete
infrastructure runs on Google Container Engine.
I wonder why the corresponding application UI says the input rate is zero
records per
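For reference, the direct approach with the 0.8 integration is typically wired up roughly like this (broker address and topic name are placeholders, not from the original mail):

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils

// Direct approach: no receivers; Spark itself queries Kafka for offsets
// and reads each batch's range of offsets in parallel.
val kafkaParams = Map("metadata.broker.list" -> "broker:9092")
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Set("my-topic"))
```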
Okay, thank you! Can you say when this feature will be released?
2016-10-13 16:29 GMT+02:00 Cody Koeninger :
> As Sean said, it's unreleased. If you want to try it out, build Spark
>
> http://spark.apache.org/docs/latest/building-spark.html
>
> The easiest way to include
Yes, but what they do is only add new elements to a state which is
passed as a parameter. But my problem is that my "counter" (the HyperLogLog
object) comes from outside and is not passed to the function. So I have to
track the state of this "external" HLL object across the whole lifecycle
of
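In case it helps, one way around an external counter object is to keep the HyperLogLog itself inside Spark's keyed state, e.g. with `mapWithState`. An untested sketch, assuming Algebird's HLL and a keyed stream of serialized values:

```scala
import com.twitter.algebird.{HLL, HyperLogLogMonoid}
import org.apache.spark.streaming.{State, StateSpec}

val hllMonoid = new HyperLogLogMonoid(bits = 12)

// The HLL lives in Spark's state store, not outside the function,
// so it survives across batches without any external tracking.
def updateHll(key: String, value: Option[Array[Byte]], state: State[HLL]): Long = {
  val current = state.getOption.getOrElse(hllMonoid.zero)
  val updated = value.fold(current)(v => hllMonoid.plus(current, hllMonoid.create(v)))
  state.update(updated)
  updated.approximateSize.estimate // current cardinality estimate for this key
}

// keyedStream: DStream[(String, Array[Byte])]
val estimates = keyedStream.mapWithState(StateSpec.function(updateHll _))
```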