Re: SIGMOD System Award for Apache Spark

2022-05-13 Thread Manolis Gemeliaris
Congratulations, everyone!

On Fri, May 13, 2022 at 8:06 PM Xingbo Jiang wrote:

> Congratulations!
>
> On Fri, May 13, 2022 at 9:43 AM Xiao Li 
> wrote:
>
>> Congratulations to everyone!
>>
>> Xiao
>>
>> On Fri, May 13, 2022 at 9:34 AM Dongjoon Hyun 
>> wrote:
>>
>>> Ya, it's really great! Congratulations to the whole community!
>>>
>>> Dongjoon.
>>>
>>> On Fri, May 13, 2022 at 8:12 AM Chao Sun  wrote:
>>>
 Huge congrats to the whole community!

 On Fri, May 13, 2022 at 1:56 AM Wenchen Fan 
 wrote:

> Great! Congratulations to everyone!
>
> On Fri, May 13, 2022 at 10:38 AM Gengliang Wang 
> wrote:
>
>> Congratulations to the whole spark community!
>>
>> On Fri, May 13, 2022 at 10:14 AM Jungtaek Lim <
>> kabhwan.opensou...@gmail.com> wrote:
>>
>>> Congrats Spark community!
>>>
>>> On Fri, May 13, 2022 at 10:40 AM Qian Sun 
>>> wrote:
>>>
 Congratulations !!!

 On May 13, 2022, at 3:44 AM, Matei Zaharia wrote:

 Hi all,

 We recently found out that Apache Spark received the SIGMOD System Award
 this year, given by SIGMOD (the ACM's data management research
 organization) to impactful real-world and research systems. This puts Spark
 in good company with some very impressive previous recipients. This
 award is really an achievement by the whole community, so I wanted to say
 congrats to everyone who contributes to Spark, whether through code, issue
 reports, docs, or other means.

 Matei





An online kmeans algorithm for Spark

2022-05-05 Thread Manolis Gemeliaris
Hello everyone on the Apache Spark dev team,

My name is Manolis Gemeliaris and I am a student at the Hellenic
Mediterranean University (formerly TEI of Crete). For my thesis project I
would like to add an online k-means algorithm to Apache Spark, based on the
paper <https://arxiv.org/abs/1412.5721> by Edo Liberty et al. and the
authors' Python implementation
<https://github.com/sviri/kmeans/tree/main/onlineKmeans/src>.
From what I have read, getting something like this officially accepted is a
long process. So I would like to build it as an open-source third-party
package instead, compatible with Apache Spark 3.
I have already read the contribution guidelines for Spark and spent some
time studying the code on GitHub.

I would like to ask if anyone can find the time to help me get started. I
realize that your time is valuable, so any tips that you can share would be
greatly appreciated.

Thank you in advance,
Best Regards,
Manolis Gemeliaris


Add a machine learning algorithm to sparkml

2017-10-20 Thread Manolis Gemeliaris
Hello everyone,

I am an undergraduate student looking for a final-year project. Professor
Minos Garofalakis suggested that, as a project, I could find a machine
learning algorithm that nobody has implemented in spark.ml yet and
implement it.
As the topic is related to contributing code (an algorithm implementation)
to Spark, I am addressing the question to you as well.
Are there any suggestions about which algorithm is currently missing from
spark.ml and would be a good option to implement?
(e.g. k-means and LDA are already there and so is lsvm)

Thanks in advance.