Re: First Time contribution.

2023-09-17 Thread Haejoon Lee
Welcome Ram! :-) I would recommend checking out https://issues.apache.org/jira/browse/SPARK-37935 as a starter task. Refer to https://github.com/apache/spark/pull/41504 and https://github.com/apache/spark/pull/41455 as example PRs. Or you can also add a new sub-task if you find any error

Re: First Time contribution.

2023-09-17 Thread Denny Lee
Hi Ram, We have some good guidance at https://spark.apache.org/contributing.html HTH! Denny On Sun, Sep 17, 2023 at 17:18 ram manickam wrote: > > > > Hello All, > Recently, joined this community and would like to contribute. Is there a > guideline or recommendation on tasks that can be

Re: About contribution

2022-01-06 Thread Dennis Jung
Oh, yes. Also check on that. I just want to know if there's a bit more detail about contribution, because I want not only to contribute but also to understand the Spark project more deeply. - To review the base code, what is a good point to start? - Or recommending a blog post or document

Re: About contribution

2022-01-05 Thread Sean Owen
park in work, and try to make a > contribution to this project. > > I'm currently looking at documents in more detail, and checking the issue > in JIRA now. Is there some suggestion of reviewing the code? > > - Which code part will be good to start? > - What will be more helpful for the project? > > Thanks. >

About contribution

2022-01-04 Thread Dennis Jung
Hello, I hope this is not a silly question. (I couldn't find any chat room for the Spark project, so I am asking by mail.) I have been using Spark at work for about a year and would like to try to contribute to this project. I'm currently looking at the documents in more detail, and checking the issue in JIRA

Re: Apache Training contribution for Spark - Feedback welcome

2019-07-30 Thread Lars Francke
On Mon, Jul 29, 2019 at 2:46 PM Sean Owen wrote: > TL;DR is: take the below as feedback to consider, and proceed as you > see fit. Nobody's suggesting you can't do this. > > On Mon, Jul 29, 2019 at 2:58 AM Lars Francke > wrote: > > The way I read your point is that anyone can publish material

Re: Apache Training contribution for Spark - Feedback welcome

2019-07-29 Thread Sean Owen
TL;DR is: take the below as feedback to consider, and proceed as you see fit. Nobody's suggesting you can't do this. On Mon, Jul 29, 2019 at 2:58 AM Lars Francke wrote: > The way I read your point is that anyone can publish material (which includes > source code) under the ALv2 outside of the

Re: Apache Training contribution for Spark - Feedback welcome

2019-07-29 Thread Lars Francke
Happy to discuss this here but you're also invited to bring those points up at dev@training as other projects might have similar concerns. The request for assistance still stands. If anyone here is interested in helping out reviewing and improving the material please reach out. On Sat, Jul 27,

Re: Apache Training contribution for Spark - Feedback welcome

2019-07-26 Thread Sean Owen
On Fri, Jul 26, 2019 at 4:01 PM Lars Francke wrote: > I understand why it might be seen that way and we need to make sure to point > out that we have no intention of becoming "The official Apache Spark > training" because that's not our intention at all. Of course that's the intention; the

Re: Apache Training contribution for Spark - Feedback welcome

2019-07-26 Thread Lars Francke
panies have created Spark training courses. I wouldn't be surprised if it goes into the hundreds. And everyone draws the same or very similar slides (what's an RDD, what's a DataFrame etc.) We hope to change that and this contribution can be a first start. We did some research around training and espe

Re: Apache Training contribution for Spark - Feedback welcome

2019-07-26 Thread Sean Owen
Generally speaking, I think we want to encourage more training and tutorial content out there, for sure, so, the more the merrier. My reservation here is that as an Apache project, it might appear to 'bless' one set of materials as authoritative over all the others out there. And there are

Apache Training contribution for Spark - Feedback welcome

2019-07-26 Thread Lars Francke
Hi Spark community, you may or may not have heard of a new-ish (February 2019) project at Apache: Apache Training (incubating). We aim to develop training material about various projects inside and outside the ASF: <http://training.apache.org/> One of our users wants to contribute material on

Re: Contribution help needed for sub-tasks of an umbrella JIRA - port *.sql tests to improve coverage of Python, Pandas, Scala UDF cases

2019-07-09 Thread Hyukjin Kwon
, Jul 9, 2019 at 6:17 AM Hyukjin Kwon wrote: > >> Hi all, >> >> I am currently targeting to improve Python, Pandas UDFs Scala UDF test >> cases by integrating our existing *.sql files at >> https://issues.apache.org/jira/browse/SPARK-27921 >> >> I w

Re: Contribution help needed for sub-tasks of an umbrella JIRA - port *.sql tests to improve coverage of Python, Pandas, Scala UDF cases

2019-07-09 Thread Stavros Kontopoulos
sues.apache.org/jira/browse/SPARK-27921 > > I would appreciate that anyone who's interested in Spark contribution > takes some sub-tasks. It's too many for me to do :-). I am doing one by one > for now. > > I wrote some guides about this umbrella JIRA specifically so if you're >

Contribution help needed for sub-tasks of an umbrella JIRA - port *.sql tests to improve coverage of Python, Pandas, Scala UDF cases

2019-07-08 Thread Hyukjin Kwon
Hi all, I am currently aiming to improve the Python, Pandas, and Scala UDF test cases by integrating our existing *.sql files at https://issues.apache.org/jira/browse/SPARK-27921 I would appreciate it if anyone interested in contributing to Spark took some sub-tasks. It's too many for me to do

Re: Contribution

2019-02-12 Thread Valeria Vasylieva
Hi Gabor, Ok, sure I will! Best regards, Valeria Tue, 12 Feb 2019 at 17:00, Gabor Somogyi : > Hi Valeria, > > Welcome, ping me if you need review. > > BR, > G > > > On Tue, Feb 12, 2019 at 2:51 PM Valeria Vasylieva < > valeria.vasyli...@gmail.com> wrote: > >> Hi Gabor, >> >> Thank you for

Re: Contribution

2019-02-12 Thread Gabor Somogyi
Hi Valeria, Welcome, ping me if you need review. BR, G On Tue, Feb 12, 2019 at 2:51 PM Valeria Vasylieva < valeria.vasyli...@gmail.com> wrote: > Hi Gabor, > > Thank you for clarification! Will do it! > I am happy to join the community! > > Best Regards, > Valeria > > Tue, 12 Feb 2019 at

Re: Contribution

2019-02-12 Thread Valeria Vasylieva
Hi Gabor, Thank you for clarification! Will do it! I am happy to join the community! Best Regards, Valeria Tue, 12 Feb 2019 at 16:32, Gabor Somogyi : > Hi Valeria, > > Glad to hear you would like to contribute! It will be assigned to you when > you create a PR. > Before you create it please

Re: Contribution

2019-02-12 Thread Gabor Somogyi
Hi Valeria, Glad to hear you would like to contribute! It will be assigned to you when you create a PR. Before you create it please read the following guide which describes the details: https://spark.apache.org/contributing.html BR, G On Tue, Feb 12, 2019 at 2:28 PM Valeria Vasylieva <

Contribution

2019-02-12 Thread Valeria Vasylieva
Hi! My name is Valeria Vasylieva and I would like to help with the task: https://issues.apache.org/jira/browse/SPARK-20597 Please assign it to me, my JIRA account is: nimfadora (https://issues.apache.org/jira/secure/ViewProfile.jspa?name=nimfadora) Thank you!

Re: New to dev community | Contribution to Mlib

2017-09-22 Thread Driesprong, Fokko
Hi Venna, Sounds like a very interesting algorithm. I have to agree with Seth, in the end you don't want to add a lot of algorithms to Spark itself, it will blow up the codebase and in the end the tests will run forever. You can also consider publishing it to the Spark Packages website. I've also

Re: New to dev community | Contribution to Mlib

2017-09-21 Thread Venali Sonone
Thank you for your response. The algorithm that I am proposing is Isolation Forest. Link to paper: paper . I particularly find that it should be included in Spark ML because so many applications that use Spark as part of real time

Re: New to dev community | Contribution to Mlib

2017-09-20 Thread Seth Hendrickson
I'm not exactly clear on what you're proposing, but this sounds like something that would live as a Spark package - a framework for anomaly detection built on Spark. If there is some specific algorithm you have in mind, it would be good to propose it on JIRA and discuss why you think it needs to

New to dev community | Contribution to Mlib

2017-09-14 Thread Venali Sonone
Hello, I am new to the Spark dev community and to open source in general, but I have used Spark extensively. I want to create a complete part on anomaly detection in Spark MLlib. To that end, I would like to know if someone could guide me so I can start the development and contribute to Spark MLlib. Sorry

New to dev community | Contribution to Mlib

2017-09-13 Thread Venali Sonone
Hello, I am new to the Spark dev community and to open source in general, but I have used Spark extensively. I want to create a complete part on anomaly detection in Spark MLlib. To that end, I would like to know if someone could guide me so I can start the development and contribute to Spark MLlib. Sorry

Re: Apache Spark Contribution

2017-02-03 Thread Steve Loughran
You might want to look at Nephele: Efficient Parallel Data Processing in the Cloud, Warneke & Kao, 2009 http://stratosphere.eu/assets/papers/Nephele_09.pdf This was some of the work done in the research project which gave birth to Flink, though this bit didn't surface as they chose to leave VM

Apache Spark Contribution

2017-02-02 Thread Gabi Cristache
Hello, My name is Gabriel Cristache and I am a student in my final year of a Computer Engineering/Science University. I want for my Bachelor Thesis to add support for dynamic scaling to a spark streaming application. *The goal of the project is to develop an algorithm that automatically scales

Contribution to Apache Spark

2016-09-03 Thread aditya1702
Hello, I am Aditya Vyas and I am currently in my third year of college, pursuing a BTech in engineering. I know Python and a little bit of Java. I want to start contributing to Apache Spark. This is my first time in the field of Big Data. Can someone please help me with how to get started? Which

Re: Possible contribution to MLlib

2016-06-21 Thread Jeff Zhang
I think it is valuable to make the distance function pluggable and also provide some builtin distance function. This might be also useful for other algorithms besides KMeans. On Tue, Jun 21, 2016 at 7:48 PM, Simon NANTY wrote: > Hi all, > > > > In my team, we are

Possible contribution to MLlib

2016-06-21 Thread Simon NANTY
Hi all, In my team, we are currently developing a fork of Spark MLlib that extends the K-means method so that users can set their own distance function. In this implementation, it would be possible to pass directly, as an argument of the K-means train function, a distance function whose
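
To illustrate the idea of a pluggable distance function, here is a minimal sketch in plain Python. It is not the fork described in this thread and it does not use the Spark MLlib API; the names kmeans, euclidean and manhattan are hypothetical and exist only for this example. It assumes small in-memory data and keeps the usual mean-based centroid update, which is only guaranteed to reduce the objective for squared Euclidean distance.

```python
import math
import random
from typing import Callable, List, Sequence

Point = Sequence[float]
Distance = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Standard Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Point, b: Point) -> float:
    """L1 (city-block) distance, used here as an alternative metric."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans(points: Sequence[Point], k: int,
           distance: Distance = euclidean,
           iterations: int = 20, seed: int = 0) -> List[List[float]]:
    """Toy K-means whose distance function is passed as an argument.

    Hypothetical helper for illustration only, not a Spark API.
    """
    rng = random.Random(seed)
    # Pick k distinct input points as the initial centers.
    centers = [list(p) for p in rng.sample(list(points), k)]
    for _ in range(iterations):
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            # Assignment step: the only place the distance function is used.
            nearest = min(range(k), key=lambda i: distance(p, centers[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:
                # Update step: coordinate-wise mean. For distances other than
                # Euclidean this is a heuristic, not an exact minimizer.
                centers[i] = [sum(c) / len(members) for c in zip(*members)]
    return centers
```

A caller would switch metrics with, for example, kmeans(points, 2, distance=manhattan). Whether the update step should also change with the metric (for instance, medians for L1) is a design question any real implementation would have to settle.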

Re: Unchecked contribution (JIRA and PR)

2015-11-26 Thread Sergio Ramírez
OK, I'll do that. Thanks for the response. On 17/11/15 at 01:36, Joseph Bradley wrote: Hi Sergio, Apart from apologies about limited review bandwidth (from me too!), I wanted to add: It would be interesting to hear what feedback you've gotten from users of your package. Perhaps you

Re: Unchecked contribution (JIRA and PR)

2015-11-16 Thread Joseph Bradley
Hi Sergio, Apart from apologies about limited review bandwidth (from me too!), I wanted to add: It would be interesting to hear what feedback you've gotten from users of your package. Perhaps you could collect feedback by (a) emailing the user list and (b) adding a note in the Spark Packages

Re: Unchecked contribution (JIRA and PR)

2015-11-03 Thread Jerry Lam
Sergio, you are not alone for sure. Check the RowSimilarity implementation [SPARK-4823]. It has been there for 6 months. It is very likely that those which are not merged in the version of Spark they were developed against will never be merged, because Spark changes quite significantly from version to version if

MLlib Contribution

2015-10-15 Thread Kybe67
and for the amazing Spark project. -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/MLlib-Contribution-tp14626.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com

Re: Contribution

2015-06-14 Thread Joseph Bradley
of neural network in the apache spark -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Contribution-tp12739.html

RE: Contribution

2015-06-13 Thread Eron Wright
Date: Fri, 12 Jun 2015 20:16:33 -0700 From: sreenivas.raghav...@gmail.com To: dev@spark.apache.org Subject: Contribution Hi everyone, I am interest to contribute new algorithms and optimize existing algorithms in the area of graph algorithms and machine learning. Please give

Re: Contribution

2015-06-13 Thread Akhil Das

Contribution

2015-06-12 Thread srinivasraghavansr71

Re: [jira] [Commented] (SPARK-6889) Streamline contribution process with update to Contribution wiki, JIRA rules

2015-04-14 Thread Imran Rashid
-- From: Nicholas Chammas (JIRA) j...@apache.org Date: Tue, Apr 14, 2015 at 3:38 PM Subject: [jira] [Commented] (SPARK-6889) Streamline contribution process with update to Contribution wiki, JIRA rules To: iss...@spark.apache.org Nicholas Chammas commented on SPARK-6889

Re: Contribution in java

2014-12-20 Thread Koert Kuipers
yes it does. although the core of spark is written in scala it also maintains java and python apis, and there is plenty of work for those to contribute to. On Sat, Dec 20, 2014 at 7:30 AM, sreenivas putta putta.sreeni...@gmail.com wrote: Hi, I want to contribute for spark in java. Does it

Re: Contribution in java

2014-12-20 Thread vaquar khan
Hi Sreenivas, Please read the Spark docs first; everything is mentioned in the docs. Without reading the docs, how can you contribute? regards, vaquar khan On Sat, Dec 20, 2014 at 6:00 PM, sreenivas putta putta.sreeni...@gmail.com wrote: Hi, I want to contribute for spark in java. Does it support java? please

Re: Spark Contribution

2014-08-23 Thread Nicholas Chammas
the contribution guidelines: https://github.com/blog/1184-contributing-guidelines This is mildly important as the project wants to make it clear that you agree that your contribution is licensed under the AL2, since there is no formal ICLA. How about I propose moving the text to CONTRIBUTING.md

Re: Spark Contribution

2014-08-22 Thread Reynold Xin
Great idea. Added the link https://github.com/apache/spark/blob/master/README.md On Thu, Aug 21, 2014 at 4:06 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: We should add this link to the readme on GitHub btw. On Thursday, August 21, 2014, Henry Saputra henry.sapu...@gmail.com wrote: The

Re: Spark Contribution

2014-08-22 Thread Maisnam Ns
Thanks all, for adding this link . On Sat, Aug 23, 2014 at 5:38 AM, Reynold Xin r...@databricks.com wrote: Great idea. Added the link https://github.com/apache/spark/blob/master/README.md On Thu, Aug 21, 2014 at 4:06 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: We should add

Spark Contribution

2014-08-21 Thread Maisnam Ns
Hi, Can someone help me with some links on how to contribute for Spark Regards mns

Re: Spark Contribution

2014-08-21 Thread Henry Saputra
The Apache Spark wiki on how to contribute should be a great place to start: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark - Henry On Thu, Aug 21, 2014 at 3:25 AM, Maisnam Ns maisnam...@gmail.com wrote: Hi, Can someone help me with some links on how to contribute for

Re: Spark Contribution

2014-08-21 Thread Nicholas Chammas
We should add this link to the readme on GitHub btw. On Thursday, August 21, 2014, Henry Saputra henry.sapu...@gmail.com wrote: The Apache Spark wiki on how to contribute should be a great place to start: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark - Henry On Thu, Aug 21,

Contribution to MLlib

2014-07-09 Thread MEETHU MATHEW
Hi, I am interested in contributing a clustering algorithm to Spark's MLlib. I am focusing on the Gaussian Mixture Model. But I saw a JIRA at https://spark-project.atlassian.net/browse/SPARK-952 regarding the same. I would like to know whether the Gaussian Mixture Model is already implemented or

Re: Contribution to MLlib

2014-07-09 Thread RJ Nowling
Hi Meethu, There is no code for a Gaussian Mixture Model clustering algorithm in the repository, but I don't know if anyone is working on it. RJ On Wednesday, July 9, 2014, MEETHU MATHEW meethu2...@yahoo.co.in wrote: Hi, I am interested in contributing a clustering algorithm towards MLlib

Re: Contribution to MLlib

2014-07-09 Thread Xiangrui Meng
I don't know if anyone is working on it either. If that JIRA is not moved to Apache JIRA, feel free to create a new one and make a note that you are working on it. Thanks! -Xiangrui On Wed, Jul 9, 2014 at 4:56 AM, RJ Nowling rnowl...@gmail.com wrote: Hi Meethu, There is no code for a Gaussian