@Saikat: why use EL instead of Lucene directly?
> On Apr 28, 2016, at 12:08 PM, Saikat Kanjilal <sxk1...@hotmail.com> wrote:
>
> This is great information, thank you. Based on this recommendation I won't
> create a JIRA but will start work on my project, and when the code
> approaches the percentages you are describing I will create the appropriate
> JIRAs and put together a proposal to send to the list. Sound OK? Based on
> your latest updates to the wiki I will work on a handful of the clustering
> algorithms, since I see that the Spark implementations for these are not
> yet complete.
> Thank you again
>
>> From: ap....@outlook.com
>> To: dev@mahout.apache.org
>> Subject: Re: Mahout contributions
>> Date: Thu, 28 Apr 2016 01:31:09 +0000
>>
>> Saikat,
>>
>> One other thing that I should say is that you do not need clearance or
>> input from the committers to begin work on your project, and the interest
>> can and should come from the community as a whole. You can write a
>> proposal as you've done, and if you don't see any "+1"s or responses from
>> the community as a whole within a few days, you may want to explain in
>> more detail and give examples and use cases. If you are still not seeing
>> +1s or any responses from others, then I think you can assume that there
>> may not be interest; this is usually how things work.
>>
>> However, if it's something that you're passionate about and you feel like
>> you can deliver, this should not stop you. People do not always read the
>> dev@ emails or have time to respond. You can still move forward with your
>> proposed contribution by following the steps laid out in my previous
>> email; follow the protocol at:
>>
>> http://mahout.apache.org/developers/how-to-contribute.html
>>
>> and create a JIRA. When you have reached a significant amount of
>> completion (around 70-80%), open a PR for review; this way you can explain
>> in more detail.
>>
>> But please realize that when you open a JIRA for a new issue there is some
>> expectation of a commitment on your part to complete it.
>>
>> For example, I am currently investigating some new plotting features. I
>> have already spent a good deal of time on this, this week and last, and am
>> even mocking up code as a sketch of what may become an implementation
>> before I open a "New Feature" JIRA for it.
>>
>> My point is absolutely not to discourage you or anybody else from opening
>> JIRAs for new features, but rather to let you know that when you open a
>> JIRA for a new issue, it tells others that you are working on it, and thus
>> may discourage someone else with a similar idea from contributing this
>> feature. So it is best to open it once you've begun your work and are
>> committed to it.
>>
>> Andy
>>
>> ________________________________________
>> From: Saikat Kanjilal <sxk1...@hotmail.com>
>> Sent: Wednesday, April 27, 2016 8:24 PM
>> To: dev@mahout.apache.org
>> Subject: RE: Mahout contributions
>>
>> Andrew, thank you very much for your input. I actually want to start a new
>> set of JIRAs. Here's what I want to work on: I want to build a framework
>> that ties together search/visualization capability with some machine
>> learning algorithms. Essentially, think of it as tying Elasticsearch and
>> Kibana into Mahout: the user can search for their data with Elasticsearch,
>> and for deeper analysis they can feed that data into one or more Mahout
>> backends. Another interesting tie-in might be to hack Kibana to render
>> ggplot-like graphics based on the output of Mahout algorithms (assuming
>> this can be a Kibana plugin).
>> Before I go hog wild creating a bunch of JIRAs, I'd like to know if
>> there's interest in this initiative. The tool will bring together the ELK
>> stack with dynamic machine learning algorithms. I can go into a lot more
>> detail around use cases if there's enough interest.
>> Looking forward to your and other committers' input. Thanks
>>
>>> From: ap....@outlook.com
>>> To: dev@mahout.apache.org
>>> Subject: Re: Mahout contributions
>>> Date: Wed, 27 Apr 2016 20:16:38 +0000
>>>
>>> Hello Saikat,
>>>
>>> #1 and #2 above are already implemented. #4 is tricky, so I would not
>>> recommend it without a strong knowledge of the codebase, and #5 is now
>>> deprecated. (I've just updated the algorithms grid to reflect this.) The
>>> algorithms page includes both algorithms implemented in the math-scala
>>> library and algorithms which have CLI drivers written for them.
>>>
>>> Please see: http://mahout.apache.org/developers/how-to-contribute.html
>>>
>>> And please note that per that documentation, it is in everybody's best
>>> interest to keep messages on list; contacting committers directly is
>>> discouraged.
>>>
>>> The best way to contribute (if you have not found a new bug or issue)
>>> would be for you to pick a single open issue in the Mahout JIRA which is
>>> not already assigned, and start work on it. When your work is ready for
>>> review, just open up a PR and the committers will review it. Please note
>>> that if you do pick up an issue to work on, we do expect some amount of
>>> responsibility and reliability and a tangible amount of satisfactory
>>> work, since once you've marked a JIRA as something you're working on,
>>> others will pass on it.
>>>
>>> Another good way to contribute would be to look for enhancements that you
>>> could make to existing code, not necessarily open JIRAs that need to be
>>> assigned to you. For example, please see the recent contribution and
>>> workflow on: https://issues.apache.org/jira/browse/MAHOUT-1833 .
>>>
>>> If you have something new that you'd like to implement, simply start a
>>> new JIRA issue and begin work on it. In this case, when you have some
>>> code that is ready for review, you can simply open up a PR for it and
>>> committers will review it.
>>> For new implementations, we generally say that you should do this when
>>> you are at least 70-80% finished with your coding.
>>>
>>> Thank You,
>>>
>>> Andy
>>>
>>> ________________________________________
>>> From: Saikat Kanjilal <sxk1...@hotmail.com>
>>> Sent: Tuesday, April 26, 2016 7:17 PM
>>> To: dev@mahout.apache.org
>>> Subject: RE: Mahout contributions
>>>
>>> Hello, following up on my last email with more specifics: I've looked
>>> through the wiki (https://mahout.apache.org/users/basics/algorithms.html)
>>> and I'm interested in implementing one or more of the following
>>> algorithms with Mahout using Spark:
>>> 1) Matrix Factorization with ALS
>>> 2) Naive Bayes
>>> 3) Weighted Matrix Factorization, SVD++
>>> 4) Sparse TF-IDF Vectors from Text
>>> 5) Lucene integration
>>>
>>> I had a few questions:
>>> 1) Which of these should I start with, and where is there the greatest
>>> need?
>>> 2) Should I fork the repo and create branches for each of the above
>>> implementations?
>>> 3) Should I go ahead and create some JIRAs for these?
>>>
>>> I would love to have some pointers to get started. Regards
>>>
>>> From: sxk1...@hotmail.com
>>> To: dev@mahout.apache.org
>>> Subject: Mahout contributions
>>> Date: Wed, 30 Mar 2016 10:23:45 -0700
>>>
>>> Hello Committers, I was looking through the current JIRA tickets and was
>>> wondering if there's a particular area of Mahout that needs more help
>>> than others. Should I focus on contributing some algorithms using the DSL
>>> or on Samsara-related efforts? I've finally got some bandwidth to do some
>>> work and would love some guidance before assigning myself some tickets.
>>> Regards