Hi Kevin and SpamAssassin Dev Community,

I have made some changes to the draft.

GSoC 2018 Proposal
<https://docs.google.com/document/d/1-OCNv79sHvVViKwnrRYtlMiKWLCzz4xUW4tNOlmaTmw/edit?usp=sharing>
I request you all to rigorously review it and suggest appropriate edits. As this is the final phase of the application period (deadline: 27th March, 16:00 UTC), I would really appreciate it if you could respond before then. This will help me incorporate the suggested changes in time.

Thanks...
Saahil Sirowa
B. Tech Computer Science and Engineering
Indian Institute of Technology, Hyderabad

On Fri, Mar 23, 2018 at 7:55 PM, Saahil Sirowa <cs16btech11...@iith.ac.in> wrote:

> I had some in last 2-3 days. I will update the proposal draft with
> required changes by tomorrow night (Sat night).
>
> Thanks...
> Saahil Sirowa
> B. Tech Computer Science and Engineering
> Indian Institute of Technology, Hyderabad
>
> On Fri 23 Mar, 2018, 18:01 Kevin A. McGrail, <kmcgr...@apache.org> wrote:
>
>> Wanted to check in and see how you are doing. This blog post has gotten
>> some praise
>>
>> https://medium.com/@owtf/google-summer-of-code-writing-a-good-proposal-141b1376f076.
>>
>> --
>> Kevin A. McGrail
>> Asst. Treasurer & VP Fundraising, Apache Software Foundation
>> Chair Emeritus Apache SpamAssassin Project
>> https://www.linkedin.com/in/kmcgrail - 703.798.0171
>>
>> On Wed, Mar 21, 2018 at 7:52 AM, Kevin A. McGrail <kmcgr...@apache.org>
>> wrote:
>>
>>> Comments allowed might be helpful though :-)
>>>
>>> --
>>> Kevin A. McGrail
>>> Asst. Treasurer & VP Fundraising, Apache Software Foundation
>>> Chair Emeritus Apache SpamAssassin Project
>>> https://www.linkedin.com/in/kmcgrail - 703.798.0171
>>>
>>> On Wed, Mar 21, 2018 at 12:36 AM, Rajkiran Rajkumar <
>>> rajkiran2...@gmail.com> wrote:
>>>
>>>> @Saahil, kindly make your doc view-only for people with a link to it.
>>>> Giving edit permissions to the world is a bad idea.
>>>>
>>>> Thanks,
>>>> Rajkiran
>>>>
>>>> On Tue, Mar 20, 2018 at 5:17 PM, Kevin A. McGrail <kmcgr...@apache.org>
>>>> wrote:
>>>>
>>>>> +users
>>>>>
>>>>> All we give is feedback. The submission to GSoC is what matters.
>>>>> So if you mentioned Perl here, that's not going to carry over to the
>>>>> reviewers.
>>>>>
>>>>> Can someone with fresh eyes take a look at this? I read it too
>>>>> recently so I will gloss over it too much.
>>>>>
>>>>> Here are some posts the mentors list thought might be helpful. The
>>>>> first, I believe, covers the POV of someone who did not get selected.
>>>>>
>>>>> https://medium.freecodecamp.org/hacking-gsoc-how-to-gain-real-life-experience-and-support-open-source-b1e6a664f6e4?source=linkShare-53ba2bb84284-1521381334
>>>>>
>>>>> https://sanatt.me/2017/12/30/cracking-google-summer-code-2018/
>>>>>
>>>>> Regards, KAM
>>>>>
>>>>> On Tue, Mar 20, 2018, 03:57 Saahil Sirowa <cs16btech11...@iith.ac.in>
>>>>> wrote:
>>>>>
>>>>>> Hi Kevin and Apache SpamAssassin Dev Community,
>>>>>>
>>>>>> I have addressed all the changes you suggested in the previous draft.
>>>>>> 1) I mentioned learning Perl a week before the community bonding
>>>>>> period. It will not take much time. I can assure you that the
>>>>>> language is not going to be an issue.
>>>>>> 2) I updated the biography part a bit.
>>>>>> 3) Significant changes have been made in the Timeline.
>>>>>> 4) I'm planning to use CMake/Travis CI for automated testing. If
>>>>>> there is a better alternative, please do suggest it.
>>>>>> 5) I gave links to the research papers that I will be reading in the
>>>>>> timeline.
>>>>>> 6) I updated the timeline to include gaining advanced knowledge of
>>>>>> email traffic and spam. I listed some links for that purpose.
>>>>>> 7) I updated the credits.
>>>>>> 8) There are other changes made in various parts of the proposal.
>>>>>>
>>>>>> Thanks for your previous detailed feedback.
>>>>>>
>>>>>> Here is the link to the updated proposal:
>>>>>> GSoC 2018 proposal
>>>>>> <https://docs.google.com/document/d/1-OCNv79sHvVViKwnrRYtlMiKWLCzz4xUW4tNOlmaTmw/edit#heading=h.q7h3lddabdvh>
>>>>>> Please rigorously review it and suggest any changes that I should
>>>>>> make.
>>>>>>
>>>>>> Awaiting a favorable response.
>>>>>>
>>>>>>
>>>>>> Thanks...
>>>>>> Saahil Sirowa
>>>>>> B. Tech Computer Science and Engineering
>>>>>> Indian Institute of Technology, Hyderabad
>>>>>>
>>>>>> On Mon, Mar 19, 2018 at 3:27 AM, Kevin A. McGrail <
>>>>>> kmcgr...@apache.org> wrote:
>>>>>>
>>>>>>> Hi Saahil
>>>>>>>
>>>>>>> re: Perl. As the project is primarily in Perl and you do not list
>>>>>>> that in your Proficiencies or any similar languages like PHP, I
>>>>>>> would address that. The word Perl does not appear a single time.
>>>>>>>
>>>>>>> Your Biography is a little light on why this is something you feel
>>>>>>> you can implement. The mentors will likely NOT be able to help you
>>>>>>> with the science, focusing instead on the community, processes, and
>>>>>>> open source in general.
>>>>>>>
>>>>>>> re: Email and spam, do you have any experience with email traffic or
>>>>>>> spam? If so, add it. If not, explain what you plan to do to address
>>>>>>> that.
>>>>>>>
>>>>>>> Re: Deliverables, I think you'll need to propose the first draft of
>>>>>>> that. But your goal will likely be a plugin for Apache SpamAssassin
>>>>>>> that can be installed and configured to provide multiple
>>>>>>> configurable statistical analysis algorithms to better identify ham
>>>>>>> (good email) and/or spam (bad email).
>>>>>>>
>>>>>>> Please use Apache SpamAssassin to properly brand the title.
>>>>>>>
>>>>>>> Re: Scheduling/timelines, I have no input except that past
>>>>>>> proposals I have read have included more phases and do not add
>>>>>>> "optional" items. I'd prefer to see small increments to make sure
>>>>>>> you stay on schedule and don't get overwhelmed and find yourself way
>>>>>>> behind as the time progresses.
>>>>>>>
>>>>>>> Re: Testing Methodology, this is likely the most critical missing
>>>>>>> part.
>>>>>>> I am a fan of test-driven development, where you set up tests that
>>>>>>> should pass and fail and use continuous testing as you add code to
>>>>>>> confirm your development is progressing well.
>>>>>>>
>>>>>>> This is especially important because spam analysis often doesn't
>>>>>>> work the way people expect, and tests w/statistics can help identify
>>>>>>> issues.
>>>>>>>
>>>>>>> For example, the hypothesis here is that these statistical
>>>>>>> algorithms will be better than Bayes. So you'll need a baseline for
>>>>>>> comparison.
>>>>>>>
>>>>>>> Additionally, even experts in the field are surprised when they
>>>>>>> think something will prove the hamminess of an email but in fact it
>>>>>>> shows the opposite. Real-world example: SPF is a policy that, when
>>>>>>> introduced, was supposed to allow an automated mechanism that says
>>>>>>> "this is an email from a legitimate mail server for my domain".
>>>>>>>
>>>>>>> However, the FIRST wave of people to adopt it were all spammers. So
>>>>>>> it became a spam indicator more than a ham indicator. It was a very
>>>>>>> interesting outcome.
>>>>>>>
>>>>>>> Re: Corpora, you'll want corpora of carefully hand-sorted ham and
>>>>>>> spam. Have you thought about how you'll get that? I *might* be able
>>>>>>> to help but it's 50/50.
>>>>>>>
>>>>>>> Re: You mention reading research papers on statistical algorithms
>>>>>>> from a previous proposal. You'll want to list them to show which
>>>>>>> ones you plan to study.
>>>>>>>
>>>>>>> re: "Discussions with the SA community regarding the various types
>>>>>>> of spams that the present SA can handle." is unclear. What is a
>>>>>>> "type of spam" to you? Do you have a list of types of spam?
>>>>>>>
>>>>>>> re: "Brainstorming with the mentors and SA community about the
>>>>>>> various input features and parameters that can have a huge impact on
>>>>>>> the overall performance of the listed neural nets models." I think
>>>>>>> this is flawed.
>>>>>>> There won't be a ton of people who can discuss this with you.
>>>>>>> You'll likely need to use the scientific process to show what has a
>>>>>>> performance impact. This is not busy work or school work. This is an
>>>>>>> experiment that has not been tried at the SA project.
>>>>>>>
>>>>>>> re: "actively involved with the community." is a stretch. A few
>>>>>>> emails do not active involvement make.
>>>>>>>
>>>>>>> re: Bonding, you might consider raising that to 1-2 major bugs and
>>>>>>> 10-20 minor bugs.
>>>>>>>
>>>>>>> Re: Credits/references, I would add more clarity about where each of
>>>>>>> those references is used.
>>>>>>>
>>>>>>> Regards,
>>>>>>> KAM
>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>