Re: Website, urgent help needed
Hi Scott,
Create a JIRA ticket and attach your scripts and a text version of the page there.
Best,
Sebastian
On 03/12/2014 03:27 PM, Scott C. Cote wrote:
I took the tour of the text analysis and pushed through despite the problems on the page. Committers helped me over the hump where others might have just given up (to your point). When I did it, I made shell scripts so that my steps would be repeatable, in anticipation of updating the page. Unfortunately, I gave up on trying to figure out how to update the page (there were links indicating that I could do it), and I didn't want to appear stupid by asking how to update the documentation (my bad, not anyone else's). Now I know that it was not possible unless I was a committer. Who should I send my scripts to, or how should I proceed with a current form of the page?
SCott
Re: Website, urgent help needed
I have created issue https://issues.apache.org/jira/browse/MAHOUT-1461
I will upload the shell scripts and suggested replacement text later tonight.
SCott
On 3/13/14, 10:43 AM, Sebastian Schelter s...@apache.org wrote:
Hi Scott,
Create a JIRA ticket and attach your scripts and a text version of the page there.
Best,
Sebastian
Website, urgent help needed
Hi,
As you've probably noticed, I've put in a lot of effort over the last few days to kickstart cleaning up our website. I've thrown out a lot of stuff and have been startled by the amount of outdated and incorrect information on our website, as well as links pointing to nowhere.
I think our lack of documentation makes it super hard for new people to use Mahout. A crucial next step is to clean up the documentation on classification and clustering. I cannot do this alone, because I don't have the time and I'm not so familiar with the background of the algorithms.
I need volunteers to go through all the pages under Classification and Clustering on the website. For the algorithms, the content and claims of the articles need to be checked; for the examples, we need to make sure that everything still works as described. It would also be great to move articles from personal blogs to our website.
Imagine that some developer wants to try out Mahout and takes one hour for that in the evening. She will go to our website, download Mahout, read the description of an algorithm and try to run an example. In the current state of the documentation, I'm afraid that most people will walk away frustrated, because the website does not help them as it should.
Best,
Sebastian
PS: I will make my standpoint on whether Mahout should do a 1.0 release depend on whether we manage to clean up and maintain our documentation.
Re: Website, urgent help needed
Hi Sebastian,
I am afraid I am only familiar with the recommendation part. In previous posts, I pointed out a couple of errors in this wiki page: https://cwiki.apache.org/confluence/display/MAHOUT/Quick+tour+of+text+analysis+using+the+Mahout+command+line
If you are planning to keep it on the new website, I can help by pointing them out again.
Thanks a lot for your effort.
On Wed, Mar 12, 2014 at 7:03 AM, Sebastian Schelter s...@apache.org wrote:
Hi,
As you've probably noticed, I've put in a lot of effort over the last few days to kickstart cleaning up our website. I've thrown out a lot of stuff and have been startled by the amount of outdated and incorrect information on our website, as well as links pointing to nowhere.
I think our lack of documentation makes it super hard for new people to use Mahout. A crucial next step is to clean up the documentation on classification and clustering. I cannot do this alone, because I don't have the time and I'm not so familiar with the background of the algorithms.
I need volunteers to go through all the pages under Classification and Clustering on the website. For the algorithms, the content and claims of the articles need to be checked; for the examples, we need to make sure that everything still works as described. It would also be great to move articles from personal blogs to our website.
Imagine that some developer wants to try out Mahout and takes one hour for that in the evening. She will go to our website, download Mahout, read the description of an algorithm and try to run an example. In the current state of the documentation, I'm afraid that most people will walk away frustrated, because the website does not help them as it should.
Best,
Sebastian
PS: I will make my standpoint on whether Mahout should do a 1.0 release depend on whether we manage to clean up and maintain our documentation.
Re: Website, urgent help needed
I'll help with the clustering algorithms documentation. Do send me the old documentation and I will check it and remove errors. Or better, let me know how to proceed.
Pavan
Re: Website, urgent help needed
Hi. I just read the whole email now, as I was travelling earlier. I am on it.
Re: Website, urgent help needed
I can confirm what Sebastian said. I'm fairly new to this, and at some point I found myself so desperate that I almost gave up on Mahout due to the lack of documentation. My feeling is that this doesn't only concern the website: the API is too sparsely documented as well.
At this point there is no simple way for a beginner to know what kind of format any one of the algorithms expects and what exactly it outputs, how to chain processes, etc. They might go as far as reading the javadoc (although not everyone does that), but they won't all, as I had to and did, download the sources and try to make sense of them to get the information.
Fortunately, the mailing list is particularly active and one can find the answers if one has the time and the will to search it and ask kindly, which is a great strength of Mahout, but the average beginner who just wants to try the library can't and won't do that.
I'm willing to document the parts of the code I have used and begun to understand; however, so far I've had difficulties setting up the Maven project in Eclipse. Also, since I'm Belgian, English is not my mother tongue, so I'm almost certain to make mistakes, but I think it would take you less time to correct these few English mistakes than to write the documentation :)
I'll go ahead and try to set things up with Eclipse, and if I don't succeed I'll write a mail to the dev list for help with that.
If I find the time, I can also continue reporting bugs and broken or inaccurate links and descriptions on the website, if need be, and update my JIRA entry accordingly.
Kévin Moulart
2014-03-12 8:48 GMT+01:00 Pavan Kumar N pavan.naraya...@gmail.com:
I'll help with the clustering algorithms documentation. Do send me the old documentation and I will check it and remove errors. Or better, let me know how to proceed.
Pavan
Re: Website, urgent help needed
We don't have exactly that page, but we have pages that touch parts of it, such as https://mahout.apache.org/users/basics/creating-vectors-from-text.html
It would be great if you could create a JIRA ticket which lists the errors. I'll fix them then.
Best,
Sebastian
On 03/12/2014 08:42 AM, Juan José Ramos wrote:
Hi Sebastian,
I am afraid I am only familiar with the recommendation part. In previous posts, I pointed out a couple of errors in this wiki page: https://cwiki.apache.org/confluence/display/MAHOUT/Quick+tour+of+text+analysis+using+the+Mahout+command+line
If you are planning to keep it on the new website, I can help by pointing them out again.
Thanks a lot for your effort.
Re: Website, urgent help needed
Hi Pavan,
Awesome that you're willing to help. The documentation consists of the pages listed under Clustering in the navigation bar on mahout.apache.org.
If you start working on one of the pages listed there (e.g. the k-Means doc), please create a JIRA ticket in our issue tracker with a title along the lines of "Cleaning up the documentation for k-Means on the website". Put a list of errors and corrections into the ticket and I (or some other committer) will make sure to fix the website.
Thanks,
Sebastian
On 03/12/2014 08:48 AM, Pavan Kumar N wrote:
I'll help with the clustering algorithms documentation. Do send me the old documentation and I will check it and remove errors. Or better, let me know how to proceed.
Pavan
Re: Website, urgent help needed
Hi Manoj,
Awesome that you're willing to help. I suggest we proceed analogously to the clustering cleanup: the documentation consists of the pages listed under Classification in the navigation bar on mahout.apache.org.
If you start working on one of the pages listed there (e.g. the Naive Bayes doc), please create a JIRA ticket in our issue tracker with a title along the lines of "Cleaning up the documentation for Naive Bayes on the website". Put a list of errors and corrections into the ticket and I (or some other committer) will make sure to fix the website.
Best,
Sebastian
On 03/12/2014 09:05 AM, Manoj Awasthi wrote:
Thanks, Sebastian, to you and others for the effort in cleaning up the website interface. It looks much better (fonts, layout) and is much more usable, if I may say. I will be happy to volunteer for the pages under Classification in whatever way I can. I would especially like to contribute by verifying that the examples provided work in the form in which they exist on the website, and I will be happy to make corrections wherever possible. If there is an initial backlog list which provides tasks at a granular level, that would be great; otherwise I can start looking at the pages myself.
Manoj
Re: Website, urgent help needed
Hi Kevin,
Go to the Eclipse Marketplace and install m2eclipse. After you run mvn install on your Mahout sources, import the compiled Mahout project. I'll try to write detailed documentation with screenshots, but for the moment use the above as a starting point.
On 12 March 2014 15:29, Sebastian Schelter s...@apache.org wrote:
We don't have exactly that page, but we have pages that touch parts of it, such as https://mahout.apache.org/users/basics/creating-vectors-from-text.html
It would be great if you could create a JIRA ticket which lists the errors. I'll fix them then.
Best,
Sebastian
Re: Website, urgent help needed
Hi Kevin,
Thank you for your offer to help! Feel free to ask questions here about how to set up the sources in Eclipse. If you succeed, you could write up what you did and we could add this to the website, as I'm sure a lot of others will have the same problem.
It would be great if you could start improving the javadoc. It's totally fine if your English is not perfect; we can always ask a native speaker to read over it. If you start working on the javadoc, please create a JIRA issue for that work before you start.
Best,
Sebastian
On 03/12/2014 09:30 AM, Kevin Moulart wrote:
I can confirm what Sebastian said. I'm fairly new to this, and at some point I found myself so desperate that I almost gave up on Mahout due to the lack of documentation. My feeling is that this doesn't only concern the website: the API is too sparsely documented as well.
At this point there is no simple way for a beginner to know what kind of format any one of the algorithms expects and what exactly it outputs, how to chain processes, etc. They might go as far as reading the javadoc (although not everyone does that), but they won't all, as I had to and did, download the sources and try to make sense of them to get the information.
Fortunately, the mailing list is particularly active and one can find the answers if one has the time and the will to search it and ask kindly, which is a great strength of Mahout, but the average beginner who just wants to try the library can't and won't do that.
I'm willing to document the parts of the code I have used and begun to understand; however, so far I've had difficulties setting up the Maven project in Eclipse. Also, since I'm Belgian, English is not my mother tongue, so I'm almost certain to make mistakes, but I think it would take you less time to correct these few English mistakes than to write the documentation :)
I'll go ahead and try to set things up with Eclipse, and if I don't succeed I'll write a mail to the dev list for help with that.
If I find the time, I can also continue reporting bugs and broken or inaccurate links and descriptions on the website, if need be, and update my JIRA entry accordingly.
Kévin Moulart
Re: Website, urgent help needed
Hi All,
I would also like to participate in cleaning up the documentation. Since I am fairly new to the Mahout infrastructure, it will in turn help me understand things better. Do we already have a JIRA ticket for organizing the cleanup of the documentation? I just want to be sure that I am not stepping on pages someone else has already updated.
Thanks and regards,
Pramit Choudhary
On Wed, Mar 12, 2014 at 3:07 AM, Sebastian Schelter s...@apache.org wrote:
Hi Kevin,
Thank you for your offer to help! Feel free to ask questions here about how to set up the sources in Eclipse. If you succeed, you could write up what you did and we could add this to the website, as I'm sure a lot of others will have the same problem.
It would be great if you could start improving the javadoc. It's totally fine if your English is not perfect; we can always ask a native speaker to read over it. If you start working on the javadoc, please create a JIRA issue for that work before you start.
Best,
Sebastian
Re: Website, urgent help needed
Here you can see all issues (resolved and unresolved) for the next release: https://issues.apache.org/jira/browse/MAHOUT-1413?jql=project%20%3D%20MAHOUT%20AND%20fixVersion%20%3D%201.0%20ORDER%20BY%20priority%20DESC

When you start to work on the cleanup of a page, make sure that no ticket exists for it yet. If there isn't one, create a Jira ticket with the name of the page in the title.

--sebastian

On 03/12/2014 11:20 AM, pramit choudhary wrote: Hi All, I would also like to participate in cleaning up the documentation. Since I am fairly new to the Mahout infrastructure, it will in turn help me understand things better. Do we already have a Jira ticket for organizing the documentation cleanup? I just want to be sure that I am not stepping on pages someone else has already updated. Thanks and regards, Pramit
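For readers wondering what the Jira link in the message above actually filters on: the part after `jql=` is a URL-encoded JQL query. The following is only a small sketch using Python's standard library to decode it; the helper name `extract_jql` is made up for illustration and is not part of any Jira or Mahout tooling.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied from the message above.
url = ("https://issues.apache.org/jira/browse/MAHOUT-1413"
       "?jql=project%20%3D%20MAHOUT%20AND%20fixVersion%20%3D%201.0"
       "%20ORDER%20BY%20priority%20DESC")


def extract_jql(link: str) -> str:
    """Return the decoded JQL filter carried in a link's 'jql' parameter.

    Hypothetical helper for illustration only.
    """
    query = urlsplit(link).query          # everything after the '?'
    return parse_qs(query)["jql"][0]      # parse_qs percent-decodes the value


jql = extract_jql(url)
print(jql)
```

Decoded, the filter reads `project = MAHOUT AND fixVersion = 1.0 ORDER BY priority DESC`, i.e. all Mahout issues targeted at the 1.0 release, highest priority first. Before opening a new ticket for a page, that list is the place to check for an existing one.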
Re: Website, urgent help needed
Thanks, I'll do that partly in my free time, since I'm working on other things at work right now :)

Kévin Moulart

2014-03-12 11:07 GMT+01:00 Sebastian Schelter s...@apache.org: Hi Kevin, Thank you for offering to help! Feel free to ask questions here about how to set up the sources in Eclipse. If you succeed, you could write up what you did and we could add this to the website, as I'm sure a lot of others will have the same problem. It would be great if you could start improving the javadoc; it's totally fine if your English is not perfect, we can always ask a native speaker to read over it. If you start working on the javadoc, please create a Jira issue for that work before you start. Best, Sebastian
Re: Website, urgent help needed
Thanks Sebastian, that's a great help.

#Pramit

On Wed, Mar 12, 2014 at 3:37 AM, Kevin Moulart kevinmoul...@gmail.com wrote: Thanks, I'll do that partly in my free time, since I'm working on other things at work right now :) Kévin Moulart
Re: Website, urgent help needed
I took the tour of the text analysis and pushed through despite the problems on the page. Committers helped me over the hump where others might have just given up (to your point). When I did it, I made shell scripts so that my steps would be repeatable, in anticipation of updating the page. Unfortunately, I gave up on trying to figure out how to update the page (there were links indicating that I could do it), and I didn't want to appear stupid by asking how to update the documentation (my bad - not anyone else's). Now I know that it was not possible unless I was a committer. Who should I send my scripts to, or how should I proceed with a current form of the page?

SCott

On 3/12/14, 5:02 AM, Sebastian Schelter s...@apache.org wrote: Hi Pavan, Awesome that you're willing to help. The documentation consists of the pages listed under Clustering in the navigation bar on mahout.apache.org. If you start working on one of the pages listed there (e.g. the k-Means doc), please create a Jira ticket in our issue tracker with a title along the lines of "Cleaning up the documentation for k-Means on the website". Put a list of errors and corrections into the Jira ticket and I (or some other committer) will make sure to fix the website. Thanks, Sebastian
Re: Website, urgent help needed
I’ll make it work. Don’t know Markdown (assume some reduced mark“up” language) - but I’ll figure it out. I will assume that I can check with my consulting buddy “Google” and find it. :) Thank you for your contributions - glad that I can give “something” back. I’ll start off by sending the doc to one of the committers, and then if you guys like my work, we can proceed from there…

SCott

On 3/12/14, 9:38 AM, Sebastian Schelter s...@apache.org wrote: Hi Scott, The CMS behind the website uses Markdown. So ideally you would attach a text file with Markdown formatting to a Jira issue and a committer will put that into the website. Does that work for you? PS: There are a lot of online Markdown editors out there.
Re: Website, urgent help needed
Thanks Scott; please just attach your work to an issue in the Jira system; if there's not one already you could file a new issue.

On Mar 12, 2014, at 7:44 AM, Scott C. Cote scottcc...@gmail.com wrote: I’ll make it work. Don’t know Markdown (assume some reduced mark“up” language) - but I’ll figure it out. I will assume that I can check with my consulting buddy “Google” and find it. :) Thank you for your contributions - glad that I can give “something” back. I’ll start off by sending the doc to one of the committers, and then if you guys like my work, we can proceed from there… SCott
Re: Website, urgent help needed
ok

On 3/12/14, 9:58 AM, Andrew Musselman andrew.mussel...@gmail.com wrote: Thanks Scott; please just attach your work to an issue in the Jira system; if there's not one already you could file a new issue.