Hi Gang,

No admin has looked at our patch yet :( Do you have any suggestions for getting it noticed by an admin?

Best regards,
Xiaokai

On Mon, Jun 30, 2014 at 8:18 PM, Gang Bai [via Apache Spark Developers List] <ml-node+s1001551n713...@n3.nabble.com> wrote:

> Thanks Xiaokai,
>
> I’ve created a pull request to merge the features in my PR into your repo. Please review it here: https://github.com/xwei-datageek/spark/pull/2 .
>
> As for GLMs, here at Sina we are working on predicting the number of visitors who read a particular news article or watch an online sports live stream in a particular period. I’m trying to improve the prediction results by tuning features and incorporating new models, so I’ll try Gamma regression later. Thanks for the implementation.
>
> Cheers,
> -Gang
>
> On Jun 29, 2014, at 8:17 AM, xwei <[hidden email]> wrote:
>
> > Hi Gang,
> >
> > No worries!
> >
> > I agree that LBFGS would converge faster and that your test suite is more comprehensive. I'd like to merge my branch with yours.
> >
> > I also agree with your viewpoint on the redundancy issue. Different GLMs usually differ only in the gradient calculation, but the ****regression.scala files are quite similar. For example, linearRegressionSGD, logisticRegressionSGD, RidgeRegressionSGD, and poissonRegressionSGD all share quite a bit of common code in their class implementations. Since such redundancy is already there in the legacy code, simply merging Poisson and Gamma does not seem to help much. So I suggest we just leave them as separate classes for the time being.
> >
> > Best regards,
> >
> > Xiaokai
> >
> > On Jun 27, 2014, at 6:45 PM, Gang Bai [via Apache Spark Developers List] wrote:
> >
> >> Hi Xiaokai,
> >>
> >> My bad. I didn't notice this before I created another PR for Poisson regression. The mails were buried in junk by the corp mail master. Also, thanks for considering my comments and advice in your PR.
> >>
> >> Adding my two cents here:
> >>
> >> * PoissonRegressionModel and GammaRegressionModel have the same fields and prediction method. Shall we use one class instead of two redundant ones? Say, a LogLinearModel.
> >> * The LBFGS optimizer takes fewer iterations and results in better convergence than SGD. I implemented two GeneralizedLinearAlgorithm classes, using LBFGS and SGD respectively. You may take a look at them. If it's OK with you, I'd be happy to send a PR to your branch.
> >> * In addition to the generated test data, we may use some real-world data for testing. In my implementation, I added the test data from https://onlinecourses.science.psu.edu/stat504/node/223. Please check my test suite.
> >>
> >> -Gang
> >> Sent from my iPad
> >>
> >>> On June 27, 2014, at 6:03 PM, "xwei" <[hidden email]> wrote:
> >>>
> >>> Yes, that's what we did: adding two gradient functions to Gradient.scala and creating PoissonRegression and GammaRegression using these gradients. We made a PR on this.
> >>>
> >>> --
> >>> View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Contributing-to-MLlib-on-GLM-tp7033p7088.html
> >>> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Contributing-to-MLlib-on-GLM-tp7033p7197.html
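
For readers following the thread, two of the design points quoted above (one shared log-linear model for Poisson and Gamma, and GLM variants that differ only in their gradient) can be sketched roughly as follows. This is a plain-Python illustration, not code from either PR and not MLlib's API. `LogLinearModel` is the name Gang floats in the thread; `dot`, `poisson_gradient`, and `gamma_gradient` are hypothetical helpers. The log link for both families, and dropping likelihood terms that do not depend on the weights, are assumptions of this sketch.

```python
import math
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class LogLinearModel:
    """Hypothetical shared model: both families predict exp(w . x + b)."""

    def __init__(self, weights: Sequence[float], intercept: float = 0.0):
        self.weights = list(weights)
        self.intercept = intercept

    def predict(self, features: Sequence[float]) -> float:
        return math.exp(dot(self.weights, features) + self.intercept)


def poisson_gradient(
    x: Sequence[float], y: float, w: Sequence[float]
) -> Tuple[List[float], float]:
    """Per-example gradient and loss for Poisson regression (log link).

    Loss is the negative log-likelihood without the log(y!) constant:
    exp(w . x) - y * (w . x).
    """
    margin = dot(w, x)
    mu = math.exp(margin)
    loss = mu - y * margin
    grad = [(mu - y) * xi for xi in x]
    return grad, loss


def gamma_gradient(
    x: Sequence[float], y: float, w: Sequence[float]
) -> Tuple[List[float], float]:
    """Per-example gradient and loss for Gamma regression (log link).

    Loss is the negative log-likelihood up to the shape parameter and
    constants: y * exp(-(w . x)) + (w . x).
    """
    margin = dot(w, x)
    ratio = y * math.exp(-margin)
    loss = ratio + margin
    grad = [(1.0 - ratio) * xi for xi in x]
    return grad, loss


# Usage: the same model class serves either family; only the gradient
# function handed to the optimizer (SGD or LBFGS) would change.
model = LogLinearModel([0.0, 0.0])
print(model.predict([1.0, 2.0]))  # 1.0 when all weights are zero
```

Under these assumptions the two families share the prediction code entirely, which is the redundancy Gang points out, while the optimizer-facing difference is confined to the two gradient functions, as Xiaokai notes.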