Hi all,

I read the logistic regression (LR) implementation in Spark and have several questions. Could anyone here offer some explanation?

1. The implementation uses a dense representation of the feature vectors, but in most cases feature vectors are highly sparse. Is there any plan for a version that supports sparse feature vectors? Or was the dense-only design intentional?

2. Are there any experimental data on convergence performance? Setting the learning rate is tricky, and the current implementation uses a fairly straightforward learning rate update rule.

3. Is there any research on setting the learning rate in practice?

As a matter of fact, I implemented a Python version of LR with stochastic gradient descent for sparse feature vectors in Spark, and I am facing some convergence issues. I could not find any clues in Tong Zhang's paper "Solving Large Scale Linear Prediction Problems Using Stochastic Gradient Descent Algorithms" or in related papers such as "Pegasos: Primal Estimated sub-Gradient solver for SVM".

Any suggestions and explanations are appreciated.
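
For concreteness, the kind of setup in question can be sketched roughly as follows. This is only a minimal single-machine illustration, not the actual implementation and not Spark's code: the {index: value} sparse format, the function names (sigmoid, sparse_dot, log_loss, sgd_logistic), and the step_size / sqrt(t) decay schedule are all assumptions chosen for the example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def sparse_dot(weights, features):
    """Dot product of a dense weight list with a sparse {index: value} dict."""
    return sum(weights[i] * v for i, v in features.items())


def log_loss(weights, data):
    """Mean negative log-likelihood over (features, label) pairs, label in {0, 1}."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for features, label in data:
        p = sigmoid(sparse_dot(weights, features))
        total -= label * math.log(p + eps) + (1 - label) * math.log(1.0 - p + eps)
    return total / len(data)


def sgd_logistic(data, dim, step_size=1.0, epochs=20, seed=0):
    """Unregularized SGD for LR on sparse examples.

    The step size decays as step_size / sqrt(t); this schedule is an
    assumption made for illustration.
    """
    rng = random.Random(seed)
    weights = [0.0] * dim
    order = list(range(len(data)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for k in order:
            features, label = data[k]
            t += 1
            eta = step_size / math.sqrt(t)
            error = sigmoid(sparse_dot(weights, features)) - label
            # Only the non-zero coordinates of this example are updated.
            for i, v in features.items():
                weights[i] -= eta * error * v
    return weights


# Hypothetical toy data: two sparse examples in a 6-dimensional space.
toy_data = [({0: 1.0, 2: 0.5}, 1), ({1: 1.0, 3: 2.0}, 0)]
toy_weights = sgd_logistic(toy_data, dim=6)
print("loss before:", log_loss([0.0] * 6, toy_data))
print("loss after: ", log_loss(toy_weights, toy_data))
```

Because there is no regularization term here, each update costs time proportional to the number of non-zero features in the example rather than to the full dimension. Tracking log_loss per epoch for a few values of step_size is one simple way to see whether a convergence problem comes from the learning rate.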

Thanks in advance,
Jianmin
