It seems that the Vowpal Wabbit version is most similar to what is in
https://github.com/intel-analytics/TopicModeling/blob/master/src/main/scala/org/apache/spark/mllib/topicModeling/OnlineHDP.scala
Although the Intel version seems to implement the Hierarchical Dirichlet Process
(topics and subtopics) as
What machine learning algorithms are you interested in exploring or using?
Start from there or, better yet, from the problem you are trying to solve,
and then the selection may be evident.
On Wednesday, August 5, 2015, praveen S mylogi...@gmail.com wrote:
I was wondering when one should go for MLlib
Is Velox NOT open source?
On Saturday, June 20, 2015, Debasish Das debasish.da...@gmail.com wrote:
Hi,
The demo of end-to-end ML pipeline including the model server component at
Spark Summit was really cool.
I was wondering whether the Model Server component is based on Velox or
whether it uses a
Would Tachyon be appropriate here?
On Friday, June 5, 2015, Evo Eftimov evo.efti...@isecc.com wrote:
Oops, @Yiannis, sorry to be a party pooper, but the Job Server is for Spark
batch jobs (besides, anyone can put something like that together in 5 min),
while I am under the impression that Dmitry is
Would the IndexedRDD feature provide what the Lookup RDD does?
I've been using a broadcast variable map for a similar kind of thing. It is
probably within 1 GB, but I'd be interested to know whether the lookup (or
indexed) RDD might be better.
C
On Friday, June 5, 2015, Dmitry Goldenberg dgoldenberg...@gmail.com
Dani,
Folding in, I believe, refers to setting up your Gibbs sampler (or other
model) with the learned word and document topic proportions as computed by
Spark.
You might look at
https://lists.cs.princeton.edu/pipermail/topic-models/2014-May/002763.html
where Jones suggests summing across
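To make the folding-in idea above a bit more concrete, here is a minimal sketch in plain Python. It is an assumption-laden illustration, not what Spark or the linked post does: it swaps the Gibbs sampler for a few EM-style updates, keeps the learned topic-word probabilities fixed, and only re-estimates the topic proportions of one unseen document. The names fold_in, phi, alpha and n_iter are hypothetical.

```python
def fold_in(doc_word_ids, phi, n_iter=50, alpha=0.1):
    """Estimate topic proportions for one unseen document.

    doc_word_ids: list of word indices for the new document.
    phi: phi[k][w] = P(word w | topic k), learned elsewhere and held fixed.
    alpha: small smoothing pseudo-count per topic (an assumption of this sketch).
    """
    num_topics = len(phi)
    # Start from a uniform guess over topics.
    theta = [1.0 / num_topics] * num_topics
    for _ in range(n_iter):
        counts = [alpha] * num_topics
        for w in doc_word_ids:
            # Responsibility of each topic for this word occurrence.
            weights = [theta[k] * phi[k][w] for k in range(num_topics)]
            total = sum(weights)
            if total == 0.0:
                continue  # word has zero probability under every topic
            for k in range(num_topics):
                counts[k] += weights[k] / total
        norm = sum(counts)
        theta = [c / norm for c in counts]
    return theta


# Toy model with two topics over a two-word vocabulary.
phi = [[0.9, 0.1],
       [0.1, 0.9]]
theta = fold_in([0, 0, 0, 0], phi)
```

Only theta is updated here; the topic-word table phi is never touched, which is the point of folding in.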
Heszak,
I have only glanced at it, but you should be able to incorporate tokens
approximating n-grams yourself, say by using the Lucene
ShingleAnalyzerWrapper API
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/shingle/ShingleAnalyzerWrapper.html
You might also take a
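For readers unfamiliar with shingles, a rough sketch of the idea in plain Python follows. This is not Lucene's implementation; the function name shingles and its parameters are hypothetical, and the defaults (bigrams, unigrams kept, a single space as separator) are only my understanding of how the Lucene shingle filter behaves out of the box.

```python
def shingles(tokens, min_size=2, max_size=2, sep=" ", output_unigrams=True):
    """Return word n-grams ("shingles") built from a list of tokens.

    For every position, emit the token itself (optionally) followed by
    each n-gram of size min_size..max_size starting at that position.
    """
    out = []
    for i, token in enumerate(tokens):
        if output_unigrams:
            out.append(token)
        for n in range(min_size, max_size + 1):
            # Only emit an n-gram if enough tokens remain.
            if i + n <= len(tokens):
                out.append(sep.join(tokens[i:i + n]))
    return out


result = shingles(["please", "divide", "this"])
# result == ["please", "please divide", "divide", "divide this", "this"]
```

Feeding such tokens into the topic model in place of single words is one way to approximate n-grams without changing the model itself.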
Yes,
The case is convincing for PMML with Oryx. I will also investigate
parameter server.
Cheers,
Charles
On Tuesday, November 18, 2014, Sean Owen so...@cloudera.com wrote:
I'm just using PMML. I haven't hit any limitation of its
expressiveness, for the model types it supports. I don't think
Manish and others,
A follow-up question on my mind is whether there are protobuf (or other
binary format) frameworks in the vein of PMML. Perhaps scientific data
storage frameworks like NetCDF or ROOT are also possible.
I like the comprehensiveness of PMML but as you mention the complexity of
Looking for something like scikit-learn's grid search module.
C
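To spell out what is being asked for, here is a minimal sketch of an exhaustive grid search in plain Python. It is neither an MLlib nor a scikit-learn API; grid_search, score_fn and the parameter names rank and reg are hypothetical, and the toy scoring function stands in for training and evaluating a real model.

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Try every combination in param_grid and keep the best-scoring one.

    score_fn: callable taking the parameters as keyword arguments and
              returning a score where higher is better.
    param_grid: dict mapping parameter name to a list of candidate values.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy objective that peaks at rank=10, reg=0.1.
def toy_score(rank, reg):
    return -((rank - 10) ** 2) - (reg - 0.1) ** 2


best_params, best_score = grid_search(
    toy_score, {"rank": [5, 10, 20], "reg": [0.01, 0.1, 1.0]}
)
```

In practice score_fn would fit a model on a training split and return a validation metric, so each grid point costs one full training run.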
While I can't definitively speak to MLlib online learning,
I'm sure you're evaluating Vowpal Wabbit, for which there have been some
Storm integrations contributed.
Also, you might look at factorie, http://factorie.cs.umass.edu,
which at least provides an online LDA.
C
On Thursday, June 19,