Hi,
I've used Lucene a fair bit and one useful feature it has is the ability to
boost fields to make them more relevant. E.g. matching Titles are more
important than matching descriptions, so you can "boost" title fields to
ensure they weigh in more in the final relevance calculation.
I expected
I will re-implement the serialization in C++.
Thanks a lot.
-Original Message-
From: Ted Dunning [mailto:ted.dunn...@gmail.com]
Sent: July 6, 2011 10:48
To: user@mahout.apache.org
Subject: Re: How could I use bayse model with my C++ online classifier
Well, PMML is the (complicated) standard solution.
Ot
Well, PMML is the (complicated) standard solution.
Otherwise, a Naive Bayes model would probably fit as CSV data.
But seriously, it isn't that hard to read a sequence file. Re-implementing
our serialization in C++ would be generally useful as well.
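
As a rough illustration of the CSV idea above, a Naive Bayes model could be written out as rows of per-label log-weights and read back by anything with a CSV parser, including a C++ client. This is only a sketch: it is not Mahout's model format, and the `label,feature,value` layout, the `__prior__` marker, the toy weights and all function names below are assumptions made up for the example.

```python
import csv
import io
import math
from collections import defaultdict

# Toy model (made-up numbers): log priors per label and log weights per (label, feature).
PRIORS = {"spam": math.log(0.5), "ham": math.log(0.5)}
WEIGHTS = {
    ("spam", "viagra"): math.log(0.2),
    ("spam", "hello"): math.log(0.01),
    ("ham", "viagra"): math.log(0.001),
    ("ham", "hello"): math.log(0.1),
}


def dump_model(log_priors, log_weights):
    """Serialize the model as CSV text with rows of label,feature,value.

    The prior of each label is stored under the hypothetical feature name '__prior__'.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for label, prior in sorted(log_priors.items()):
        writer.writerow([label, "__prior__", repr(prior)])
    for (label, feature), weight in sorted(log_weights.items()):
        writer.writerow([label, feature, repr(weight)])
    return buf.getvalue()


def load_model(text):
    """Parse CSV text produced by dump_model back into priors and nested weights."""
    log_priors = {}
    log_weights = defaultdict(dict)
    for label, feature, value in csv.reader(io.StringIO(text)):
        if feature == "__prior__":
            log_priors[label] = float(value)
        else:
            log_weights[label][feature] = float(value)
    return log_priors, dict(log_weights)


def classify(tokens, log_priors, log_weights, unseen=math.log(1e-6)):
    """Return the label with the highest summed log score for the given tokens."""
    best_label, best_score = None, -math.inf
    for label, prior in log_priors.items():
        weights = log_weights.get(label, {})
        score = prior + sum(weights.get(tok, unseen) for tok in tokens)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

The reading side is a loop over three-column rows, so porting it to C++ takes little more than a line splitter and a map from label to feature weights.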
On Tue, Jul 5, 2011 at 7:38 PM, Lance Norskog
Is there a standard text format that would support this data? ARFF, for example?
On Mon, Jul 4, 2011 at 7:57 PM, beneo_7 wrote:
> read the Java source code and implement it in C++
>
> I also don't understand why you are using an Alibaba email address
>
> 2011-07-05
>
>
>
> beneo_7
>
>
>
> From: 刘逸哲
> Sent: 2011-07-05 10:55
> Subject: How could I
I'm recruiting Engineers with Machine Learning expertise at Meebo.
We have openings for Chief Engineer level through new graduates. We're
looking for folks with deep knowledge of how machine learning can be
applied in social networking and related advertising applications.
We're open to these folk
Hello,
The answer to Vijay's question would be interesting to me too, since I
should use OnlineLogisticRegression in order to calculate probabilities
(as far as I see, there are no probability calculation functions in
AdaptiveLogisticRegression). So, for example, how to determine 'number
of
Glad we could help.
On Tue, Jul 5, 2011 at 7:09 AM, Radek Maciaszek wrote:
> Hello,
>
> I worked in the past on an MSc project which involved quite a lot of Mahout
> calculation. I finished it a while ago but only recently got around to
> posting it somewhere online.
>
> It would be much more d
Glancing at the code, I think that the big problem is likely to be the
number of features in the encoded model.
You only have a tiny number of features in the hashed representation so you
are going to have a LOT of collisions. You need to have considerably more
dimensions in your encoded feature
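
To make the collision point concrete, here is a small sketch that hashes a vocabulary into vectors of different sizes and counts how many terms land in a slot that is already taken. It is plain Python, not Mahout's encoder classes; the hash function (truncated MD5), the synthetic vocabulary and the vector sizes are all assumptions chosen for illustration.

```python
import hashlib


def slot(term, num_dims):
    """Map a term to an index in [0, num_dims) using a stable hash."""
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_dims


def count_collisions(terms, num_dims):
    """Count terms whose slot was already claimed by an earlier term."""
    used = set()
    collisions = 0
    for term in terms:
        index = slot(term, num_dims)
        if index in used:
            collisions += 1
        else:
            used.add(index)
    return collisions


# Synthetic vocabulary of 1000 distinct terms (made up for the example).
VOCAB = [f"term{i}" for i in range(1000)]

if __name__ == "__main__":
    for dims in (20, 1000, 100000):
        print(dims, count_collisions(VOCAB, dims))
```

With only 20 slots, at least 980 of the 1000 terms must share a slot with another term, so the learner cannot tell those features apart; growing the vector makes such sharing rare.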
Hello,
I worked in the past on an MSc project which involved quite a lot of Mahout
calculation. I finished it a while ago but only recently got around to
posting it somewhere online.
It would be much more difficult to finish this work without the help from
this list so I wanted to say thank you
Agreed. This matrix could be decomposed in your browser in javascript ...
or these days, on your phone.
-jake
On Jul 5, 2011 1:12 AM, "Ted Dunning" wrote:
Lanczos is probably dominated by overhead and startup costs on such a small
matrix. You only have 100,000 non-zero elements which is a t
Erm, yes. What is your question?
On Tue, Jul 5, 2011 at 1:30 PM, rmx wrote:
> Is this project still alive??
> Please...
> Thanks
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Using-with-seq2spars-org-apache-lucene-analysis-Analyzer-tp3108497p3140576.html
> Sent from
Is this project still alive??
Please...
Thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/Using-with-seq2spars-org-apache-lucene-analysis-Analyzer-tp3108497p3140576.html
Sent from the Mahout User List mailing list archive at Nabble.com.
Hi Ted,
I've uploaded my code to https://gist.github.com/1064551
I bought Mahout in Action and am using your ContinuousValueEncoder and other
misc classes, but as you can see I've hardcoded most of the training data.
Yes, there are very few training samples, but from what I understand, I can
rei
Lanczos is probably dominated by overhead and startup costs on such a small
matrix. You only have 100,000 non-zero elements which is a truly tiny
problem. Stochastic projection SVD, for instance would compute the answer
for such a problem in a few milliseconds.
You need a much larger problem to
How many training examples do you have?
Sounds like you have very few. That is definitely not the sweet spot for
on-line regression.
In any case, can you post your test code to github or something?
On Mon, Jul 4, 2011 at 11:46 AM, Vijay Santhanam
wrote:
> Thank you Ted
>
> However, even with
I committed a change to make the parsing bits I found in .bayes. use space
and tab. You can try again. I confess I don't know this code, and there are a
lot of little pieces of parsing here and there, so I don't know if this is the
heart of the issue.
On Mon, Jul 4, 2011 at 4:08 PM, Vijay Santhanam
wrot