You can ignore it. It just doesn't know for sure you have a pool.
I believe I have even removed this in a recent refactoring.
On Tue, Jul 12, 2011 at 2:21 AM, Salil Apte sa...@offlinelabs.com wrote:
So I keep getting this warning from either Mahout or the server (I'm
guessing the former):
Thanks to all,
I need to start from the basic theory;
you are speaking Arabic :) to me, or in other words I need
a less theoretical approach, or in other words some real code to put my
hands on.
Excuse this raw approach, but I need an algorithm that is really fast to
implement and understand
to use in
Hi Luca,
Again, I have to emphasize: read what I gave you.
The algorithm in my link was explained for non-scientists, and if you
download Solr you will find the class, so you can look at how
they implemented that algorithm.
Easier would mean that someone else is writing the code for
Hi,
When the training data set can be loaded into memory, or each split
can be, what is the accuracy of the decision forest algorithm compared
with LogisticRegression? Do you have production uses of random
forest?
Regards,
Xiaobo Gu
Hi,
The Random Forest partial implementation in
https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation
uses the ARFF file format. Is ARFF the only supported file format when
using the BuildForest and TestForest programs, and are the BuildForest and
TestForest programs official
Which version of naivebayes are you using?
The bayes.* package or naivebayes.*?
The former uses text input; the latter uses vectors.
On Tue, Jul 12, 2011 at 7:59 PM, kevin_ravel ke...@raveldata.com wrote:
I'm a little confused as to the proper way to format the data for training a
naive bayes
I don't believe that Mahout's random forests have been used in production.
I have heard that some people got pretty good results in testing.
On Tue, Jul 12, 2011 at 6:03 AM, Xiaobo Gu guxiaobo1...@gmail.com wrote:
Hi,
When the training data set can be loaded into memory, or each split
can
Hi all,
I am new to Mahout and I am putting up a Recommender for buddycloud (
http://buddycloud.com/) as a part of my GSoC project (
https://github.com/buddycloud/channel-directory).
In the testing snapshot, I have ~100k users, ~20k items and ~230k boolean
taste preferences.
At first I tried an
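For boolean preference data like the snapshot described above, one common choice is a similarity that only looks at which users touched which items, such as the Tanimoto (Jaccard) coefficient. The sketch below is plain Python for illustration, not Mahout's API; the function name and the sample user IDs are made up for the example.

```python
def tanimoto(users_a, users_b):
    """Tanimoto (Jaccard) coefficient between two items.

    Each argument is the collection of user IDs that expressed a
    boolean preference for the item (hypothetical data layout).
    """
    a, b = set(users_a), set(users_b)
    union = len(a | b)
    # Two items nobody has touched are treated as having no similarity.
    return len(a & b) / union if union else 0.0


# Illustrative data: users who liked each of two items.
item_x = {1, 2, 3}
item_y = {2, 3, 4}
similarity = tanimoto(item_x, item_y)  # 2 shared users out of 4 distinct users
```

With only about 230k preferences spread over 100k users, most item pairs will share few or no users, so many similarities computed this way will be zero.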
From what I can see, the random forest implementation takes either numerical
or categorical feature data. That worked fine for me, until I tried to
incorporate word or text features. I liked the encoders used in SGD, but they
don't seem to apply to random forests. So, did I overlook
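One possible workaround for the question above, sketched under assumptions: hash each token into a fixed number of buckets so that text becomes a fixed-length numeric row, which a learner that expects numerical features can consume. This is plain Python in the spirit of hashed encoding, not Mahout code; `hash_features` and `num_buckets` are hypothetical names, and whether the forest implementation handles such features well is not something the thread confirms.

```python
import hashlib


def hash_features(tokens, num_buckets=16):
    """Map a list of tokens to a fixed-length count vector (hashing trick)."""
    vec = [0.0] * num_buckets
    for tok in tokens:
        # Use a stable hash so the same token always lands in the same bucket.
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:4], "big") % num_buckets
        vec[idx] += 1.0
    return vec


# Illustrative input: nine tokens become one 16-column numeric row.
row = hash_features("the quick brown fox jumps over the lazy dog".split())
```

Different tokens can collide in the same bucket, so a larger `num_buckets` trades memory for fewer collisions.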
Hi Ted,
Thanks very much for your very detailed reply. It is very helpful.
I still have some questions; I hope I am not polluting this email list too much.
I understand all your comments except below:
Finally, you should be combining group ranking objective as well as
regression objectives.
Thanks. We are trying to get a larger dataset, probably over 2000 for each class.
What do you mean by the errors on performance estimates? The confusion matrix?
On Jul 11, 2011, at 2:44 PM, Konstantin Shmakov wrote:
It seems that training data set is way too small. What are the errors
on
Oh yea, at runtime, I'm getting back a BasicDataSource object for my
DataSource. Is that correct?
On Tue, Jul 12, 2011 at 9:59 PM, Salil Apte sa...@offlinelabs.com wrote:
So I started actually looking at performance today and it is pretty
horrendous. I've got about 61,000 rows in my database