Hello all,
before attempting a detailed proposal, I would like to discuss the big
picture with you. I went through the two referenced papers, and my
feeling is that glmnet's coordinate descent method could be a good
choice, especially since the connection with the strong rules approach
is already available.
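To make the connection concrete, here is a minimal numpy sketch of the
lasso coordinate-descent update (soft-thresholding) that glmnet builds
on - the strong rules then just screen out coordinates that are very
likely to stay at zero before running these updates (illustrative only;
glmnet itself adds covariance updates, warm starts along the path, etc.):

import numpy as np

def soft_threshold(z, gamma):
    # S(z, gamma) = sign(z) * max(|z| - gamma, 0)
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_cd(X, y, alpha, n_iter=100):
    # Minimize (1 / (2 n)) * ||y - X w||^2 + alpha * ||w||_1,
    # one coordinate at a time.
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(n_features):
            # partial residual with feature j taken out
            r = y - X.dot(w) + X[:, j] * w[j]
            w[j] = soft_threshold(X[:, j].dot(r), alpha * n_samples) / col_sq[j]
    return w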
I think I found it - but I have to test it again with the whole data
set, and I will let you know.
So when I am using only one tag in the Y, for example
[1, 1, 1, 1, 1, 1, 1, 1], it is returning the error I mentioned in my
first post.
But when I have something like [1, 1, 1, 1, 1, 1, 1, 2], it works.
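A minimal sketch of what I mean (toy data, and plain svm.SVC for
brevity rather than my real matrix and code):

import numpy as np
import scipy.sparse as sp
from sklearn import svm

X = sp.csr_matrix(np.random.rand(8, 5))

clf = svm.SVC()
clf.fit(X, [1, 1, 1, 1, 1, 1, 1, 2])  # two classes: fits fine
clf.fit(X, [1, 1, 1, 1, 1, 1, 1, 1])  # one class only: raises the error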
On Tue, Mar 27, 2012 at 08:20:11PM +0300, Dimitrios Pritsos wrote:
> So should I send the whole thing, or just the parts that create the matrix?
Just save X and y and create a gist that can reproduce the problem
without the external dependencies.
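Something like this should do (assuming X is your scipy.sparse matrix
and y your label array; Matrix Market keeps the sparsity):

import numpy as np
from scipy.io import mmwrite

# X: the scipy.sparse matrix, y: the labels that trigger the error
mmwrite('X.mtx', X)               # reload in the gist with scipy.io.mmread
np.save('y.npy', np.asarray(y))   # reload with np.load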
G
Hello Peter,
Yes, I can do that, but the code uses a library I have implemented for
raw HTML to vector conversion.
So should I send the whole thing, or just the parts that create the matrix?
Regards,
Dimitrios
On 03/27/2012 08:08 PM, Peter Prettenhofer wrote:
> Dimitrios,
>
> please provide an
Hello Vlad,
Yes, 18 is just for debugging, because I have implemented a Locally
Weighted Bag of Words that requires several Gaussian PDFs to smooth out
the data, and that is quite a time-consuming process. So 18 is just
enough for debugging; later I will use about 800.
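To give an idea, the expensive part is essentially this kind of
per-position Gaussian weighting (an illustrative sketch with scipy;
mu and sigma are names I am using here for the window location and
smoothing scale, not my actual code):

import numpy as np
from scipy.stats import norm

def lowbow_weights(doc_length, mu, sigma):
    # Gaussian weight for every word position, renormalized over the
    # document, so each local histogram is a position-weighted bag of words.
    positions = np.linspace(0.0, 1.0, doc_length)
    w = norm.pdf(positions, loc=mu, scale=sigma)
    return w / w.sum()

# one Gaussian PDF per window location; repeating this over many
# documents and locations is what makes the preprocessing slow
weights = [lowbow_weights(1000, mu, 0.1) for mu in np.linspace(0.0, 1.0, 20)]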
Train_Y is a list si
Dimitrios,
please provide an example script so that we can reproduce the error.
BTW: gist [1] is a handy tool to distribute scripts.
[1] https://gist.github.com/
best,
Peter
2012/3/27 Dimitrios Pritsos:
>
> Hello,
>
> While I am using svm.sparse.SVC with a scipy.sparse.csr_matrix the following
>
Hello Dimitrios,
You only have 18 samples? What is the shape of your train_Y?
Best,
Vlad
On Mar 27, 2012, at 19:31 , Dimitrios Pritsos wrote:
>
> Hello,
>
> While I am using svm.sparse.SVC with a scipy.sparse.csr_matrix the following error
> occurs:
>
> File
> "/home/dimitrios/Development_Wo
Hello,
While I am using svm.sparse.SVC with a scipy.sparse.csr_matrix, the
following error occurs:
File "/home/dimitrios/Development_Workspace/webgenreidentification/src/experiments_lowbow.py", line 115, in evaluate
    csvm.fit(train_X, train_Y)
File "/usr/local/lib/python2.6/dist-packages/s
2012/3/27 Paolo Losi:
> Gilles,
>
> thank you very much for having checked.
>
> If everyone agrees I'll:
>
> - uncomment extratrees and randomforest benchmark (@pprett is there
> any valid reason to leave them out?)
no, absolutely not - I just forgot to uncomment them - thx
> - explicitly conf
On 27 March 2012 14:50, Paolo Losi wrote:
> Gilles,
>
> thank you very much for having checked.
>
> If everyone agrees I'll:
>
> - uncomment extratrees and randomforest benchmark (@pprett is there
> any valid reason to leave them out?)
They are far slower to run than the others.
Ideally a com
Gilles,
thank you very much for having checked.
If everyone agrees I'll:
- uncomment extratrees and randomforest benchmark (@pprett is there
any valid reason to leave them out?)
- explicitly config max_features=None for RandomForest and ExtraTrees
Thanks again
Paolo
On Tue, Mar 27, 2012 at
Hi,
Using max_features="auto" (default setting) indeed yields the results
that Paolo reports.
When setting max_features=None (i.e., using all features as in our
earlier code), I got the following on my machine:
RandomForest 778.1471s 1.2830s 0.0248
Extra-Trees 1325.2397s 1.3544s 0.01
Interesting - covtype involves a number of categorical attributes
which are represented via a one-hot encoding - do you think that such
a representation has a significant effect on feature sampling and thus
the performance of random forests?
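To make that concrete: covtype's soil type, for instance, becomes 40
binary columns after one-hot encoding, so the underlying attribute gets
40 chances in the per-split sqrt(n_features) sampling. A quick
plain-numpy sketch (values made up):

import numpy as np

def one_hot(values, n_levels):
    # one categorical column -> n_levels binary columns
    out = np.zeros((len(values), n_levels), dtype=int)
    out[np.arange(len(values)), values] = 1
    return out

soil_type = np.array([0, 3, 1, 3, 2])  # made-up 4-level example
print(one_hot(soil_type, 4))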
2012/3/27 Gilles Louppe:
> Hi,
>
> I am running the tes
Hi,
I am running the tests again, but indeed I think the difference in the
results comes from the fact that max_features=sqrt(n_features) is now
the default, whereas it was max_features=n_features before.
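In other words (a sketch with the current estimators):

from sklearn.ensemble import RandomForestClassifier

# new default: max_features="auto", i.e. sqrt(n_features) tried per split
clf_new = RandomForestClassifier(n_estimators=100)

# earlier behaviour: all features considered at every split
clf_old = RandomForestClassifier(n_estimators=100, max_features=None)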
Gilles
On 27 March 2012 11:56, Paolo Losi wrote:
> Thanks Peter,
>
> On Tue, Mar 27, 2012 at 1
Thanks Peter,
On Tue, Mar 27, 2012 at 11:34 AM, Peter Prettenhofer
<peter.prettenho...@gmail.com> wrote:
> Paolo,
>
> I noticed that too - maybe @glouppe can comment on this - I think the
> reason was a change in the ``n_features`` heuristic but I might be
> mistaken.
>
Gilles, can you give a q
Paolo,
I noticed that too - maybe @glouppe can comment on this - I think the
reason was a change in the ``n_features`` heuristic but I might be
mistaken.
Concerning the GaussianNB - there's a PR [1] addressing a critical bug
in the estimator - it should be merged ASAP. Furthermore, test time is
qu
Thanks a lot. I've let the author know.
J
On 26 March 2012 14:14, Jaques Grobler wrote:
> Hi everyone-
>
> I stumbled upon this post that offers a quick run-through of
> text-feature-extraction using sklearn.feature_extraction.text's
> CountVectorizer:
>
> http:
On 03/27/2012 12:41 AM, David Warde-Farley wrote:
> On Mon, Mar 26, 2012 at 11:38:51PM +0200, Gael Varoquaux wrote:
>
>> Also, for more senior contributors, if you feel like being a mentor,
>> don't hesitate to contact me. It would be great to have a fair number of
>> prospective mentors with