etter) then that is already very interesting.
Raphael
> > On Mar 22, 2016, at 7:52 AM, Raphael C wrote:
> >
> >> - In tree-based: not handling categorical variables as such hurts us a lot.
> >> There's a PR to fix that; it still needs a bit of love:
> >> https://github.com/scikit-learn/scikit-learn/pull/4899
>
> - In tree-based: not handling categorical variables as such hurts us a lot.
> There's a PR to fix that; it still needs a bit of love:
> https://github.com/scikit-learn/scikit-learn/pull/4899
>
This is a conversation moved from
https://github.com/scikit-learn/scikit-learn/pull/4899.
This paper about xgboost came out recently, which I thought might be of
interest: http://arxiv.org/pdf/1603.02754v1.pdf
Raphael
filled=True, rounded=True,
special_characters=True)
graph = pydot.graph_from_dot_data(dot_data.getvalue())
Image(graph.create_png())
Raphael
On 12 March 2016 at 13:56, Raphael C wrote:
> I am attempting to draw a decision tree using:
>
I am attempting to draw a decision tree using:
reg = DecisionTreeRegressor(max_depth=None, min_samples_split=1)
reg.fit(X, Y)
dot_data = StringIO()
tree.export_graphviz(reg, out_file=dot_data,
feature_names=feature_names,
filled=True, rounded=True,
On 8 November 2015 at 20:42, Sebastian Raschka wrote:
> Hm, I have to think about this more. But another case where I think that the
> handling of categorical features could be useful is in non-binary trees; not
> necessarily while learning but in making predictions more efficiently. E.g.,
> as
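Sebastian's point about non-binary trees can be illustrated with a toy sketch. This is purely hypothetical code (not scikit-learn's internal representation): a multiway node dispatches on the category value in a single step, whereas a chain of one-hot binary splits needs up to one test per category.

```python
# Hypothetical multiway tree node: a split on a categorical feature
# reaches the right child in one dictionary lookup.
class MultiwayNode:
    def __init__(self, feature, children):
        self.feature = feature    # index of the categorical feature
        self.children = children  # dict: category value -> subtree or leaf value

    def predict(self, x):
        child = self.children[x[self.feature]]
        # Recurse into subtrees; plain values are leaves.
        return child.predict(x) if isinstance(child, MultiwayNode) else child

# One split over three colours, directly to leaf values.
tree_ = MultiwayNode(0, {"red": 1.0, "green": 2.0, "blue": 3.0})
```

With one-hot encoding the same three-way decision would take up to two binary tests; here it is a single lookup, which is the prediction-efficiency argument above.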
On 8 November 2015 at 17:50, Sebastian Raschka wrote:
>
>> On Nov 8, 2015, at 11:32 AM, Raphael C wrote:
>>
>> In terms of computational efficiency, one-hot encoding combined with
>> the support for sparse feature vectors seems to work well, at least
>> for me.
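For illustration, the one-hot encoding mentioned above can be written by hand in a few lines. This is only a sketch (in practice scikit-learn's OneHotEncoder or pandas.get_dummies would be used, and a sparse matrix for large cardinalities); the helper name is made up.

```python
import numpy as np

def one_hot(column):
    """Encode a list of categorical labels as a dense 0/1 matrix.
    Columns are ordered by sorted category name."""
    categories = sorted(set(column))
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(column), len(categories)), dtype=np.int8)
    for row, value in enumerate(column):
        out[row, index[value]] = 1
    return out, categories

codes, cats = one_hot(["red", "green", "red", "blue"])
# cats == ['blue', 'green', 'red']; each row contains exactly one 1
```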
On 5 November 2015 at 13:38, Gael Varoquaux wrote:
> On Thu, Nov 05, 2015 at 07:05:11AM +0000, Raphael C wrote:
>> https://github.com/szilard/benchm-ml
>
>> The upshot is that in some cases it seems that the scikit-learn
>> versions have room for improvement.
>
> T
I don't know if this has been widely seen, but there is an interesting
comparison of classifiers from different machine learning libraries
at:
https://github.com/szilard/benchm-ml
The upshot is that in some cases it seems that the scikit-learn
versions have room for improvement. I don't know how
On 25 October 2015 at 19:44, olologin wrote:
> On 10/25/2015 08:12 PM, Raphael C wrote:
>>
>> From my quick reading of the thread it seems that people aren't
>> convinced LambdaMART is very good in practice. Is that right/wrong?
>>
>> Raphael
>>
>
https://github.com/scikit-learn/scikit-learn/pull/2580 is the PR but it
seems to have reached an unfortunate impasse.
From my quick reading of the thread it seems that people aren't convinced
LambdaMART is very good in practice. Is that right/wrong?
Raphael
On 25 Oct 2015 16:46, "olologin" wrote:
I have a training set, a validation set and a test set. I build a
random forest using RandomForestClassifier on the training set.
However, I would like to tune it by scoring on the validation set.
I find that the cross-validation score on the training set is a lot
better than the score on the validation set.
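The tuning loop being described can be sketched generically. The fit_and_score callable below is a placeholder of my own: with scikit-learn it would fit a RandomForestClassifier with the given parameter on the training set and return its score on the held-out validation set.

```python
def tune_on_validation(candidates, fit_and_score):
    """Return the candidate hyper-parameter value with the best
    validation score. fit_and_score(param) must train a model with
    that parameter and return its score on the validation set."""
    best_param, best_score = None, float("-inf")
    for param in candidates:
        score = fit_and_score(param)
        if score > best_score:
            best_param, best_score = param, score
    return best_param, best_score

# Toy stand-in scorer peaking at n_estimators=100 (made-up numbers,
# just to exercise the loop).
best, score = tune_on_validation(
    [10, 50, 100, 200],
    lambda n: 1.0 - abs(n - 100) / 200.0,
)
```

Keeping the test set untouched until the very end, and using only the validation score to pick the parameter, avoids the optimistic bias seen on the training set.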