On Wed, Oct 16, 2013 at 01:40:09AM +0200, Jaques Grobler wrote:
> I'm not 100% on this and would appreciate further input here, as I'm half
> asleep right now! :D
You are pretty much right: partial_fit does not try to iterate the model
to convergence, because it expects to be called many times, p
Hi
As far as I understand it, this seems expected to me. Basically,
partial_fit does online training on the data, treating its 'X' as a
subset of the total data, so it expects to be working with chunks. When
the second partial_fit call is made, it's not like a normal 'fit' that
refits the classifier
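A minimal sketch of that usage pattern, with hypothetical toy data
(SGDClassifier is one of the scikit-learn estimators that supports
partial_fit; 'classes' must be passed up front because a later chunk
may not contain every label):

    import numpy as np
    from sklearn.linear_model import SGDClassifier

    rng = np.random.RandomState(0)
    X = rng.randn(1000, 20)
    y = (X[:, 0] > 0).astype(int)

    clf = SGDClassifier()
    for X_chunk, y_chunk in zip(np.array_split(X, 10), np.array_split(y, 10)):
        # each call makes one pass over the chunk and updates the model
        # in place; it never resets the coefficients the way fit() does
        clf.partial_fit(X_chunk, y_chunk, classes=np.array([0, 1]))

Calling partial_fit a second time on the same data is simply another
pass of stochastic gradient descent over it, which is why scores can
keep improving.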
Hi,
I'm using scikit-learn v0.14.1 and was wondering why classification results
improve just because partial_fit was called a second time on the same
training data. Note that in the example below I'm using train/test overlap
for debugging purposes.
Other behavior I don't understand is if train/test sa
2013/10/15 Olivier Grisel :
> 2013/10/15 Alexandre Gramfort :
>>> I did find the part in coordinate_descent.py where alpha_max is chosen, but
>>> I don't fully understand the reasoning behind it:
>>>
>>> alpha_max = np.abs(Xy).max() / (n_samples * l1_ratio)
>>
>> it can be derived from the KKT optimality conditions of the Lasso problem.
2013/10/15 Alexandre Gramfort :
>> I did find the part in coordinate_descent.py where alpha_max is chosen, but
>> I don't fully understand the reasoning behind it:
>>
>> alpha_max = np.abs(Xy).max() / (n_samples * l1_ratio)
>
> it can be derived from the KKT optimality conditions of the Lasso problem.
> I did find the part in coordinate_descent.py where alpha_max is chosen, but
> I don't fully understand the reasoning behind it:
>
> alpha_max = np.abs(Xy).max() / (n_samples * l1_ratio)
it can be derived from the KKT optimality conditions of the Lasso problem.
A
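A sketch of that derivation, assuming scikit-learn's elastic-net
objective with rho standing for l1_ratio (in LaTeX):

    \min_w \; \frac{1}{2n}\|y - Xw\|_2^2
      + \alpha\Big(\rho\|w\|_1 + \frac{1-\rho}{2}\|w\|_2^2\Big)

    % w = 0 is optimal iff 0 lies in the subdifferential at w = 0.
    % The l2 term has zero gradient there, and the subdifferential of
    % ||w||_1 at 0 is the unit l-infinity ball, so:
    0 \in -\tfrac{1}{n}X^\top y + \alpha\rho\,[-1,1]^p
    \;\Longleftrightarrow\;
    \frac{\|X^\top y\|_\infty}{n} \le \alpha\rho
    \;\Longleftrightarrow\;
    \alpha \ge \frac{\|X^\top y\|_\infty}{n\,\rho} = \alpha_{\max}

which is exactly np.abs(Xy).max() / (n_samples * l1_ratio), with
Xy = X^T y. For any alpha at or above alpha_max, the fitted coefficient
vector stays identically zero.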
---
Use the same DictVectorizer that you called fit_transform() on with the
training data, but just call transform() for the test data...
from sklearn.feature_extraction import DictVectorizer

dv = DictVectorizer()
train_feats = dv.fit_transform(train_feature_dict)  # learns the feature -> column mapping
test_feats = dv.transform(test_feature_dict)        # reuses that mapping on the test dicts
On 15 October 2013 03:52, Lars Buitinck wrote:
2013/10/14 Osman Baskaya :
> 2- In contrast to #1, I would like to automatically give 0 to those features
> in training data that are not observed in the test data.
>
> If I have a feature that is not observed in the test set, I get an error
> because I am using the DictVectorizer so that it does not
I have two questions:
1- I would like to filter features in the test set that aren't observed in
the training set. Is there a concise way to do it?
2- In contrast to #1, I would like to automatically give 0 to those
features in training data that are not observed in the test data.
If I have
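A small sketch of the behaviour in question, using hypothetical toy
feature dicts: a DictVectorizer fitted on the training dicts keeps only
the features seen there, silently drops unseen test features, and fills
absent training features with 0.

    from sklearn.feature_extraction import DictVectorizer

    train = [{'a': 1, 'b': 2}, {'a': 3, 'c': 1}]
    test = [{'a': 1, 'd': 5}]            # 'd' never appeared in training

    dv = DictVectorizer()
    dv.fit_transform(train)
    print(dv.get_feature_names())        # ['a', 'b', 'c']
    print(dv.transform(test).toarray())  # [[ 1.  0.  0.]] -- 'd' is ignored,
                                         # 'b' and 'c' default to 0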
Furthermore, I'm not sure that the API in seqlearn is the right fit for
GaussianHMM, GMMHMM, and other models with continuous emission
distributions. The sklearn HMM API is really geared towards
unsupervised tasks.
-Robert
On Oct 15, 2013, at 12:57 AM, Lars Buitinck wrote:
> 2013/10/15 Gael Varoquaux :
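For context, a rough sketch of the unsupervised usage the old
sklearn.hmm API was built around, from memory of the 0.14-era interface
(the module was later deprecated and removed); note that fit takes only
observation sequences, with no label argument:

    import numpy as np
    from sklearn.hmm import GaussianHMM  # removed in later releases

    X = np.random.randn(200, 2)      # one continuous observation sequence
    model = GaussianHMM(n_components=3)
    model.fit([X])                   # old API: a list of sequences, and no y
    hidden = model.predict(X)        # most likely hidden-state sequence

A supervised sequence-labelling API like seqlearn's, by contrast,
expects per-sample labels at training time.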
I will take a look at some of these tomorrow.
-Robert
On Oct 15, 2013, at 12:24 AM, Olivier Grisel wrote:
> 2013/10/15 Fred Mailhot :
>> On 14 October 2013 20:48, Robert McGibbon wrote:
>> [...]
>>>
>>>
>>> p.s. core devs: pretty please don't remove the HMM code from the scikit :)
>>
>>
>
2013/10/15 Gael Varoquaux :
> What is important is that the functionality gets adopted in another
> package. I don't know the exact scope of seqlearn, but it seems to me
> that it might be the right home for HMMs. Maybe Lars can comment.
I have little time to maintain seqlearn, but a PR is welcome.
2013/10/15 Fred Mailhot :
> On 14 October 2013 20:48, Robert McGibbon wrote:
> [...]
>>
>>
>> p.s. core devs: pretty please don't remove the HMM code from the scikit :)
>
>
> +1E6
Would be great to have volunteers tackling HMM-related issues then:
https://github.com/scikit-learn/scikit-learn/se
On Mon, Oct 14, 2013 at 09:30:30PM -0700, Fred Mailhot wrote:
> On 14 October 2013 20:48, Robert McGibbon <rmcgi...@gmail.com> wrote:
> [...]
> p.s. core devs: pretty please don't remove the HMM code from the scikit
> :)
>+1E6
I think that it will happen. It's just a ques
Adding recommendation systems also requires implementing new evaluation
metrics and most likely new grid search tools.
Mathieu
On Mon, Oct 14, 2013 at 7:05 PM, Olivier Grisel wrote:
> Actually the mrec implementation is not the original SLIM algorithm
> but a variant demonstrated by the lib author
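To make the point about evaluation metrics concrete, a hypothetical
sketch of the kind of ranking metric recommender evaluation needs
(precision@k; not something scikit-learn shipped at the time):

    def precision_at_k(recommended, relevant, k):
        """Fraction of the top-k recommended items that are relevant."""
        hits = sum(1 for item in recommended[:k] if item in relevant)
        return hits / float(k)

    # toy example: items 3, 7 and 10 are relevant; 3 of the top 4 hit
    print(precision_at_k([10, 3, 7, 42, 8], {3, 7, 10}, k=4))  # 0.75

Unlike accuracy, such metrics depend on the ranking of the predictions,
which is also why the existing grid search tooling does not apply
directly.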