Hi:
I am observing a behavior of the scikit-learn implementation of OMP
(sklearn.linear_model.orthogonal_mp) that I don't understand. I am
performing the following experiment:
- Generate a dictionary D (input data) with i.i.d. Gaussian entries
(with the column norms normalized to one) with dimensi
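The message is cut off before the dimensions, so the sizes below are placeholders,
but a minimal sketch of the experiment described above (Gaussian dictionary with
unit-norm columns, a k-sparse signal, recovery via sklearn.linear_model.orthogonal_mp)
might look like this:

import numpy as np
from sklearn.linear_model import orthogonal_mp

rng = np.random.RandomState(0)
n_features, n_atoms, k = 100, 200, 10  # placeholder sizes, not from the original post

# Dictionary with i.i.d. Gaussian entries and columns normalized to unit norm.
D = rng.randn(n_features, n_atoms)
D /= np.sqrt((D ** 2).sum(axis=0))

# A k-sparse code and its noiseless observation.
x_true = np.zeros(n_atoms)
support = rng.choice(n_atoms, k, replace=False)
x_true[support] = rng.randn(k)
y = np.dot(D, x_true)

# Recover the sparse code with OMP and compare the recovered support to the true one.
x_hat = orthogonal_mp(D, y, n_nonzero_coefs=k)
print(np.sort(support), np.flatnonzero(x_hat))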
2011/10/18 Gael Varoquaux :
> Thanks for everybody's feedback. I have taken it into account and written
> some clean code rather than the original hack. The pull request
> https://github.com/scikit-learn/scikit-learn/pull/398
> should be self-explanatory. In particular the hacks used should now be
> comp
Thanks for everybody's feedback. I have taken it into account and written
some clean code rather than the original hack. The pull request
https://github.com/scikit-learn/scikit-learn/pull/398
should be self-explanatory. In particular the hacks used should now be
comprehensible. Could people review? I'd
2011/10/17 Gael Varoquaux :
> On Mon, Oct 17, 2011 at 12:15:48PM +0200, Lars Buitinck wrote:
>> If you really want to play this kind of trick, then please use
>> standard C functionality such as frexp() from <math.h> and appropriate
>> symbolic constants from <float.h>.
>
> OK, I am not too good at this. Do you th
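For what it's worth, the frexp() route Lars suggests is easy to sketch. The snippet
below uses Python's math.frexp purely to show the idea, not the C/Cython code under
discussion: frexp() hands back the exponent exactly, so only log2 of the mantissa
(which lies in [0.5, 1)) needs approximating.

import math

def approx_log2(x):
    # x == m * 2**e with 0.5 <= m < 1; the exponent part of log2(x) is exact.
    m, e = math.frexp(x)
    # Linear guess for log2(m) on [0.5, 1); absolute error stays below ~0.09.
    return e + (2.0 * m - 2.0)

for x in (0.25, 1.0, 3.0, 1000.0):
    print(x, approx_log2(x), math.log2(x))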
On Mon, Oct 17, 2011 at 12:30:55PM +0200, Lars Buitinck wrote:
> SSE optimizations don't work on processors without SSE. Again, the
> Cell processor, UltraSPARC, ARM and what have you. This is impossible
> to test thoroughly unless we get multiple buildbots running different
> types of processor, a
On Mon, Oct 17, 2011 at 12:15:48PM +0200, Lars Buitinck wrote:
> -1 for the code in its current state; this is a potential maintenance
> nightmare. The cast to int* violates C aliasing rules, so this might
> break on aggressively optimizing compilers. float and int are not
> guaranteed to both the 3
2011/10/17 Olivier Grisel :
> However the compilation flags might get tricky to get right in a
> cross-platform manner (also we would need to deal with memory alignment
> stuff, which is quite easy to get working on POSIX but I don't know
> about under Windows).
SSE alignment requires #ifdef magic to get
2011/10/17 Brian Holt :
> +1 even though it's not as accurate. If the tests pass, then it's accurate
> enough IMHO.
I am +1-ish too. Maybe we need additional tests to make sure it does
not break in weird cases.
Also if transcendental evaluations are expensive in other algorithms
of the scikit, it
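The "weird cases" test mentioned above could be as simple as sweeping the
approximation against numpy's exact log over many decades and at the float32
extremes; a possible shape for such a test (the names are hypothetical):

import numpy as np

def check_fast_log2(fast_log2, max_abs_err=0.1):
    # Compare the approximation against np.log2 over many decades
    # and at the float32 extremes (smallest normal, largest finite).
    x = np.concatenate([
        np.logspace(-30, 30, 1000),
        [np.finfo(np.float32).tiny, np.finfo(np.float32).max],
    ]).astype(np.float32)
    err = np.abs(fast_log2(x) - np.log2(x.astype(np.float64)))
    assert err.max() < max_abs_err, err.max()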
2011/10/17 Gael Varoquaux :
> The question is: is it acceptable to have such an approximation? I think
> so, I just wanted confirmation. If people agree with this, I'll document
> it better (and maybe test it) and push to master.
-1 for the code in its current state; this is a potential maintenanc
+1 - great speedup - thanks Gaël!
2011/10/17 Brian Holt :
> +1 even though it's not as accurate. If the tests pass, then it's accurate
> enough IMHO.
+1 even though it's not as accurate. If the tests pass, then it's accurate
enough IMHO.
I timed the entropy criterion of classification tree construction a bit.
It appeared that the log (a transcendental function) was taking up a good
fraction of the time.
I coded a fast log approximation that is a bit brutal:
https://github.com/GaelVaroquaux/scikit-learn/commit/05e707f8dd67eb65948da87
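I have not checked the commit itself (the URL above is cut off), but the usual
"brutal" trick is to reinterpret the IEEE-754 bit pattern so that the exponent
falls out of integer arithmetic. A numpy sketch of that general idea, not
necessarily what the commit does:

import numpy as np

def fast_log2(x):
    # Reinterpret positive float32 values as int32: for a normal float,
    # bits / 2**23 == biased_exponent + mantissa_fraction, so subtracting
    # the bias (127) gives log2(x) up to ~0.09 absolute error.
    bits = np.asarray(x, dtype=np.float32).view(np.int32)
    return bits.astype(np.float64) / 2.0 ** 23 - 127.0

x = np.array([0.5, 1.0, 3.0, 1000.0], dtype=np.float32)
print(fast_log2(x))
print(np.log2(x.astype(np.float64)))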
2011/10/17 Robert Layton :
> In the formula for the expected value for mutual information [1], the third
> summation uses n_{i,j}.
> Is this a new value, or do I use the value from the contingency matrix?
In Vinh, Epps and Bailey (2010 [1]), n_{i,j} is the contingency table.
The Wikipedia page see
In the formula for the expected value for mutual information [1], the third
summation uses n_{i,j}.
Is this a new value, or do I use the value from the contingency matrix?
My thinking is that it is a new value, as the expected information shouldn't
have anything to do with the contingency matrix, but
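For context, the formula in question, as I read Vinh, Epps and Bailey (2010) and
as it is usually implemented, sums for every cell over all admissible values of
n_{i,j} given the marginals a_i and b_j, weighting each by its hypergeometric
probability; a slow but direct sketch:

import numpy as np
from math import lgamma, log, exp

def expected_mutual_information(contingency):
    contingency = np.asarray(contingency, dtype=np.int64)
    a = contingency.sum(axis=1)  # row marginals
    b = contingency.sum(axis=0)  # column marginals
    N = contingency.sum()
    emi = 0.0
    for ai in a:
        for bj in b:
            lo = max(1, ai + bj - N)  # nij = 0 contributes nothing
            hi = min(ai, bj)
            for nij in range(lo, hi + 1):
                # Log of the hypergeometric probability of observing nij
                # in a cell with marginals ai, bj and total N.
                log_p = (lgamma(ai + 1) + lgamma(bj + 1)
                         + lgamma(N - ai + 1) + lgamma(N - bj + 1)
                         - lgamma(N + 1) - lgamma(nij + 1)
                         - lgamma(ai - nij + 1) - lgamma(bj - nij + 1)
                         - lgamma(N - ai - bj + nij + 1))
                emi += (nij / N) * log(N * nij / (ai * bj)) * exp(log_p)
    return emi

This expected value is the quantity that adjusted mutual information subtracts
from the observed mutual information before normalizing.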