I am still on the fence - I have an internship this summer, so I need to
check on timing/vacation expectations
On Mon, Mar 7, 2016 at 3:09 PM, Jacob Vanderplas
wrote:
> I'm not going to be able to make it this year, unfortunately.
> Jake
>
> Jake VanderPlas
> Senior Data Science Fellow
> Director of Re
Is julia-learn a thing already? Juliasklearn seems a bit overloaded to me,
but naming things is hard.
On Mon, Mar 7, 2016 at 11:02 AM, Gael Varoquaux <
gael.varoqu...@normalesup.org> wrote:
> On Mon, Mar 07, 2016 at 10:54:53AM -0500, Andreas Mueller wrote:
> > I'm not sure about the naming of the
Any RL package will have to be heavily focused on non-iid data (timeseries,
basically), with the additional difficulty of the agent
affecting/interacting with the environment it is operating in. I agree with
you, Gael - many packages for "deep learning" also don't handle this type of
data/these models (
When I was (and still am, sometimes) hacking Circle CI support for
sklearn-theano (https://github.com/sklearn-theano/sklearn-theano/pull/93)
it had an option to have access only to 1 project.
I have been debugging via logs, but there must be a better way, because it is
really a pain in the neck to d
IncrementalPCA should get closer to "true" PCA as the number of
components increases - so if anything the solution should be more
stable rather than less. The difference mostly lies in the incremental
processing - regular PCA with reduced components performs the full
PCA, then only keeps a subset o
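A rough sketch of the comparison I mean (toy data, sizes made up, untested):
import numpy as np
from sklearn.decomposition import PCA, IncrementalPCA
rng = np.random.RandomState(0)
X = rng.randn(1000, 50)
for n_components in [2, 10, 40]:
    pca = PCA(n_components=n_components).fit(X)
    ipca = IncrementalPCA(n_components=n_components, batch_size=100).fit(X)
    # As n_components grows the two should agree more and more closely
    # (up to sign flips of individual components).
    diff = np.abs(pca.explained_variance_ratio_ - ipca.explained_variance_ratio_).max()
    print(n_components, diff)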
I did a piece of that in the Titanic examples from the SciPy tutorial,
but it could definitely use a more thorough and clear example. This
version could probably be simplified/streamlined - much of my
preprocessing was done with straight numpy, and I am 90% sure there is
a more "sklearn approved" w
If people are planning to work on this, it would be good to check what
Andy and I presented at SciPy, which is based on what Jake and Olivier
did at PyCon (and what Andy, Jake and Gael did at SciPy 2013, etc.
etc.).
To Sebastian's points - we covered all of these nearly verbatim except
perhaps cla
Congratulations - well deserved, and thanks for all your hard work!
On Wed, Sep 23, 2015 at 6:47 AM, Arnaud Joly wrote:
> Congratulation and welcome !!!
>
> Arnaud
>
>
>> On 23 Sep 2015, at 08:59, Gael Varoquaux
>> wrote:
>>
>> Welcome to the team. You've been doing awesome work. We are very lo
ice for a larger number of
> parameters like RBM but it would also involve MCMC iterations. Any
> thoughts?
>
> On Mon, Jul 27, 2015 at 6:18 AM, Kyle Kastner
> wrote:
>
>> RBMs are a factorization of a generally intractable problem - as you
>> mention it is still
RBMs are a factorization of a generally intractable problem - as you
mention it is still O(n**2) but much better than the combinatorial brute
force thing that the RBM factorization replaces. There might be faster RBM
algorithms around but I don't know of any faster implementations that don't
use GP
Another citation for the Hebbian approach - it is related to this:
http://onlinelibrary.wiley.com/doi/10.1207/s15516709cog0901_5/pdf
On Thu, Jun 18, 2015 at 10:25 AM, Kyle Kastner
wrote:
> Yes agreed - though I would also guess the intermediate memory blowup
> could help speed, though I h
ger is that
> it doesn't do fancy indexing and avoids large intermediate arrays.
>
>
>
> On 06/18/2015 10:09 AM, Kyle Kastner wrote:
>
> I don't know if it is faster or better - but the learning rule is
> insanely simple and it is hard to believe there could be
You can also see the kmeans version here:
https://github.com/kastnerkyle/ift6268h15/blob/master/hw3/color_kmeans_theano.py#L23
Though I guarantee nothing about my homework code!
On Thu, Jun 18, 2015 at 10:09 AM, Kyle Kastner
wrote:
> I don't know if it is faster or better - but the
t might be "too easy" to have a real paper.
On Thu, Jun 18, 2015 at 9:58 AM, Andreas Mueller wrote:
>
>
> On 06/18/2015 09:48 AM, Kyle Kastner wrote:
> > This link should work http://www.cs.toronto.edu/~rfm/code.html
> > <http://www.cs.toronto.edu/%7Erfm/code.ht
This link should work http://www.cs.toronto.edu/~rfm/code.html
On Thu, Jun 18, 2015 at 9:38 AM, Kyle Kastner wrote:
> Minibatch K-means should work just fine. Alternatively there are hebbian
> K-means approaches which are quite easy to implement and should be fast
> (though I
Minibatch K-means should work just fine. Alternatively there are Hebbian
K-means approaches which are quite easy to implement and should be fast
(though I think it basically boils down to minibatch K-means; I haven't
looked at the details of minibatch K-means). There is an approach here
http://www.iro.
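For reference, the minibatch route is already easy to try in sklearn - a
rough sketch (toy data, untested):
import numpy as np
from sklearn.cluster import MiniBatchKMeans
rng = np.random.RandomState(0)
X = rng.randn(10000, 20)
mbk = MiniBatchKMeans(n_clusters=8, random_state=0)
mbk.fit(X)  # batches internally
# or stream chunks yourself if the data doesn't fit in memory:
mbk = MiniBatchKMeans(n_clusters=8, random_state=0)
for chunk in np.array_split(X, 20):
    mbk.partial_fit(chunk)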
Data preprocessing is important. One thing you might want to do is compute
your preprocessing scaling values over the training data only - technically
computing them over the whole dataset is not valid, since that includes
the test data.
It is hard to say whether 100% is believable or not, but you should
pr
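Concretely, something like this (StandardScaler as an example, toy data):
import numpy as np
from sklearn.preprocessing import StandardScaler
rng = np.random.RandomState(0)
X_train, X_test = rng.randn(80, 5), rng.randn(20, 5)
scaler = StandardScaler().fit(X_train)  # statistics come from the training data only
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)  # reuse the same mean/scale, never refit on the test set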
If it were in scipy, would it be backported to the older versions? How
would we handle that?
On Wed, Apr 15, 2015 at 3:40 PM, Olivier Grisel
wrote:
> We could use PyPROPACK if it was contributed upstream in scipy ;)
>
> I know that some scipy maintainers don't appreciate arpack much and
> would lik
Robust PCA is awesome - I would definitely like to see a good and fast
version. I had a version once upon a time, but it was neither good
*nor* fast :)
On Wed, Apr 15, 2015 at 10:33 AM, Andreas Mueller wrote:
> Hey all.
> Was there some plan to add Robust PCA at some point? I vaguely remember
> a
I have a simple Nesterov momentum implementation in Theano, modified from
some code Yann Dauphin had, here:
https://github.com/kastnerkyle/ift6266h15/blob/master/normalized_convnet.py#L164
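The core update itself is tiny - a rough numpy sketch of Nesterov momentum
(not the Theano code above; names and the toy problem are made up):
import numpy as np
def nesterov_step(params, velocity, grad_fn, lr=0.01, momentum=0.9):
    # Evaluate the gradient at the "looked ahead" point, then take the step.
    grad = grad_fn(params + momentum * velocity)
    velocity = momentum * velocity - lr * grad
    return params + velocity, velocity
# toy usage: minimize 0.5 * ||x||**2, whose gradient is x
params, velocity = np.ones(3), np.zeros(3)
for _ in range(100):
    params, velocity = nesterov_step(params, velocity, grad_fn=lambda p: p)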
On Tue, Apr 7, 2015 at 10:44 AM, Andreas Mueller wrote:
> Actually Olivier and me added some things to the MLP since then, a
Just FYI, when the GitHub move eventually happens there are a few tweaks
to the build that will have to happen - GitHub has some special rules
on folder names. I had to make some modifications to the build for
sklearn-theano; it wasn't awful, but it took a while to figure out
On Mon, Mar 30, 2015 at 2:24 PM, Andr
Awesome! Congratulations all who contributed to this - lots of great stuff.
On Fri, Mar 27, 2015 at 12:26 PM, Olivier Grisel
wrote:
> Release highlights and full changelog available at:
>
> http://scikit-learn.org/0.16/whats_new.html
>
> You can grab it from the source here:
>
> https://pypi.pyth
e good results.
>
> Christof
>
> On 20150324 21:01, Kyle Kastner wrote:
>> It might be nice to talk about optimizing runtime and/or training time
>> like SMAC did in their paper. I don't see any reason we couldn't do
>> this in sklearn, and it might be of v
makes sense).
>
> Btw, this paper has a couple of references for more detailed equations:
> http://www.aaai.org/Papers/IJCAI/2007/IJCAI07-449.pdf
>
>
> On 03/25/2015 03:20 PM, Kyle Kastner wrote:
>> There was mention of TDP (blocked Gibbs higher up in the paper) vs
>> colla
eas Mueller wrote:
>>
>>
>>
>> On 03/24/2015 09:44 PM, Kyle Kastner wrote:
>> >
>> > Will users be allowed to set/tweak the burn-in and lag for the sampler
>> > in the DPGMM?
>> >
>> This is variational!
>>
>>
>> ---
to see what happens.
>
> --
> João Felipe Santos
>
> On 24 March 2015 at 20:25, Kyle Kastner wrote:
>>
>> How did you install it?
>> python setup.py develop or install? Did you have to use --user?
>>
>> On Tue, Mar 24, 2015 at 7:41 PM, João Felipe Santos
>
I like the fact that this can be broken into nice parts. I also think
documentation should be farther up the list, with the math part lumped in.
GMM cleanup should probably start right out of the gate, as fixing that will
define what API/init changes have to stay consistent in the other two
models.
Is there any
I would focus on the API of this functionality and how/what users will
be allowed to specify. To me, this is a particularly tricky bit of the
PR. As Vlad said, take a close look at GridSearchCV and
RandomizedSearchCV and see how they interact with the codebase. Do you
plan to find good defaults for
How did you install it?
python setup.py develop or install? Did you have to use --user?
On Tue, Mar 24, 2015 at 7:41 PM, João Felipe Santos wrote:
> Hi,
>
> I am using MKL with Numpy and Scipy on a cluster and just installed
> scikit-learn. The setup process goes without any issue, but if I try t
ue, Mar 24, 2015 at 5:08 PM, Kyle Kastner wrote:
> That said, I would think random forests would get a lot of the
> benefits that deep learning tasks might get, since they also have a
> lot of hyperparameters. Boosting tasks would be interesting as well,
> since swapping the estimator
implement.
On Tue, Mar 24, 2015 at 5:01 PM, Kyle Kastner wrote:
> It might be nice to talk about optimizing runtime and/or training time
> like SMAC did in their paper. I don't see any reason we couldn't do
> this in sklearn, and it might be of value to users since we don't
>
It might be nice to talk about optimizing runtime and/or training time
like SMAC did in their paper. I don't see any reason we couldn't do
this in sklearn, and it might be of value to users since we don't
really do deep learning as Andy said.
On Tue, Mar 24, 2015 at 4:52 PM, Andy wrote:
> On 03/2
I am also interested in Mondrian Forests (and partial_fit methods for
things in general), though I thought one of the issues with implementing
either of these methods was that the way our trees are currently built
would make it hard to extend to these two algorithms.
It is definitely important not to reg
We can probably also email one of the organizers (I think they are
listed on the site?) and find out if we can edit or add an addendum.
It is strange - I am almost 100% positive we could edit the proposals
in past years.
Kyle
On Wed, Mar 11, 2015 at 10:22 AM, Andreas Mueller wrote:
> Unfortunate
I think finding one method is indeed the goal. Even if it is not the best
every time, a 90% solution for 10% of the complexity would be awesome. I
think GPs with parameter space warping are *probably* the best solution but
only a good implementation will show for sure.
Spearmint and hyperopt exist
s are one hour for the first part and two hours for
> > the second part.
> > Was the rest exercises or just not recorded?
>
> > Cheers,
> > Andreas
>
>
> > On 02/25/2015 09:21 PM, Kyle Kastner wrote:
> > > That is a great idea. We should definite
I added some +1s to #4234 and #4325. Surprised the RBM one exists!
That is something that seems to happen a lot with those types of
models and can be tricky to find.
On Tue, Mar 3, 2015 at 2:16 PM, Olivier Grisel wrote:
> Hi all,
>
> We are a bit late on the initial 0.16 beta release schedule bec
first part and two hours for
> the second part.
> Was the rest exercises or just not recorded?
>
> Cheers,
> Andreas
>
>
> On 02/25/2015 09:21 PM, Kyle Kastner wrote:
> > That is a great idea. We should definitely get a list of people who
> > are attending and t
ial days (but I hopefully will make it for the main
> conference),
>Jake
>
> Jake VanderPlas
> Director of Research – Physical Sciences
> eScience Institute, University of Washington
> http://www.vanderplas.com
>
> On Wed, Feb 25, 2015 at 2:38 PM, Kyle Kastner wr
I am working on one now. Hoping to go even if rejected, for sprint and
meeting up
On Wed, Feb 25, 2015 at 9:51 AM, Andy wrote:
> Hey everybody.
> Is anyone going to / submitting talks to scipy?
> My institute (or rather Moore-Sloan) is a sponsor so they'll sent me :)
>
> Cheers,
> Andy
>
> -
There are a lot of ways to speed them up as potential work, but the
interface (and backend code) should be very stable first. Gradient based,
latent variable approximation, low-rank updating, and distributed GP (new
paper from a few weeks ago) are all possible, but would need to be compared
to a ve
GSoC wise it might also be good to look at CCA, PLS etc. for cleanup.
On Feb 12, 2015 2:02 AM, "Kyle Kastner" wrote:
> Plugin vs separate package:
> libsvm/liblinear are plugins whereas "friend" libraries like lightning are
> packages right?
>
> By that defini
Plugin vs separate package:
libsvm/liblinear are plugins whereas "friend" libraries like lightning are
packages right?
By that definition I agree with Gael - standalone packages are best for
that stuff. I don't really know what a "plugin" for sklearn would be
exactly.
On Feb 12, 2015 1:58 AM, "Gae
wrote:
> no i mean external plugin that they have to support - we're hands off. we
> can link to it but that's it - no other guarantees like we've done in the
> past iirc
>
> On Thu, Feb 12, 2015 at 1:48 AM, Kyle Kastner
> wrote:
>
>> Even having a s
let other packages focus on that.
On Feb 12, 2015 1:48 AM, "Kyle Kastner" wrote:
> Even having a separate plugin will require a lot of maintenance. I am -1
> on any gpu stuff being included directly in sklearn. Maintenance for
> sklearn is already tough, and trying to su
Even having a separate plugin will require a lot of maintenance. I am -1 on
any gpu stuff being included directly in sklearn. Maintenance for sklearn
is already tough, and trying to support a huge amount of custom compute
hardware is really, really hard. Ensuring numerical stability between
OS/BLAS
pylearn2 is not even close to sklearn compatible. Small scale recurrent
nets are in PyBrain, but I really think that any seriously usable neural
net type learners are sort of outside the scope of sklearn. Others might
have different opinions, but this is one of the reasons Michael and I
started skl
Could it also be accounting for +/-? Standard deviation is one-sided, right?
On Thu, Feb 5, 2015 at 4:54 PM, Joel Nothman wrote:
> With cv=5, only the training sets should overlap. Is this adjustment still
> appropriate?
>
> On 6 February 2015 at 06:44, Michael Eickenberg <
> michael.eickenb...@g
IncrementalPCA is done (I still have to add a randomized SVD solver, but that
should be simple), but I am sure there are other low-rank methods which need
a partial_fit. I think adding partial_fit functions in general to as many
algorithms as possible would be nice
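The partial_fit pattern itself is pretty simple - a rough sketch with
IncrementalPCA (toy data, untested):
import numpy as np
from sklearn.decomposition import IncrementalPCA
rng = np.random.RandomState(0)
ipca = IncrementalPCA(n_components=10)
# stream the data through in chunks so only one chunk is in memory at a time
for chunk in np.array_split(rng.randn(5000, 100), 50):
    ipca.partial_fit(chunk)
X_new = ipca.transform(rng.randn(10, 100))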
Kyle
On Thu, Feb 5, 2015 at 2:12 PM, Aksh
I think most of the GP-related work is deciding what the sklearn-compatible
interface should be :) - specifically how to handle kernels and how to share
code with the core codebase.
The HODLR solver from George could be very nice for scalability, but the
algorithm is not easy. There are a few other options on that
Sounds like an excellent improvement for usability!
If you could benchmark the time spent and show that it is a noticeable
improvement, that will be crucial. Also showing how bad the approximation is
compared to base t-SNE will be important - though there comes a point where
you can't really compare,
I haven't looked closely, but is Barber's data format considered to be
examples as columns, or examples as rows? That difference is usually what I
see in a bunch of different SVD-based algorithms. It is very annoying when
reading the literature.
a.k.a. what Michael said
On Fri, Dec 5, 2014 at 10:2
1.3M x1.3M which should blow up most current memory
> sizes for all data types. Sparsity in the output can redeem this.
>
> On Thursday, November 27, 2014, Kyle Kastner
> wrote:
>
>> On a side note, I am semi-surprised that allowing the output of the dot
>> to be sparse &
On a side note, I am semi-surprised that allowing the output of the dot to
be sparse "just worked" without crashing the rest of it...
On Thu, Nov 27, 2014 at 12:19 PM, Kyle Kastner
wrote:
> If your data is really, really sparse in the original space, you might
> also look at
If your data is really, really sparse in the original space, you might also
look at taking a random projection (I think projecting onto a sparse SVD
basis would work too?) as preprocessing to "densify" the data before
computing the cosine similarity. You might get a win on feature size with this, dependi
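A rough sketch of what I mean (sizes and n_components made up, untested):
import numpy as np
import scipy.sparse as sp
from sklearn.random_projection import GaussianRandomProjection
from sklearn.metrics.pairwise import cosine_similarity
rng = np.random.RandomState(0)
# very sparse binary data in a big original feature space
X = sp.csr_matrix(rng.binomial(1, 0.001, size=(1000, 5000)).astype(np.float64))
proj = GaussianRandomProjection(n_components=256, random_state=0)
X_dense = proj.fit_transform(X)  # dense, much smaller feature space
S = cosine_similarity(X_dense)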
Gradient based optimization (I think this might be related to the recent
variational methods for GPs) would be awesome.
On Tue, Nov 25, 2014 at 12:54 PM, Mathieu Blondel
wrote:
>
>
> On Wed, Nov 26, 2014 at 2:37 AM, Andy wrote:
>
>>
>> What I think would be great to have is gradient based optim
For the API, the naming is somewhat non-standard, it is not super clear
> what the parameters mean, and it is also not super clear
> whether the kernel-parameters will be optimized for a given parameters
> setting.
>
>
>
>
> On 11/25/2014 12:28 PM, Gael Varoquaux wrote:
>
I have some familiarity with the GP stuff in sklearn, but one of the big
things I really *want* is something much more like George - specifically a
HODLR solver. Maybe it is outside the scope of the project, but I think GPs
in sklearn could be very useful and computationally tractable for "big-ish"
I will be there for everything - glad to meet up before, during, and after!
Be warned, it has already started snowing here and it is pretty cold... it
feels like -10 C today according to weather.com.
On Tue, Nov 18, 2014 at 11:40 AM, Andy wrote:
> Hey.
>
> I'll be there and talking at the machine learning
In addition to the y=None thing, KDE doesn't have a transform or predict
method - and I don't think Pipeline supports score or score_samples. Maybe
someone can comment on this, but I don't think KDE is typically used in a
pipeline.
In this particular case the code *seems* reasonable (and I am surp
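For what it's worth, the way I usually see KDE used is standalone via
score_samples rather than inside a Pipeline - a rough sketch (toy data):
import numpy as np
from sklearn.neighbors import KernelDensity
rng = np.random.RandomState(0)
X = rng.randn(500, 2)
kde = KernelDensity(bandwidth=0.5).fit(X)
log_density = kde.score_samples(X)  # per-sample log density; no transform/predict needed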
contributing more.
>>>
>>> On Sun, Oct 12, 2014 at 5:24 PM, Gael Varoquaux
>>> wrote:
>>>>
>>>> I am happy to welcome new core contributors to scikit-learn:
>>>> - Alexander Fabisch (@AlexanderFabisch)
>>>> - Kyle Kas
To be honest - updating Python packages on CentOS is a nightmare. The
whole OS is pretty strongly dependent on the Python version, which I
believe is up to 2.6 now (2.4 in 5.x!). In my experience CentOS is the
worst Linux OS for development (heavily locked down, hard to add
packages, yum is annoying, e
I started some code here long ago
(https://gist.github.com/kastnerkyle/8143030) that isn't really
finished or cleaned up - maybe it can give you some ideas/advice for
implementing? I never got a chance to clean this up for a PR, and it
doesn't look like I will have time in the near future, so your PR
I agree as well. Maybe default to making everything other than validation
private? Then see what people want to become public? I don't know what
nilearn is using, but that should obviously be public too...
On Mon, Sep 8, 2014 at 5:17 PM, Olivier Grisel wrote:
> +1 as well for the combined proposal of Gael
Shubham,
There are many open improvements on the GitHub issues list
(https://github.com/scikit-learn/scikit-learn/issues?q=is%3Aopen+is%3Aissue+label%3AEasy).
I recommend starting with a few of the Easy or Documentation tasks -
it helps get the workflow down and is also very valuable to the
projec
I have copied and heavily modified the website for use on another project (
https://github.com/sklearn-theano/sklearn-theano), and I would like to
change the color scheme. Does anyone with experience modifying the website
know how to do this? Is there an easier way besides hand-modifying the CSS?
M
This sounds interesting - what do you normally use it for? Do you have
any references for papers to look at?
On Wed, Aug 27, 2014 at 12:44 PM, Mark Stoehr wrote:
> Hi scikit-learn,
>
> Myself and a colleague put together an implementation of the EM algorithm
> for mixtures of multivariate Bernoul
As far as I know, the typical idea is to keep things as readable as
possible, and only optimize the "severe/obvious" type bottlenecks (things
like memory explosions, really bad algorithmic complexity, unnecessary data
copy, etc).
I can't really comment on your "where do the bottlenecks go" questio
I did not see your earlier script... now I am interested. I have been
hacking on it but don't know what is going on yet.
On Thu, Jul 31, 2014 at 4:25 PM, Deepak Pandian
wrote:
> On Thu, Jul 31, 2014 at 7:49 PM, Kyle Kastner
> wrote:
> > It looks like the transpose may make
It looks like the transpose may make the system under-determined. If you
try with
X = np.random.randn(*X.shape)
What happens?
On Thu, Jul 31, 2014 at 4:17 PM, Kyle Kastner wrote:
> What is the shape of X
>
>
> On Thu, Jul 31, 2014 at 4:14 PM, Deepak Pandian > wrote:
>
What is the shape of X
On Thu, Jul 31, 2014 at 4:14 PM, Deepak Pandian
wrote:
> On Thu, Jul 31, 2014 at 7:31 PM, Olivier Grisel
> wrote:
> > The sign of the components is not deterministic. The absolute values
> > should be the same.
>
> But the last component differs even in absolute values,
OK - what is the result of X.shape and X.dtype? What is X?
On Thu, Jul 10, 2014 at 1:55 PM, Sheila the angel
wrote:
> Yes, the error is in fit(X,y)
>
> clf.fit(X,y)
>
> ---
> Traceback (most recent call last):
>
> File ""
What was the error? Posting a traceback would help us help you.
On Thu, Jul 10, 2014 at 11:45 AM, Sheila the angel
wrote:
> What is the correct way to use different metric in KNeighborsClassifier ?
>
> I tried this
>
> clf = KNeighborsClassifier(metric="mahalanobis").fit(X, y)
>
> which give
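Without the traceback I am only guessing, but mahalanobis usually needs the
covariance passed in via metric_params - a rough sketch of the kind of thing
that has worked for me (toy data, untested):
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
rng = np.random.RandomState(0)
X = rng.randn(100, 4)
y = rng.randint(0, 2, size=100)
clf = KNeighborsClassifier(
    metric='mahalanobis',
    metric_params={'V': np.cov(X, rowvar=False)},  # feature covariance
    algorithm='ball_tree')
clf.fit(X, y)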
It looks like fit_params are passed wholesale to the classifier being fit -
this means the sample weights will be a different size than the fold of (X,
y) fed to the classifier (since the weights aren't getting KFolded...).
Unfortunately I do not see a way to accommodate this currently -
sample_
s kind of
stuff... hopefully someone who knows the web server can help.
On Mon, Jul 7, 2014 at 12:05 PM, Nelle Varoquaux
wrote:
>
>
>
> On 7 July 2014 10:01, Kyle Kastner wrote:
>
>> If by symlink you are talking about Linux symlinks (dunno if there are
>> others)
I have been doing a lot of work on tensors, and there are many different
datasets which have this property. Places to look are image tracking,
tensor decomposition, multi-way analysis, chemistry(!), physics, etc. A
quick list of some sites I have put together:
http://three-mode.leidenuniv.nl/
http:
If by symlink you are talking about Linux symlinks (dunno if there are
others), you can do:
ln -sf source dest
i.e.
ln -sf /path/to/0.14/docs current_place
to force update it, but buyer beware.
On Mon, Jul 7, 2014 at 11:11 AM, Nelle Varoquaux
wrote:
> Hello everyone,
>
> A couple of months
You should probably read the paper: Training Highly Multiclass Classifiers
http://jmlr.org/papers/v15/gupta14a.html
That said, I think you could gain a lot of value by looking into
hierarchical approaches - training a bunch of small classifiers on subsets
of the overall data to subselect the "righ
The easiest way is to just map them yourself with some Python code after
LabelEncoder - this type of mapping is generally application specific.
Something like:
a[a == 0] = 100
a[a == 1] = 150
a[a == 2] = 155
will do the trick. For many labels, you could loop through a dictionary you
make and set
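Roughly (values made up):
import numpy as np
a = np.array([0, 1, 2, 1, 0])
mapping = {0: 100, 1: 150, 2: 155}
b = a.copy()
for old, new in mapping.items():
    b[a == old] = new  # index on the original array so remapped values can't collide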
Is this necessary for new PCA methods as well? In other words, should I add
an already deprecated constructor arg to IncrementalPCA as well, or just do
the whitening inverse_transform the way it will be done in 0.16 and on?
On Mon, Jun 30, 2014 at 3:20 PM, Gael Varoquaux <
gael.varoqu...@normales
You need to set n_components to something smaller than n_features (where
n_features is X.shape[1] if X is your data matrix) for PCA - by default it
does not drop any components, and simply projects to another space. A lot
of examples use n_components=2, then do a scatter plot to see the
separation
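Something along these lines (toy data, untested):
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
rng = np.random.RandomState(0)
X = rng.randn(200, 10)
pca = PCA(n_components=2)  # keep only the top 2 components
X_2d = pca.fit_transform(X)  # shape (200, 2) instead of (200, 10)
plt.scatter(X_2d[:, 0], X_2d[:, 1])
plt.show()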
Sent you an email - I know of at least one possibility.
On Fri, Jun 20, 2014 at 2:46 PM, Andy wrote:
> Hey Everyone.
>
> Does anyone by any chance have a spare bed / couch?
>
> Cheers,
> Andy
>
> On 06/08/2014 01:47 PM, Alexandre Gramfort wrote:
> > hi everyone,
> >
> > time to reactivate this
Do you have any references for this technique? What is it typically used
for?
On Wed, Jun 18, 2014 at 12:26 PM, Dayvid Victor
wrote:
> Hi there,
>
> Is anybody working on an Instance Reduction module for sklearn?
>
> I started working on those and I already have more than 10 IR (PS and PG)
> al
ange/18760-partial-least-squares-and-discriminant-analysis/content/html/learningpcapls.html#3
for more details
On Thu, Jun 5, 2014 at 12:04 PM, Kyle Kastner wrote:
> I am planning to work on NIPALS after the 0.15 sklearn release - there
> are several good papers I will try to work wi
I am planning to work on NIPALS after the 0.15 sklearn release - there are
several good papers I will try to work with and implement.
Simple, high level description:
http://www.vias.org/tmdatanaleng/dd_nipals_algo.html
Simple MATLAB (I will likely start with this first):
http://www.cdpcenter.org/fi
;ve definitely run regressions
> with larger matrices in the past, and haven't had issues before. This is on
> a cluster with ~94 gigs of ram, and in the past I've exceeded this limit
> and it has usually thrown an error (one of our sysadmin's scripts), not
> silently hung.
>
ution :)
On Tue, May 27, 2014 at 3:48 PM, Kyle Kastner wrote:
> What is your overall memory usage like when this happens? Sounds like
> classic memory swapping/thrashing to me - what are your system specs?
>
> One quick thing to try might be to change the dtype of the matrices to
> save
What is your overall memory usage like when this happens? Sounds like
classic memory swapping/thrashing to me - what are your system specs?
One quick thing to try might be to change the dtype of the matrices to save
some space. float32 vs float64 can make a large memory difference if you
don't nee
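Concretely, the cast is just something like (sizes made up):
import numpy as np
X = np.random.randn(10000, 1000)  # float64 by default, roughly 80 MB
X32 = X.astype(np.float32)  # roughly 40 MB, if the reduced precision is acceptable
print(X.nbytes, X32.nbytes)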
This looks like it would fix the issue with auto-chosen n_nonzero_coefs -
which is great! After reading the paper mentioned in the docstring, I can
see where the Gram matrix calculation is coming from now, but I think the
check
if tol is None and n_nonzero_coefs > len(Gram):
    raise ValueError
Additionally, why is 'gram = np.dot(dictionary, dictionary.T)' used for the
OMP? According to the docstring on orthogonal_mp_gram it should be X.T * X.
This would also make gram (n_features, n_features) and make the size
checks work...
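For reference, this is how I read the orthogonal_mp_gram docstring - a rough
sketch (toy shapes, untested):
import numpy as np
from sklearn.linear_model import orthogonal_mp_gram
rng = np.random.RandomState(0)
X = rng.randn(100, 30)  # (n_samples, n_features)
y = rng.randn(100)
Gram = np.dot(X.T, X)  # (n_features, n_features)
Xy = np.dot(X.T, y)
coef = orthogonal_mp_gram(Gram, Xy, n_nonzero_coefs=5)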
On Thu, May 15, 2014 at 10:01 AM, Kyle Kas
this particular case. I hit
this problem when I started testing on a different dataset - I got the
initial implementation working on images, and am now trying to use it for signals.
Thanks for looking into this - hopefully it is just confusion on my part
On Thu, May 15, 2014 at 1:55 AM, Gael Varoquaux <
gael.v
I am having some issues with sparse_encode, and am not sure if it is a bug
or my error. In implementing a KSVDCoder, I have gotten something which
appears to work on one dataset. However, when I swap to a different
dataset, I begin to get errors, specifically:
ValueError: The number of atoms cann
0, 2, 0, 0, 1, 1, 1, 1],
>> dtype=int64),
>> array([0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0]))
>>
>> In this example, signal 0 should've never shown up (since its
>> probability is 0 in both states). However, it is the only signal emitted by
>>
If you are manually specifying the emission probabilities, I don't think
there are any hooks/asserts to guarantee that variable is normalized, i.e.
if you assign to emissionprob_ instead of using the fit() function, I
think it is on you to make sure the emission probabilities you are
assigning *a
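Something like this before assigning (values made up):
import numpy as np
# rows = hidden states, columns = observed symbols
emissionprob = np.array([[0.0, 2.0, 1.0, 1.0],
                         [1.0, 0.0, 3.0, 1.0]])
# each row must sum to 1 before it goes into model.emissionprob_
emissionprob /= emissionprob.sum(axis=1, keepdims=True)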
work was ever completed. Maybe someone else
has a better knowledge of this.
On Mon, Apr 7, 2014 at 2:39 PM, Kyle Kastner wrote:
> You can also use the python interface to pylearn2, rather than the yaml.
> If you are interested in examples of the python interface for pylearn2, I
> have
You can also use the Python interface to pylearn2, rather than the YAML. If
you are interested in examples of the Python interface for pylearn2, I have
some examples (I greatly prefer the Python interface, but to each their
own):
https://github.com/kastnerkyle/pylearn2-practice/blob/master/cifar10
something like RMSE?
On Thu, Mar 27, 2014 at 9:52 AM, Kyle Kastner wrote:
> This may be an obvious question - but did you try applying a simple
> Hamming, Blackman-Harris, etc. window to the data? Before trying EMD?
>
> Pretty much every transform (FFT included) has edge effect pro
This may be an obvious question - but did you try applying a simple
Hamming, Blackman-Harris, etc. window to the data before trying EMD?
Pretty much every transform (FFT included) has edge-effect problems if the
signal is not exactly at a periodic boundary, and it sounds like the SVR
prediction w
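The cheap thing to try first would be something like (toy signal):
import numpy as np
rng = np.random.RandomState(0)
signal = rng.randn(1024)
# taper the edges so the segment looks periodic before transforming it
windowed = signal * np.hamming(len(signal))
spectrum = np.fft.rfft(windowed)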