Will do, thanks Gael. Enjoy your vacation!
Jake
On Wed, Feb 27, 2013 at 12:12 PM, Gael Varoquaux <
gael.varoqu...@normalesup.org> wrote:
> On Wed, Feb 27, 2013 at 11:34:43AM -0800, Jacob Vanderplas wrote:
> > Well, since communication time is limited, I'd be happy to work on a
> proposal
> >
On Wed, Feb 27, 2013 at 11:34:43AM -0800, Jacob Vanderplas wrote:
> Well, since communication time is limited, I'd be happy to work on a proposal
> on my own and put your name on it as well, if you trust me to do that without
> you having a chance to read it. Or will you be back before the March 30th deadline?
This does require Sphinx, though. Do you think we should make a
downloadable copy available at release time?
On Wed, Feb 27, 2013 at 5:44 PM, Andreas Mueller
wrote:
> On 02/27/2013 03:47 PM, Lars Buitinck wrote:
>> 2013/2/27 Dustin Arendt :
>>> I work at a lab where our research machines are compl
The Patagonia trip, yes?
Well, since communication time is limited, I'd be happy to work on a
proposal on my own and put your name on it as well, if you trust me to do
that without you having a chance to read it. Or will you be back before
the March 30th deadline?
Jake
On Wed, Feb 27, 2013 at
On Wed, Feb 27, 2013 at 10:22:02AM -0800, Jacob Vanderplas wrote:
> Let's wait a bit to hear if others are interested, and then I'll start an
> off-list email chain to discuss ideas.
I'll probably be gone on vacation by then: I am leaving in less than
24 hours (and completely crushed with thing
Using the sample_weight parameter of RandomForestClassifier, together with
the balance_weights function from the preprocessing module to generate the
sample weights, might work as well.
You can check this link for a previous related discussion.
http://sourceforge.net/mailarchive/message.php?msg_id
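In case it helps, here is a minimal sketch of that idea. Rather than calling
balance_weights directly, the helper below just computes per-sample weights
inversely proportional to class frequency (which is what balance_weights does)
and passes them to fit; the helper name and the toy data are purely illustrative.

# Sketch: per-sample weights inversely proportional to class frequency,
# passed to RandomForestClassifier.fit via the sample_weight parameter.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def balanced_sample_weights(y):
    # Each class gets total weight len(y) / n_classes, spread over its samples.
    classes, counts = np.unique(y, return_counts=True)
    class_weight = {c: len(y) / (len(classes) * n)
                    for c, n in zip(classes, counts)}
    return np.array([class_weight[label] for label in y])

rng = np.random.RandomState(0)
X = rng.rand(100, 5)                      # toy features
y = np.array([0] * 90 + [1] * 10)         # imbalanced toy classes

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y, sample_weight=balanced_sample_weights(y))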
No, I installed numpy from the official website (with scipy).
2013/2/27 Andreas Mueller
> On 02/27/2013 12:08 PM, Vlad Niculae wrote:
> > The second run ShNaYkHs posted looks like a good install though,
> > despite the test failure.
> Are these your binaries on that website?
> It also has 64bit
No, my binaries are only on sourceforge and pypi.
Vlad
On Wed, Feb 27, 2013 at 5:46 PM, Andreas Mueller
wrote:
> On 02/27/2013 12:08 PM, Vlad Niculae wrote:
>> The second run ShNaYkHs posted looks like a good install though,
>> despite the test failure.
> Are these your binaries on that website?
Thanks for the clarification.
I have to create clusters vis-a-vis a dependent variable. I can't use
forests because I lose the structure. The rules I create from R score 10K
segments a second, about 1 billion a day.
The ideal algorithm will have the properties of a dtree: variable selection,
robust a
Gael,
That would be great!
Let's wait a bit to hear if others are interested, and then I'll start an
off-list email chain to discuss ideas.
Jake
On Wed, Feb 27, 2013 at 10:17 AM, Gael Varoquaux <
gael.varoqu...@normalesup.org> wrote:
> On Wed, Feb 27, 2013 at 10:04:25AM -0800, Jacob Vanderpla
On Wed, Feb 27, 2013 at 10:04:25AM -0800, Jacob Vanderplas wrote:
> Is anyone planning to submit a scikit-learn tutorial proposal? I'm planning
> to
> attend the conference; I'd be happy to prepare another tutorial myself, or to
> team-teach with someone else who is interested.
I was thinking th
2013/2/27 David Montgomery :
> OK, now I am really confused about how to interpret the tree.
>
> So... I am trying to build a probability estimation tree. All of the independent
> variables are categorical and I created dummies. What is throwing me off are
> the <= thresholds.
>
> I should have a rule that says, e.g., if city=LA
Hi folks,
The call for tutorial & talk proposals for Scipy 2013 is open, and tutorial
proposals are due by the end of March. The themes for Scipy 2013 include
Machine Learning -- see the info here:
http://conference.scipy.org/scipy2013/tutorial_overview.php
I've talked to Francesc, who is the tutor
2013/2/27 David Montgomery :
> OK, now I am really confused about how to interpret the tree.
>
> So... I am trying to build a probability estimation tree. All of the independent
> variables are categorical and I created dummies. What is throwing me off are
> the <= thresholds.
>
> I should have a rule that says, e.g., if city=LA
OK, now I am really confused about how to interpret the tree.
So... I am trying to build a probability estimation tree. All of the independent
variables are categorical and I created dummies. What is throwing me off are
the <= thresholds.
I should have a rule that says, e.g., if city=LA,NY and TIME=Noon then .20.
In the cha
On 02/27/2013 12:08 PM, Vlad Niculae wrote:
> The second run ShNaYkHs posted looks like a good install though,
> despite the test failure.
Are these your binaries on that website?
It also has 64-bit versions.
They say they require the MKL numpy install.
Shnaykhs: did you install the MKL numpy from t
On 02/27/2013 03:47 PM, Lars Buitinck wrote:
> 2013/2/27 Dustin Arendt :
>> I work at a lab where our research machines are completely isolated from the
>> internet. I was hoping to be able to download a complete version of the
>> scikit-learn htmldoc to host on our internal webserver. However, t
On 02/27/2013 04:48 PM, ShNaYkHs ShNaYkHs wrote:
> Is it possible to get a confidence value or a probability (P(y|x))
> that the class y predicted for a given data-point x is correct? Using
> any of these classifiers from sklearn: tree, GaussianNB (naive
> Bayes), KNeighborsClassifier, svm (svm
Looks good to me - save the output to a file (e.g. foobar.dot) and run
the following command:
$ dot -Tpdf foobar.dot -o foobar.pdf
When I open the PDF all labels are correctly displayed - remember that
they are now indicator features - so the thresholds are usually
"country=AU <= 0.5".
You c
Is it possible to get a confidence value or a probability (P(y|x)) that the
class y predicted for a given data-point x is correct? Using any of these
classifiers from sklearn: tree, GaussianNB (naive
Bayes), KNeighborsClassifier, svm (svm.SVC ..).
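For what it's worth, a minimal sketch of one way to do this with scikit-learn
is the predict_proba method; note that SVC only exposes it when constructed
with probability=True. The iris data below is just for illustration.

# Sketch: per-class probability estimates P(y|x) via predict_proba.
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

for clf in (DecisionTreeClassifier(random_state=0), GaussianNB(),
            KNeighborsClassifier(), SVC(probability=True)):
    clf.fit(X, y)
    # Probability of each class for the first sample; columns follow clf.classes_.
    print(clf.__class__.__name__, clf.predict_proba(X[:1]))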
Thanks, I used DictVectorizer().
I am now trying to add labels to the tree graph. Below are the labels and
the digraph Tree. However, I don't see labels on the tree nodes. Did I not
use the feature names correctly?
measurements = [
{'country':'US','city': 'Dubai'},
{'country':'US','city': 'London'}
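A hedged sketch of how the vectorizer's feature names can be handed to the
tree exporter: the third row, the target values and the output file name below
are invented for illustration, and older scikit-learn releases spell the last
call vec.get_feature_names() instead of get_feature_names_out().

# Sketch: label tree nodes with the feature names produced by DictVectorizer.
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier, export_graphviz

measurements = [
    {'country': 'US', 'city': 'Dubai'},
    {'country': 'US', 'city': 'London'},
    {'country': 'AU', 'city': 'Sydney'},
]
y = [0, 1, 1]                               # made-up targets

vec = DictVectorizer()
X = vec.fit_transform(measurements).toarray()

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Passing feature_names makes the nodes read e.g. "city=London <= 0.5".
export_graphviz(clf, out_file='tree.dot',
                feature_names=vec.get_feature_names_out())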
2013/2/27 ShNaYkHs ShNaYkHs :
> For the RandomForestClassifier, the target values for training should be
> integers (that correspond to classes in classification). When I specify the
> labels as strings, I get an exception "ValueError: invalid literal for
> float(): aaa". For the other classifiers
I personally use:
import numpy as np

labels_train = np.genfromtxt('dataset.txt', delimiter=',', usecols=0,
                             dtype=str)
data_train = np.genfromtxt('dataset.txt', delimiter=',')[:, 1:]
(Y is labels_train, X is data_train)
2013/2/27 David Montgomery
> Hi,
>
> I have a data structure that looks like this:
>
> 1 NewYo
For the RandomForestClassifier, the target values for training should be
integers (that correspond to classes in classification). When I specify the
labels as strings, I get an exception "ValueError: invalid literal for
float(): aaa". For the other classifiers (svm, tree, knn, naive Bayes, etc.) I
can
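One common workaround, sketched below with made-up features and labels, is to
encode the string targets as integers with LabelEncoder before fitting and map
predictions back afterwards; recent scikit-learn releases accept string targets
directly, so on a newer install this may not be needed at all.

# Sketch: encode string class labels as integers for RandomForestClassifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder

rng = np.random.RandomState(0)
X = rng.rand(6, 3)                                       # toy features
y = np.array(['aaa', 'bbb', 'aaa', 'ccc', 'bbb', 'aaa'])

le = LabelEncoder()
y_encoded = le.fit_transform(y)                          # [0, 1, 0, 2, 1, 0]

clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y_encoded)

# Map integer predictions back to the original string labels.
print(le.inverse_transform(clf.predict(X[:2])))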
2013/2/27 Dustin Arendt :
> I work at a lab where our research machines are completely isolated from the
> internet. I was hoping to be able to download a complete version of the
> scikit-learn htmldoc to host on our internal webserver. However, the only
> htmldoc on sourceforge is for the 0.7 ve
Hi,
I work at a lab where our research machines are completely isolated from
the internet. I was hoping to be able to download a complete version of
the scikit-learn htmldoc to host on our internal webserver. However, the
only htmldoc on sourceforge is for the 0.7 version (though the PDF version
Hi David,
I recommend that you load the data using Pandas (``pandas.read_csv``).
Scikit-learn does not support categorical features out-of-the-box; you
need to encode them as dummy variables (aka one-hot encoding) - you
can do this either using ``sklearn.feature_extraction.DictVectorizer`` or
via ``pan
Hi,
I have a data structure that looks like this:
1 NewYork 1 6 high
0 LA 3 4 low
...
I am trying to predict a probability, where Y is column one. All of the
attributes of X are categorical, and I will use a dtree regression. How
do I load this data into y and X?
Thanks
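Along the lines of the pandas suggestion above, here is a rough sketch of one
way to load rows like these; the file name and column names are made up for
illustration, and the separator is assumed to be whitespace as in the sample.

# Sketch: read whitespace-separated data with pandas and dummy-code the
# categorical columns before fitting a decision tree regressor.
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

cols = ['target', 'city', 'a', 'b', 'level']            # hypothetical names
df = pd.read_csv('data.txt', sep=r'\s+', header=None, names=cols)

y = df['target']
X = pd.get_dummies(df.drop(columns='target'))           # one-hot 'city', 'level'

reg = DecisionTreeRegressor().fit(X, y)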
The second run ShNaYkHs posted looks like a good install though,
despite the test failure.
On Wed, Feb 27, 2013 at 11:06 AM, Vlad Niculae wrote:
> I built the binaries, is this because of the version of numpy I
> compiled against?
>
> On Tue, Feb 26, 2013 at 4:51 PM, ShNaYkHs ShNaYkHs wrote:
>>
I built the binaries, is this because of the version of numpy I
compiled against?
On Tue, Feb 26, 2013 at 4:51 PM, ShNaYkHs ShNaYkHs wrote:
> Now I re-installed numpy, scipy, matplotlib and scikit-learn from
> http://www.lfd.uci.edu/~gohlke/pythonlibs/#scikit-learn, I chose the
> versions ending
On Tue, 2013-02-26 at 15:21 +0100, Lars Buitinck wrote:
>
> I'm all in favor of that, but we have so many different estimators
> that special-casing Pipeline for all (kinds of) them is infeasible. So
> we should come up with an elegant and general set of rules, which we
> can then implement by e.
Hi David,
I think you should have a look at sklearn.tree.export_graphviz. It
will generate a picture of the tree for you.
- Reference:
http://scikit-learn.org/dev/modules/generated/sklearn.tree.export_graphviz.html#sklearn.tree.export_graphviz
- Example: http://scikit-learn.org/dev/_images/iris.s
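For reference, a minimal sketch in the spirit of the linked example; the iris
data and the output file name are only illustrative.

# Sketch: export a fitted decision tree to Graphviz .dot format.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

export_graphviz(clf, out_file='iris_tree.dot',
                feature_names=iris.feature_names,
                class_names=iris.target_names)
# Then render with Graphviz:  dot -Tpdf iris_tree.dot -o iris_tree.pdf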