Hi Leon,
When I run your script, I get no instances of NaN in the data.

I wonder if it's a problem with storing the data as an .npy file. I asked around last spring and everybody seemed to think that the format is compatible across platforms and numpy versions, but I may be wrong. Does anybody know?
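One way to narrow this down would be to compare what each machine actually reads from the file. The sketch below is only an illustration: `describe_npy` is a hypothetical helper, the small array stands in for sdss_photoz.npy with made-up values, and it assumes every field is numeric. If the checksums match on both machines but the NaN counts differ, the download is fine and the problem is in how the file is read.

```python
import hashlib
import os
import tempfile

import numpy as np


def describe_npy(path):
    """Return a SHA-256 checksum, the dtype description, and per-field NaN counts."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    data = np.load(path)
    # Assumes a structured array whose fields are all numeric.
    nan_counts = {name: int(np.isnan(data[name]).sum())
                  for name in data.dtype.names}
    return digest, data.dtype.descr, nan_counts


# Stand-in for sdss_photoz.npy: a tiny structured array with made-up values.
demo = np.zeros(3, dtype=[("u", "<f8"), ("g", "<f8")])
demo["u"] = [19.2, np.nan, 18.7]
demo["g"] = [18.1, 17.9, 17.5]

path = os.path.join(tempfile.mkdtemp(), "demo.npy")
np.save(path, demo)

digest, descr, nan_counts = describe_npy(path)
print(digest)
print(descr)
print(nan_counts)
```

Running the same function on the real file on both machines and comparing the three values should show whether the bytes, the dtype, or the contents differ.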
   Jake

On 11/15/2012 07:30 PM, Leon Palafox wrote:
Hello Jake,

The error is easy to reproduce. After downloading the sdss_photoz data with the fetch_data script, run:

import numpy as np

data = np.load('./sklearn_tutorial/doc/data/sdss_photoz/sdss_photoz.npy')
print(data.dtype.names)
#########################################################
N = len(data)
X = np.zeros((N, 4))
for i in range(N):
    # u - g color for row i
    X[i, 0] = data['u'][i] - data['g'][i]
    if np.isnan(X[i, 0]):
        # show the offending magnitudes, then wait for Enter
        print(data['u'][i], data['g'][i])
        input()

It's just a messy script that loops over each entry in the data file and stops whenever it hits a NaN, but you'll see there are many NaNs along the way. I haven't tried using the SQL script; perhaps I'll do that later when I have more time. The data with the redshifts also has several NaNs.
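The same check can be done without the loop, which makes it easier to report how many rows are affected. This is a minimal sketch, not part of the tutorial: `nan_rows` is a hypothetical helper and the small array uses made-up values in place of the real file.

```python
import numpy as np


def nan_rows(data, a="u", b="g"):
    """Return the indices of rows where data[a] - data[b] is NaN."""
    color = data[a] - data[b]
    return np.flatnonzero(np.isnan(color))


# Made-up stand-in for the sdss_photoz array.
demo = np.zeros(4, dtype=[("u", "f8"), ("g", "f8")])
demo["u"] = [19.2, np.nan, 18.7, 20.1]
demo["g"] = [18.1, 17.9, np.nan, 19.0]

bad = nan_rows(demo)
print(len(bad), "rows with NaN in u - g")
print(demo[bad])
```

With the real file, `nan_rows(np.load(...))` would give the row indices directly, so the count and the offending values can be pasted into a reply.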

Note: I downloaded the data several times at 7:00 pm Japan Standard Time, and again this noon at 12:30 JST

Thanks

Leon


On Fri, Nov 16, 2012 at 2:26 AM, Jake Vanderplas <[email protected]> wrote:

    Hi Leon,
    I haven't run into any NaN issues, or heard of anyone else having
    that problem.  Can you send the traceback for the specific error
    you're getting?  Thanks
       Jake


    On 11/15/2012 04:14 AM, Jaques Grobler wrote:
    Hi Leon -

    I didn't encounter this back when I looked at it.
    I think @JacobVanderPlas would be the best person to ask, since he
    put that tutorial together.

    I'm sure he'll be able to help with this.

    ping @jakevp :)

    Regards, J



    2012/11/15 Leon Palafox <[email protected]>


        Hey Guys,

        I was running the data set in the Tree Regression Example from
        the astroml tutorial
        (http://astroml.github.com/sklearn_tutorial/regression.html#a-simple-method-decision-tree-regression)
        and I ran into some NaNs that come from the dataset.

        Has anyone else encountered this issue and, if so, how did
        you solve it? I think it is important to fix, since it
        renders those examples unusable.

        Best

-- Leon Palafox, M.Sc
        PhD Candidate
        Iba Laboratory
        +81-3-5841-8436
        University of Tokyo
        Tokyo, Japan.



        
------------------------------------------------------------------------------
        Monitor your physical, virtual and cloud infrastructure from
        a single
        web console. Get in-depth insight into apps, servers,
        databases, vmware,
        SAP, cloud infrastructure, etc. Download 30-day Free Trial.
        Pricing starts from $795 for 25 servers or applications!
        http://p.sf.net/sfu/zoho_dev2dev_nov
        _______________________________________________
        Scikit-learn-general mailing list
        [email protected]
        <mailto:[email protected]>
        https://lists.sourceforge.net/lists/listinfo/scikit-learn-general




    




--
Leon Palafox, M.Sc
PhD Candidate
Iba Laboratory
+81-3-5841-8436
University of Tokyo
Tokyo, Japan.



