GitHub user etrain commented on the pull request:

    https://github.com/apache/spark/pull/79#issuecomment-39392123
  
    Hi Hirakendu - thanks for all the detailed suggestions and information. I will reply to those separately.

    One question - you say there are 500,000 examples and that this equates to 90GB of raw data. If that's the case, this works out to ~200KB per example. Is that right, or are you off by an order of magnitude in either the number of features or the number of data points? Or are we throwing a bunch of data out before fitting?
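
    For reference, here is a quick back-of-the-envelope sketch of that estimate. It assumes decimal units (1GB = 10^9 bytes, 1KB = 10^3 bytes) and, purely as an illustration, that each feature is a dense 8-byte double; the actual storage format of the raw data is not stated in this thread, so the implied feature count is only a rough guide.

```python
# Back-of-the-envelope check of the per-example size discussed above.
# Assumption: decimal units, i.e. 1GB = 10**9 bytes and 1KB = 10**3 bytes.
raw_bytes = 90 * 10**9    # 90GB of raw data, as quoted
num_examples = 500_000    # 500,000 examples, as quoted

bytes_per_example = raw_bytes / num_examples
kb_per_example = bytes_per_example / 10**3

# Assumption (hypothetical): features are dense 8-byte doubles.
# Raw text or sparse formats would give a different count.
implied_features = bytes_per_example / 8

print(f"{kb_per_example:.0f} KB per example")
print(f"{implied_features:.0f} dense double features per example (if applicable)")
```

    Under those assumptions the figure is about 180KB per example, consistent with the ~200KB estimate, which would imply on the order of tens of thousands of dense features per example.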

