Hi guys,

A few questions as I progress through my ML learning journey with Ignite:
- I assume I would start by extracting features from the JSON records in my cache into a vectorizer. How does this affect memory usage? If the vectorizer data needs more memory than is available, will the origin cache records be moved to disk? Or will the vectorizer data begin to use swap? Or will I get OOM exceptions?
- Are there any built-in algorithms or recommended strategies for sampling?
- Are there any dataset statistics functions, like those provided by Python's ML libraries, for high-level evaluation of specific features in a dataset (to assess things like missing data, cardinality, min/max, mean, mode, standard deviation, percentiles, etc.)?
- Is there any doc or video tutorial that walks through the complete workflow pipeline for an ML example (encompassing the operations mentioned above)?

Thanks,
Jose

--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/