from:"Alexey Pechorin"

Re: Ideas to put a Spark ML model in production

2016-07-03 Thread Alexey Pechorin

From my personal experience: we read the metadata of the features column in the DataFrame to extract a mapping from feature indices to the original feature names, and use this mapping to translate the model coefficients into a JSON string that maps the original feature names to their
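
A minimal sketch of the index-to-name mapping described above, not the poster's actual code. It assumes the features column carries the usual Spark ML attribute metadata (an "ml_attr" entry whose "attrs" groups hold "idx"/"name" pairs), which in PySpark would typically be read from df.schema["features"].metadata. The function name coefficients_to_json and the sample feature names are hypothetical, and the metadata is hand-written so the example runs without a Spark session.

```python
import json


def coefficients_to_json(features_metadata, coefficients):
    """Return a JSON string mapping original feature names to coefficients.

    features_metadata: dict shaped like the metadata of an ML features column
        (assumed layout: {"ml_attr": {"attrs": {<type>: [{"idx", "name"}, ...]}}}).
    coefficients: sequence of numbers, one per feature index.
    """
    attrs = features_metadata.get("ml_attr", {}).get("attrs", {})

    # Attributes are grouped by type (e.g. "numeric", "binary", "nominal");
    # flatten all groups into one index -> name lookup.
    index_to_name = {}
    for group in attrs.values():
        for attr in group:
            index_to_name[attr["idx"]] = attr.get("name", "feature_%d" % attr["idx"])

    # Fall back to a positional name when an index has no metadata entry.
    named = {
        index_to_name.get(i, "feature_%d" % i): float(c)
        for i, c in enumerate(coefficients)
    }
    return json.dumps(named, sort_keys=True)


# Hand-written stand-in for df.schema["features"].metadata (hypothetical names).
example_metadata = {
    "ml_attr": {
        "attrs": {
            "numeric": [{"idx": 0, "name": "age"}, {"idx": 1, "name": "income"}],
            "binary": [{"idx": 2, "name": "is_member"}],
        },
        "num_attrs": 3,
    }
}

# Stand-in for the fitted model's coefficient vector converted to a list.
example_json = coefficients_to_json(example_metadata, [0.5, -1.25, 2.0])
```

With a real model, the coefficient list would come from the fitted model (for example, a linear model's coefficient vector converted to a plain list), and the resulting JSON string can be stored or served independently of Spark.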

Re: cache datframe

2016-06-16 Thread Alexey Pechorin

What's the reason for your first cache call? It looks like you use the data only once, to transform it, without reusing it, so the first cache call is unnecessary; you need only the second call (and even that depends on the rest of your code).