Re: Spark ML Pipeline inaccessible types

zapletal-martin Wed, 25 Mar 2015 04:16:14 -0700

Sean,

thanks for your response. I am familiar with NoSuchMethodException in
general, but I do not think a version mismatch is the cause this time. The
code actually looks the parameter up by name via reflection, using
val m = this.getClass.getMethod(paramName).
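
To make the masking effect concrete, here is a minimal sketch that does not
depend on Spark. It is a simplified stand-in, not Spark's source: Stage,
featuresCol, failureOf and the string-based type check are hypothetical names
made up for illustration. Only Class.getMethod and Predef.require are real
library calls. It assumes, as described in my first email, that the column
name (rather than a parameter name) ends up being passed to the reflective
lookup inside the require message.

```scala
// Simplified stand-in for the behaviour described in this thread; not Spark's source.
object ParamLookupSketch {

  // Hypothetical stage with a single parameter accessor that reflection can find.
  class Stage {
    def featuresCol: String = "name of the features column"

    // Looks a parameter up by the name of its accessor method.
    def getParam(paramName: String): AnyRef = {
      // Throws NoSuchMethodException when no public method has that name.
      val m = this.getClass.getMethod(paramName)
      m.invoke(this)
    }

    // The message argument of require is by-name, so getParam runs only when the check fails.
    def checkInputColumn(actualType: String, expectedType: String, colName: String): Unit =
      require(actualType == expectedType,
        s"Column $colName must be $expectedType but was $actualType: ${getParam(colName)}")
  }

  // Runs the check and reports the class name of whatever exception escaped, if any.
  def failureOf(actualType: String): Option[String] =
    try {
      new Stage().checkInputColumn(actualType, "vector", "myFeaturesColumnName")
      None
    } catch {
      case e: Exception => Some(e.getClass.getName)
    }

  def main(args: Array[String]): Unit = {
    println(failureOf("vector")) // types match, nothing is thrown
    println(failureOf("double")) // types differ, the reflective lookup fails first
  }
}
```

In the mismatching case the IllegalArgumentException that require would
normally raise never appears, because building its message already fails
inside getMethod.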




This may be a bug, but it is only a side effect of the real problem I am
facing. My issue is that VectorUDT is not accessible from user code, and
therefore it is not possible to use a custom ML pipeline with the existing
Predictors (see the last two paragraphs of my first email).

Best Regards,

Martin



---------- Original message ----------
From: Sean Owen <[email protected]>
To: [email protected]
Date: 25 Mar 2015 11:05:54
Subject: Re: Spark ML Pipeline inaccessible types

"NoSuchMethodError in general means that your runtime and compile-time
environments are different. I think you need to first make sure you
don't have mismatching versions of Spark.

On Wed, Mar 25, 2015 at 11:00 AM, <[email protected]> wrote:
> Hi,
>
> I have started implementing a machine learning pipeline using Spark 1.3.0
> and the new pipelining API and DataFrames. I got to a point where I have my
> training data set prepared using a sequence of Transformers, but I am
> struggling to actually train a model and use it for predictions.
>
> I am getting a java.lang.NoSuchMethodException:
> org.apache.spark.ml.regression.LinearRegression.myFeaturesColumnName()
> thrown by the checkInputColumn method in the Params trait when using a
> Predictor (LinearRegression in my case, but that should not matter). This
> looks like a bug: the exception is thrown when executing getParam(colName)
> after the require(actualDataType.equals(datatype), ...) check has failed, so
> the expected 'requirement failed' exception is never thrown and is hidden by
> the unexpected NoSuchMethodException instead. I can raise a bug if this
> really is an issue and I am not using something incorrectly.
>
> The problem I am facing, however, is that the Predictor expects features to
> have the VectorUDT type, as defined in the Predictor class (protected def
> featuresDataType: DataType = new VectorUDT). But since this type is
> private[spark], my Transformer cannot prepare features with this type, which
> then correctly results in the exception above when I use a different type.
>
> Is there a way to define a custom Pipeline that would be able to use the
> existing Predictors without having to bypass the access modifiers or
> reimplement something, or is the pipelining API not yet expected to be used
> in this way?
>
> Thanks,
> Martin
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]"
