I noticed a few things about Spark transformers and just wanted to be clear on them.

Unary transformer:

createTransformFunc: IN => OUT = { *item* => }
Here *item* is a single element and *NOT* the entire column.

I would like to get the number of elements in that particular column. Since
there is *no forward checking*, how can we get this information?
We only have visibility into a single element, not the entire column.
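
To make the question concrete, here is a minimal sketch of one possible workaround, assuming Spark 2.x: extend Transformer directly instead of UnaryTransformer, since transform() receives the whole Dataset and can call count() before the per-element function is built. The class name DivideByCount and the column names "value" and "scaled" are hypothetical, chosen only for illustration.

```scala
import org.apache.spark.ml.Transformer
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.{DataFrame, Dataset}
import org.apache.spark.sql.functions.{col, udf}
import org.apache.spark.sql.types.{DoubleType, StructField, StructType}

object DivideByCount {
  // Pure per-element logic: sees one value plus the precomputed row count.
  def scale(x: Double, n: Long): Double = x / n
}

// Hypothetical transformer: divides each value by the number of rows.
class DivideByCount(override val uid: String) extends Transformer {
  def this() = this(Identifiable.randomUID("divideByCount"))

  override def transform(dataset: Dataset[_]): DataFrame = {
    // The whole Dataset is visible here, so the row count can be computed
    // once (this triggers a Spark job) and captured by the closure below.
    val n = dataset.count()
    val scaleUdf = udf((x: Double) => DivideByCount.scale(x, n))
    // "value" and "scaled" are assumed column names.
    dataset.withColumn("scaled", scaleUdf(col("value")))
  }

  override def transformSchema(schema: StructType): StructType =
    StructType(schema.fields :+ StructField("scaled", DoubleType, nullable = true))

  override def copy(extra: ParamMap): DivideByCount = defaultCopy(extra)
}
```

With a DataFrame df that has a Double column named "value", this would be used as new DivideByCount().transform(df). The extra count() pass is the cost of getting column-level information that createTransformFunc alone cannot see.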


On Sun, Sep 4, 2016 at 9:30 AM, janardhan shetty <janardhan...@gmail.com>
wrote:

> In Scala, with Spark ML DataFrames.
>
> On Sun, Sep 4, 2016 at 9:16 AM, Somasundaram Sekar <somasundar.sekar@
> tigeranalytics.com> wrote:
>
>> Can you try this
>>
>> https://www.linkedin.com/pulse/hive-functions-udfudaf-udtf-examples-gaurav-singh
>>
>> On 4 Sep 2016 9:38 pm, "janardhan shetty" <janardhan...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Is there any chance that we can send multiple entire columns to a udf
>>> and generate a new column for Spark ML?
>>> I see a similar approach in VectorAssembler, but I am not able to use a few
>>> classes/traits like HasInputCols, HasOutputCol, DefaultParamsWritable since
>>> they are private.
>>>
>>> Any leads/examples are appreciated in this regard.
>>>
>>> Requirement:
>>> *Input*: Multiple columns of a DataFrame
>>> *Output*: Single new modified column
>>>
>>
>
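
For the requirement in the original question quoted above (multiple input columns, one new output column), a plain Spark SQL udf may already be enough, without any of the private ML traits. The following is only a sketch under assumptions: the column names "x", "y", "label" and "combined", the combine function, and the sample rows are all made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object MultiColumnUdf {
  // Hypothetical combining logic: one call receives the values of all
  // input columns for a single row.
  def combine(x: Double, y: Double, label: String): String = s"$label:${x + y}"

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("multi-column-udf")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data with assumed column names.
    val df = Seq((1.0, 2.0, "a"), (3.0, 4.0, "b")).toDF("x", "y", "label")

    // A udf can take several columns as arguments and return one value per row.
    val combineUdf = udf(combine _)

    df.withColumn("combined", combineUdf(col("x"), col("y"), col("label"))).show()

    spark.stop()
  }
}
```

The same withColumn call could be placed inside the transform() method of a custom Transformer if the step needs to live in an ML Pipeline.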
