Hi,

I am not sure whether Spark DataFrames apply to your use case. If they do,
please try creating a UDF in Python and check whether you can call it from
Scala using select and expr.
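
A minimal sketch of that experiment, assuming PySpark is installed and that the Python and Scala code share one SparkSession (for example in a notebook environment); a UDF registered from Python would not be visible to a separate Scala application. The names `full_name`, `register_full_name`, and `demo`, and the columns `first` and `last`, are hypothetical.

```python
def full_name(first, last):
    """Plain Python transformation: combine two column values into one."""
    if first is None or last is None:
        return None
    return f"{first.strip()} {last.strip()}"


def register_full_name(spark):
    """Register full_name as a SQL-callable UDF on an existing SparkSession."""
    # Imported here so the plain function above stays usable without PySpark.
    from pyspark.sql.types import StringType

    spark.udf.register("full_name", full_name, StringType())


def demo():
    """Create a local session, register the UDF, and call it through expr."""
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import expr

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    register_full_name(spark)
    df = spark.createDataFrame([("Ada", "Lovelace")], ["first", "last"])
    # The UDF is referenced by its registered name inside a SQL expression.
    df.select(expr("full_name(first, last) AS name")).show()
    spark.stop()
```

If the session really is shared, the same expression string, expr("full_name(first, last)"), is what you would try from the Scala side; whether it resolves there is exactly what needs checking.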

Regards,
Gourav Sengupta

On Mon, Jul 16, 2018 at 5:32 AM, Chetan Khatri <chetan.opensou...@gmail.com>
wrote:

> Hello Jayant,
>
> Thanks for the great OSS contribution :)
>
> On Thu, Jul 12, 2018 at 1:36 PM, Jayant Shekhar <jayantbaya...@gmail.com>
> wrote:
>
>> Hello Chetan,
>>
>> Sorry, I missed replying earlier. You can find some sample code here:
>>
>> http://sparkflows.readthedocs.io/en/latest/user-guide/python/pipe-python.html
>>
>> We will continue adding more there.
>>
>> Feel free to ping me directly in case of questions.
>>
>> Thanks,
>> Jayant
>>
>>
>> On Mon, Jul 9, 2018 at 9:56 PM, Chetan Khatri <
>> chetan.opensou...@gmail.com> wrote:
>>
>>> Hello Jayant,
>>>
>>> Thank you so much for the suggestion. My idea was to use a Python function as
>>> a transformation that takes a couple of column names and returns an object,
>>> as you explained. Would it be possible to point me to a similar example
>>> codebase?
>>>
>>> Thanks.
>>>
>>> On Fri, Jul 6, 2018 at 2:56 AM, Jayant Shekhar <jayantbaya...@gmail.com>
>>> wrote:
>>>
>>>> Hello Chetan,
>>>>
>>>> We have currently done it with .pipe(.py) as Prem suggested.
>>>>
>>>> That passes the RDD as CSV strings to the Python script. The Python
>>>> script can either process the input line by line and return the results,
>>>> or build something like a Pandas DataFrame for processing and then
>>>> write the results back.
>>>>
>>>> In the Spark/Scala/Java code, you get an RDD of strings, which we
>>>> convert back to a DataFrame.
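
The line-by-line variant described above could be sketched as follows. This is only an illustration, not the code Jayant's team uses: the `transform` rule (appending the length of the first field) and the function names are hypothetical, and it assumes each RDD element arrives as one CSV line on standard input.

```python
import csv
import sys


def transform(row):
    """Hypothetical transformation: append the length of the first field."""
    return row + [str(len(row[0]))]


def process(lines, out):
    """Read CSV lines, transform each row, and write CSV rows to out."""
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(lines):
        if row:  # skip empty lines
            writer.writerow(transform(row))


def main():
    """Entry point for use with pipe: stdin carries the RDD elements."""
    process(sys.stdin, sys.stdout)
```

To use this as the piped script, end the file with a call to main(). Each line written to standard output becomes one string element of the resulting RDD, which the Scala side can then parse back into a DataFrame.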
>>>>
>>>> Feel free to ping me directly in case of questions.
>>>>
>>>> Thanks,
>>>> Jayant
>>>>
>>>>
>>>> On Thu, Jul 5, 2018 at 3:39 AM, Chetan Khatri <
>>>> chetan.opensou...@gmail.com> wrote:
>>>>
>>>>> Prem, sure. Thanks for the suggestion.
>>>>>
>>>>> On Wed, Jul 4, 2018 at 8:38 PM, Prem Sure <sparksure...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Try .pipe(.py) on the RDD.
>>>>>>
>>>>>> Thanks,
>>>>>> Prem
>>>>>>
>>>>>> On Wed, Jul 4, 2018 at 7:59 PM, Chetan Khatri <
>>>>>> chetan.opensou...@gmail.com> wrote:
>>>>>>
>>>>>>> Can someone please advise? Thanks.
>>>>>>>
>>>>>>> On Tue 3 Jul, 2018, 5:28 PM Chetan Khatri, <
>>>>>>> chetan.opensou...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello Dear Spark User / Dev,
>>>>>>>>
>>>>>>>> I would like to pass a Python user-defined function to a Spark job
>>>>>>>> developed in Scala, with the function's return value returned to the
>>>>>>>> DF / Dataset API.
>>>>>>>>
>>>>>>>> Can someone please guide me on the best approach to do this? The Python
>>>>>>>> function would mostly be a transformation function. I would also like to
>>>>>>>> pass a Java function as a String to the Spark / Scala job, have it applied
>>>>>>>> to an RDD / DataFrame, and get an RDD / DataFrame back.
>>>>>>>>
>>>>>>>> Thank you.
