Hi Ayan,

I'm not an expert on Spark or on the use of dynamic languages on the JVM. I
started a proof of concept for a project yesterday.
The idea is:
  - get some datasets in CSV format (I still need to check what the best
way to parse a CSV in Spark is. Any suggestions?)
  - process this data and create a collection of objects (case class or
Map)
  - pass each object to a set of JavaScript functions, each of which will
calculate a score
  - merge all the generated scores via an arithmetic mean or geometric
mean or ...
  - create a new CSV
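
The score-merging step in the list above could be sketched roughly as follows. This is only an illustration in plain Java: the class name `ScoreMerge` and its method names are hypothetical, and the geometric mean assumes every score is strictly positive.

```java
// Hypothetical helper for merging the per-function scores of one object.
class ScoreMerge {

    // Arithmetic mean: sum of the scores divided by their count.
    static double arithmeticMean(double[] scores) {
        requireNonEmpty(scores);
        double sum = 0.0;
        for (double s : scores) {
            sum += s;
        }
        return sum / scores.length;
    }

    // Geometric mean computed as exp(mean(log(s))), which avoids overflow
    // from multiplying many scores together.
    // Assumption: all scores are strictly positive.
    static double geometricMean(double[] scores) {
        requireNonEmpty(scores);
        double logSum = 0.0;
        for (double s : scores) {
            if (s <= 0.0) {
                throw new IllegalArgumentException("scores must be positive");
            }
            logSum += Math.log(s);
        }
        return Math.exp(logSum / scores.length);
    }

    private static void requireNonEmpty(double[] scores) {
        if (scores == null || scores.length == 0) {
            throw new IllegalArgumentException("at least one score is required");
        }
    }
}
```

For the scores 2 and 8, the arithmetic mean is 5 and the geometric mean is 4, so the choice between the two changes the final ranking.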

Some guru (not me) will change the JavaScript functions to better match the
business rules of the moment.

Yesterday I did more or less all of this work. Only the JavaScript part is
missing; I need to do it today.
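
A minimal sketch of how the JavaScript step might look, using the standard `javax.script` API with one engine per partition as suggested further down the thread. The class name `JsScorer`, the method `scorePartition`, and the convention that the script defines a function called `score` are all assumptions made for illustration. Whether `getEngineByName("javascript")` finds an engine depends on the JDK: older JDKs bundle Rhino or Nashorn, while newer ones need an engine added to the classpath.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper: scores all values of one partition with a single engine.
class JsScorer {

    // Assumption: `script` defines a JavaScript function named `score`
    // that takes one number and returns a number.
    static List<Double> scorePartition(Iterator<Double> values, String script) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("javascript");
        if (engine == null) {
            throw new IllegalStateException("no JavaScript engine available on this JVM");
        }
        try {
            // Evaluate the script once so its functions are defined in the engine.
            engine.eval(script);
            Invocable invocable = (Invocable) engine;

            List<Double> scores = new ArrayList<>();
            while (values.hasNext()) {
                Object result = invocable.invokeFunction("score", values.next());
                scores.add(((Number) result).doubleValue());
            }
            return scores;
        } catch (ScriptException | NoSuchMethodException e) {
            throw new IllegalStateException("failed to run the scoring script", e);
        }
    }
}
```

In a Spark job, a method like this would be called from inside `mapPartitions`, so the engine is created once per partition instead of once per record, with the script text shipped to the workers as a plain string.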

Best Regards
Marcos Rebelo


On Wed, May 27, 2015 at 12:44 AM, ayan guha <guha.a...@gmail.com> wrote:

> Yes, you are on the right mailing list, for sure :)
>
> Regarding your question, I am sure you are well versed in how Spark
> works. Essentially, you can run any arbitrary function in a map call and it
> will run on the remote nodes. Hence you need to install any needed dependency
> on all nodes. You can also pass any additional custom code through jar
> files, which get shipped to the cluster when your function is run.
>
> This is of course a general idea. In your case, if you can kindly show
> what you are doing and any errors, then the experts here will definitely help.
>
> Best
> Ayan
> On 27 May 2015 05:08, "andy petrella" <andy.petre...@gmail.com> wrote:
>
>> Yop, why not use a JS engine like Rhino, as you said? But then I would
>> suggest using mapPartitions instead, so there is only one engine per partition.
>> Broadcasting the script is probably also a good thing to do.
>>
>> I guess it's for ad hoc transformations passed by a remote client;
>> otherwise you could simply convert the JS into Scala, right?
>>
>> HTH
>> Andy
>>
>> On Tue, 26 May 2015 21:03, marcos rebelo <ole...@gmail.com> wrote:
>>
>>> Hi all
>>>
>>> Let me be clear, I'm speaking of Spark (big data, map/reduce, Hadoop,
>>> ... related). I have multiple map/flatMap/groupBy steps, and one of them
>>> needs to be a map that passes each item to some JavaScript code.
>>>
>>> 2 questions:
>>>  - Is this question appropriate for this list?
>>>  - Has anyone done something similar?
>>>
>>> Best Regards
>>> Marcos Rebelo
>>>
>>>
>>>
>>> On Tue, May 26, 2015 at 8:03 PM, Marcelo Vanzin <van...@cloudera.com>
>>> wrote:
>>>
>>>> Is it just me or does that look completely unrelated to
>>>> Spark-the-Apache-project?
>>>>
>>>> On Tue, May 26, 2015 at 10:55 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>
>>>>> Have you looked at https://github.com/spark/sparkjs ?
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Tue, May 26, 2015 at 10:17 AM, marcos rebelo <ole...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> My first message on this mailing list:
>>>>>>
>>>>>> I need to run JavaScript on Spark. Somehow I would like to use the
>>>>>> ScriptEngineManager or any other way that makes Rhino do the work for me.
>>>>>>
>>>>>> Consider that I have a structure that needs to be changed by a
>>>>>> JavaScript function. I will have a set of JavaScript functions and,
>>>>>> depending on the structure, I will do some calculation.
>>>>>>
>>>>>> Has anyone made this work who can give me a simple snippet that works?
>>>>>>
>>>>>> Thanks for any support
>>>>>>
>>>>>> Best Regards
>>>>>> Marcos Rebelo
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Marcelo
>>>>
>>>
>>>
