Hi Ayan,

I'm not an expert on Spark or on the use of dynamic languages on the JVM. I started a proof of concept for a project yesterday. The idea is:

- to get some datasets in CSV format (I still need to check what is the best way to parse a CSV in Spark. Any suggestion?)
- to work on this data, creating a collection of objects (case class or a Map)
- to pass each object to a set of JavaScript functions, each of which will calculate a score
- to merge all the generated scores via an arithmetic mean, a geometric mean, or ...
- to create a new CSV
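The pure pieces of that pipeline can be sketched in plain Scala. The record fields and the CSV layout below are made up for the example, and the line parser is deliberately naive (no quoting or escaping); the Spark wiring (sc.textFile, map, saveAsTextFile) is omitted:

```scala
// Hypothetical record type: a name plus two numeric features.
case class Record(name: String, feature1: Double, feature2: Double)

// Naive CSV-line parser -- enough for a proof of concept; a real job
// would use a proper CSV library (or the spark-csv package).
def parseLine(line: String): Record = {
  val cols = line.split(",").map(_.trim)
  Record(cols(0), cols(1).toDouble, cols(2).toDouble)
}

// Merging the scores produced by the JavaScript functions.
def arithmeticMean(scores: Seq[Double]): Double =
  scores.sum / scores.size

def geometricMean(scores: Seq[Double]): Double =
  math.pow(scores.product, 1.0 / scores.size)
```

Keeping the parsing and the score merging as plain functions like this also makes them easy to unit-test outside the cluster.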
Some guru (not me) will change the JavaScript files to better match the business rules at the time. Yesterday I more or less did all this work; only the JavaScript part is missing, and I need to do it today.

Best Regards
Marcos Rebelo

On Wed, May 27, 2015 at 12:44 AM, ayan guha <guha.a...@gmail.com> wrote:

> Yes, you are in the right mailing list, for sure :)
>
> Regarding your question, I am sure you are well versed in how Spark
> works. Essentially, you can run any arbitrary function with a map call,
> and it will run on remote nodes. Hence you need to install any needed
> dependency on all nodes. You can also pass on any additional custom code
> through jar files, which get shipped to the cluster when your function is run.
>
> This is of course a general idea. In your case, if you can kindly show
> what you are doing and any errors, then the experts here will definitely help.
>
> Best
> Ayan
>
> On 27 May 2015 05:08, "andy petrella" <andy.petre...@gmail.com> wrote:
>
>> Yop, why not use, like you said, a JS engine like Rhino? But then I would
>> suggest using mapPartitions instead, so there is only one engine per
>> partition. Broadcasting the script is probably also a good thing to do.
>>
>> I guess it's for ad hoc transformations passed by a remote client;
>> otherwise you could simply convert the JS into Scala, right?
>>
>> HTH
>> Andy
>>
>> On Tue, May 26, 2015 at 21:03, marcos rebelo <ole...@gmail.com> wrote:
>>
>>> Hi all
>>>
>>> Let me be clear: I'm speaking of Spark (big data, map/reduce, Hadoop,
>>> ... related). I have multiple map/flatMap/groupBy steps, and one of the
>>> steps needs to be a map passing each item through some JavaScript code.
>>>
>>> 2 Questions:
>>> - Is this question related to this list?
>>> - Did someone do something similar?
>>>
>>> Best Regards
>>> Marcos Rebelo
>>>
>>> On Tue, May 26, 2015 at 8:03 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>>>
>>>> Is it just me, or does that look completely unrelated to
>>>> Spark-the-Apache-project?
>>>>
>>>> On Tue, May 26, 2015 at 10:55 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>
>>>>> Have you looked at https://github.com/spark/sparkjs ?
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Tue, May 26, 2015 at 10:17 AM, marcos rebelo <ole...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> My first message on this mailing list:
>>>>>>
>>>>>> I need to run JavaScript on Spark. Somehow I would like to use the
>>>>>> ScriptEngineManager, or any other way that makes Rhino do the work for me.
>>>>>>
>>>>>> Consider that I have a structure that needs to be changed by a
>>>>>> JavaScript function. I will have a set of JavaScript functions and,
>>>>>> depending on the structure, I will do some calculation.
>>>>>>
>>>>>> Did someone make this work, and can you give me a simple snippet that works?
>>>>>>
>>>>>> Thanks for any support
>>>>>>
>>>>>> Best Regards
>>>>>> Marcos Rebelo
>>>>
>>>> --
>>>> Marcelo
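A minimal sketch of the ScriptEngineManager approach combined with Andy's mapPartitions suggestion (one engine per partition). The function name `score` and the doubling logic are made up for the example; on Java 6/7 the bundled engine is Rhino, on Java 8 it is Nashorn, and `getEngineByName` returns null if no JavaScript engine ships with the JDK:

```scala
import javax.script.{Invocable, ScriptEngineManager}

// The script text would normally be shipped to the cluster with
// sc.broadcast(...); here it is just a String. "score" is a
// hypothetical JS function operating on one record's value.
val scriptSrc = "function score(x) { return x * 2; }"

// The function to hand to rdd.mapPartitions: one engine per partition,
// the script evaluated once, then the JS function invoked per record.
def scorePartition(rows: Iterator[Double]): Iterator[Double] = {
  val engine = new ScriptEngineManager().getEngineByName("JavaScript")
  engine.eval(scriptSrc)
  val inv = engine.asInstanceOf[Invocable]
  rows.map { x =>
    inv.invokeFunction("score", Double.box(x))
      .asInstanceOf[Number].doubleValue
  }
}

// With Spark this becomes: val scoredRdd = rdd.mapPartitions(scorePartition)
```

Note that `Invocable` is only implemented by engines that support it (Rhino and Nashorn both do), and the concrete numeric type of the returned value differs between engines, hence the cast through `Number`.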