Re: Broadcast variables in R
Thank you very much, Shivaram.

I've got it working on Mac now by specifying the namespace: using SparkR:::parallelize() instead of just parallelize().

Wkr,
Serge

On 21 Jul 2015, at 17:20, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote:

There shouldn't be anything Mac OS specific about this feature. One point of warning though -- as mentioned previously in this thread, the APIs were made private because we aren't sure we will be supporting them in the future. If you are using these APIs, it would be good to chime in on the JIRA with your use-case.

Thanks,
Shivaram

On Tue, Jul 21, 2015 at 2:34 AM, Serge Franchois serge.franch...@altran.com wrote:

I might add to this that I've done the same exercise on Linux (CentOS 6) and there, broadcast variables ARE working. Is this functionality perhaps not exposed on Mac OS X? Or has it to do with the fact there are no native Hadoop libs for Mac?

View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Broadcast-variables-in-R-tp23914p23927.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
Re: Broadcast variables in R
There shouldn't be anything Mac OS specific about this feature. One point of warning though -- as mentioned previously in this thread, the APIs were made private because we aren't sure we will be supporting them in the future. If you are using these APIs, it would be good to chime in on the JIRA with your use-case.

Thanks,
Shivaram
Re: Broadcast variables in R
I might add to this that I've done the same exercise on Linux (CentOS 6) and there, broadcast variables ARE working. Is this functionality perhaps not exposed on Mac OS X? Or has it to do with the fact there are no native Hadoop libs for Mac?
Re: Broadcast variables in R
Hi Serge,

The broadcast function was made private when SparkR merged into Apache Spark for the 1.4.0 release. You can still use broadcast by specifying the private namespace, though:

SparkR:::broadcast(sc, obj)

The RDD methods were considered very low-level, and the SparkR devs are still figuring out which of them they'd like to expose along with the higher-level DataFrame API. You can see the rationale for the decision on the project JIRA [1].

[1] https://issues.apache.org/jira/browse/SPARK-7230

Hope that helps,
Alek

On 7/20/15, 12:00 PM, Serge Franchois serge.franch...@altran.com wrote:

I've searched high and low to use broadcast variables in R. Is it possible at all? I don't see them mentioned in the SparkR API. Or is there another way of using this feature? I need to share a large amount of data between executors. At the moment, I get warned about my task being too large. I have tried pyspark, and there I can use them.

Wkr,
Serge

View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Broadcast-variables-in-R-tp23915.html
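For anyone landing on this thread later, the private-namespace approach described above might look roughly like the sketch below. It assumes SparkR 1.4.x with a local Spark installation, and it relies on the private RDD API (`SparkR:::broadcast`, `SparkR:::parallelize`, `SparkR:::lapply`, `SparkR:::value`, `SparkR:::collect`), which, as noted in the thread, may change or disappear in later releases. The `lookup` vector, the `local[2]` master, and the app name are hypothetical, chosen only for illustration.

```r
# Sketch only: assumes SparkR 1.4.x and a working local Spark installation.
library(SparkR)

# Hypothetical local context; adjust master/appName for a real cluster.
sc <- sparkR.init(master = "local[2]", appName = "broadcast-sketch")

# Hypothetical lookup table standing in for the "large amount of data".
lookup <- c(a = 1, b = 2, c = 3)

# Private API: register the object as a broadcast variable.
lookupBc <- SparkR:::broadcast(sc, lookup)

# Private API: build an RDD from a local collection, split into 2 slices.
keys <- SparkR:::parallelize(sc, list("a", "b", "c", "a"), 2L)

# Read the broadcast value inside the closure instead of capturing `lookup`
# directly, so the data is not shipped along with every task.
mapped <- SparkR:::lapply(keys, function(k) {
  SparkR:::value(lookupBc)[[k]]
})

# Bring the results back to the driver and print them.
print(SparkR:::collect(mapped))

sparkR.stop()
```

Because these functions are not exported, every call needs the `SparkR:::` prefix, including the call to `value` inside the worker closure.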