Re: Can I add a new method to RDD class?

2016-12-07 Thread Teng Long
Thank you so much, Mark and everyone else who responded, for putting up with my 
ignorance.

> On Dec 7, 2016, at 2:50 PM, Mark Hamstra [via Apache Spark Developers List] 
> <ml-node+s1001551n20164...@n3.nabble.com> wrote:
> 
> The easiest way is probably with:
> 
> mvn versions:set -DnewVersion=your_new_version

Re: Can I add a new method to RDD class?

2016-12-07 Thread Mark Hamstra
The easiest way is probably with:

mvn versions:set -DnewVersion=your_new_version
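For anyone finding this thread later, the whole round trip might look like this (a command sketch; the version string is an assumption and the flags are illustrative):

```shell
# From the root of the Spark checkout: rewrite the <version> property in the
# root pom.xml and in every module's pom.xml in one shot
mvn versions:set -DnewVersion=2.0.2-CUSTOM

# Build and install the custom artifacts into the local ~/.m2 repository
mvn -DskipTests clean install
```

The application can then depend on 2.0.2-CUSTOM without colliding with the artifacts cached from the public repository.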

On Wed, Dec 7, 2016 at 11:31 AM, Teng Long <longteng...@gmail.com> wrote:

> Hi Holden,
>
> Can you please tell me how to edit version numbers efficiently, the
> correct way? I'm really struggling with this and don't know where to look.
>
> Thanks,
> Teng


Re: Can I add a new method to RDD class?

2016-12-06 Thread Teng Long
Hi Jakob, 

It seems like I’ll have to either replace the version with my custom version in 
all the pom.xml files in every subdirectory that has one and publish locally, 
or keep the version (i.e. 2.0.2), manually remove the spark repository cache 
in ~/.ivy2 and ~/.m2, publish spark locally, and then compile my application 
against that version to make it work. I think there has to be a more elegant 
way to do this. 


Re: Can I add a new method to RDD class?

2016-12-06 Thread Jakob Odersky
Yes, I think changing the <version> property (line 29) in spark's root
pom.xml should be sufficient. However, keep in mind that you'll also
need to publish spark locally before you can access it in your test
application.


-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Can I add a new method to RDD class?

2016-12-06 Thread Teng Long
Thank you Jakob for clearing things up for me. 

Before, I thought my application was compiled against my local build since I 
can get all the logs I just added in spark-core. But it was all along using 
spark downloaded from remote maven repository, and that’s why I “cannot” add 
new RDD methods in. 

How can I specify a custom version? Do I modify the version numbers in all the 
pom.xml files?

 
> On Dec 5, 2016, at 9:12 PM, Jakob Odersky wrote:
> 
> … define your custom rdds in an "org.apache.spark" package as well



Re: Can I add a new method to RDD class?

2016-12-05 Thread Jakob Odersky
It looks like you're having issues with including your custom spark
version (with the extensions) in your test project. To use your local
spark version:
1) make sure it has a custom version (let's call it 2.1.0-CUSTOM)
2) publish it to your local machine with `sbt publishLocal`
3) include the modified version of spark in your test project with
`libraryDependencies += "org.apache.spark" %% "spark-core" %
"2.1.0-CUSTOM"`
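In concrete terms, step 3 might look like this in the test project's build definition (a sketch; the Scala version and the custom version string are assumptions):

```scala
// build.sbt of the test application (sketch; versions are assumed)
name := "spark-extension-test"
scalaVersion := "2.11.8"

// Resolved from the local Ivy repository populated by `sbt publishLocal`
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0-CUSTOM"
```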

However, as others have said, it can be quite a lot of work to
maintain a custom fork of spark. If you're planning on contributing
these changes back to spark, then forking is the way to go (although I
would recommend keeping an ongoing discussion with the maintainers, to
make sure your work will be merged back). Otherwise, I would recommend
using "implicit extensions" to enrich your rdds instead. An easy
workaround to access spark-private fields is to simply define your
custom rdds in an "org.apache.spark" package as well ;)
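As a minimal sketch of that implicit-extension approach (the object and method names here are made up for illustration; this assumes the project already depends on spark-core):

```scala
import org.apache.spark.rdd.RDD

object RDDSyntax {
  // Enriches every RDD[T] with an extra transformation, without touching
  // Spark's own sources
  implicit class RichRDD[T](val rdd: RDD[T]) extends AnyVal {
    def pairs: RDD[(T, T)] = rdd.map(x => (x, x))
  }
}

// Use site: bring the implicit class into scope, then call the method as if
// it were defined on RDD itself
// import RDDSyntax._
// val p = sc.parallelize(1 to 3).pairs   // RDD[(Int, Int)]
```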




Re: Can I add a new method to RDD class?

2016-12-05 Thread Teng Long
Tarun,

I want to access some private methods, e.g. withScope, so I added a similar 
implicit class and compiled it with spark. But I can’t import it into my 
application. 

For example, I added org/apache/spark/rdd/RDDExtensions.scala; in it I defined 
an implicit class inside the RDDExtensions object, and I successfully compiled 
spark with it.

Then in my application code, when I try to import that implicit class using

import org.apache.spark.rdd.RDDExtensions._

my application doesn’t compile, and the error says "object RDDExtensions is not 
a member of package org.apache.spark.rdd". It seems like my import statement is 
wrong, but I don’t know why.

Thanks!
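For reference, the package trick described here might look like the following (a hypothetical sketch; `fooWithScope` is an invented name, and withScope's exact signature may differ between Spark versions):

```scala
// File added to the Spark source tree, e.g.
// core/src/main/scala/org/apache/spark/rdd/RDDExtensions.scala
package org.apache.spark.rdd

object RDDExtensions {
  implicit class ScopedRDD[T](rdd: RDD[T]) {
    // withScope is private[spark]; this compiles only because the class
    // lives inside the org.apache.spark.rdd package
    def fooWithScope(): RDD[T] = rdd.withScope { rdd }
  }
}
```

If `import org.apache.spark.rdd.RDDExtensions._` fails with "not a member of package", the application is almost certainly resolving the stock spark-core artifact rather than the locally published custom build.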


Re: Can I add a new method to RDD class?

2016-12-05 Thread Teng Long
I’m trying to implement a transformation that can merge partitions (to align 
with GPU specs) and move them onto GPU memory, for example rdd.toGPU() and 
later transformations like map can automatically be performed on GPU. And 
another transformation rdd.offGPU() to move partitions off GPU memory and 
repartition them to the way they were on CPU before.

Thank you, Tarun, for creating that gist. I’ll look at it and see if it meets 
my needs.
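A hypothetical sketch of what that API surface could look like as an enrichment, using plain partition operations as stand-ins for the actual device transfers (toGPU/offGPU here do no real GPU work; all names are invented):

```scala
import org.apache.spark.rdd.RDD

object GpuSyntax {
  implicit class GpuRDD[T](rdd: RDD[T]) {
    // Stand-in for "merge partitions to align with GPU specs": shrink the
    // partition count to the number of devices
    def toGPU(numDevices: Int): RDD[T] = rdd.coalesce(numDevices)

    // Stand-in for moving data back off the device and restoring the
    // original partitioning
    def offGPU(originalPartitions: Int): RDD[T] = rdd.repartition(originalPartitions)
  }
}
```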


Re: Can I add a new method to RDD class?

2016-12-05 Thread Tarun Kumar
Teng,

Can you please share the details of transformation that you want to
implement in your method foo?

I have created a gist of one dummy transformation for your method foo;
this foo method transforms an RDD[T] to an RDD[(T,T)]. Many more such
transformations can easily be achieved.

https://gist.github.com/fidato13/3b46fe1c96b37ae0dd80c275fbe90e92

Thanks
Tarun Kumar


Re: Can I add a new method to RDD class?

2016-12-05 Thread Thakrar, Jayesh
Teng,

Before you go down creating your own custom Spark system, do give some thought 
to what Holden and others are suggesting, viz. using implicit methods.

If you want real concrete examples, have a look at the Spark Cassandra 
Connector -

Here you will see an example of "extending" SparkContext - 
https://github.com/datastax/spark-cassandra-connector/blob/master/doc/2_loading.md

// validation is deferred, so it is not triggered during rdd creation
val rdd = sc.cassandraTable[SomeType]("ks", "not_existing_table")
val emptyRDD = rdd.toEmptyCassandraRDD

val emptyRDD2 = sc.emptyCassandraTable[SomeType]("ks", "not_existing_table"))


And here you will se an example of "extending" RDD - 
https://github.com/datastax/spark-cassandra-connector/blob/master/doc/5_saving.md

case class WordCount(word: String, count: Long)
val collection = sc.parallelize(Seq(WordCount("dog", 50), WordCount("cow", 60)))
collection.saveToCassandra("test", "words", SomeColumns("word", "count"))

Hope that helps…
Jayesh


From: Teng Long <longteng...@gmail.com>
Date: Monday, December 5, 2016 at 3:04 PM
To: Holden Karau <hol...@pigscanfly.ca>, <dev@spark.apache.org>
Subject: Re: Can I add a new method to RDD class?

Thank you for providing another answer, Holden.

So I did what Tarun and Michal suggested, and it didn’t work out as I want to 
have a new transformation method in RDD class, and need to use that RDD’s spark 
context which is private. So I guess the only thing I can do now is to sbt 
publishLocal?

On Dec 5, 2016, at 9:19 AM, Holden Karau <hol...@pigscanfly.ca> wrote:

Doing that requires publishing a custom version of Spark; you can edit the 
version number and do a publishLocal - but maintaining that change is going to 
be difficult. The other approaches suggested are probably better, but also, does 
your method need to be defined on the RDD class? Could you instead make a 
helper object or class to expose whatever functionality you need?

On Mon, Dec 5, 2016 at 6:06 PM long <longteng...@gmail.com> wrote:
Thank you very much! But why can’t I just add new methods in to the source code 
of RDD?

On Dec 5, 2016, at 3:15 AM, Michal Šenkýř [via Apache Spark Developers List] 
<[hidden email]> wrote:


A simple Scala example of implicit classes:

implicit class EnhancedString(str: String) {
  def prefix(prefix: String) = prefix + str
}

println("World".prefix("Hello "))

As Tarun said, you have to import it if it's not in the same class where you 
use it.

Hope this makes it clearer,

Michal Senkyr

On 5.12.2016 07:43, Tarun Kumar wrote:
Not sure if that's documented in terms of Spark but this is a fairly common 
pattern in scala known as "pimp my library" pattern, you can easily find many 
generic example of using this pattern. If you want I can quickly cook up a 
short conplete example with rdd(although there is nothing really more to my 
example in earlier mail) ? Thanks Tarun Kumar

On Mon, 5 Dec 2016 at 7:15 AM, long <[hidden email]> wrote:
So is there documentation of this I can refer to?

On Dec 5, 2016, at 1:07 AM, Tarun Kumar [via Apache Spark Developers List] 
<[hidden email]> wrote:

Hi Tenglong,

In addition to trsell's reply, you can add any method to an RDD without making 
changes to spark code. This can be achieved by using an implicit class in your 
own client code:

implicit class extendRDD[T](rdd: RDD[T]){
  def foo()
}

Then you basically need to import this implicit class in the scope where you 
want to use the new foo method.

Thanks
Tarun Kumar

On Mon, 5 Dec 2016 at 6:59 AM, <[hidden email]> wrote:

How does your application fetch the spark dependency? Perhaps list your project 
dependencies and check it's using your dev build.

On Mon, 5 Dec 2016, 08:47 tenglong, <[hidden email]> wrote:
Hi,

Apparently, I've already tried adding a new method to RDD,

for example,

class RDD {
  def foo() // this is the one I added

  def map()

  def collect()
}

I can build Spark successfully, but I can't compile my application code
which calls rdd.foo(), and the error message says

value foo is not a member of org.apache.spark.rdd.RDD[String]

So I am wondering if there is any mechanism that prevents me from doing this,
or if there is something I'm doing wrong?
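A quick way to check which spark-core the application is actually resolving (a command sketch; the artifact paths assume Scala 2.11 builds):

```shell
# In the application project: show the declared dependencies
sbt 'show libraryDependencies'

# Inspect the local caches that `sbt publishLocal` / `mvn install` would populate
ls ~/.ivy2/local/org.apache.spark/spark-core_2.11/
ls ~/.m2/repository/org/apache/spark/spark-core_2.11/
```

If the custom version is missing from both locations, the build is falling back to the artifact downloaded from the public repository.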





Re: Can I add a new method to RDD class?

2016-12-05 Thread Teng Long
Thank you, Ryan. Didn’t know there is a method for that!

> On Dec 5, 2016, at 4:10 PM, Shixiong(Ryan) Zhu <shixi...@databricks.com> 
> wrote:
> 
> RDD.sparkContext is public: 
> http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.RDD@sparkContext:org.apache.spark.SparkContext
> 
> On Mon, Dec 5, 2016 at 1:04 PM, Teng Long <longteng...@gmail.com 
> <mailto:longteng...@gmail.com>> wrote:
> Thank you for providing another answer, Holden.
> 
> So I did what Tarun and Michal suggested, and it didn’t work out as I want to 
> have a new transformation method in RDD class, and need to use that RDD’s 
> spark context which is private. So I guess the only thing I can do now is to 
> sbt publishLocal?
> 
>> On Dec 5, 2016, at 9:19 AM, Holden Karau <hol...@pigscanfly.ca 
>> <mailto:hol...@pigscanfly.ca>> wrote:
>> 
>> Doing that requires publishing a custom version of Spark, you can edit the 
>> version number do do a publishLocal - but maintaining that change is going 
>> to be difficult. The other approaches suggested are probably better, but 
>> also does your method need to be defined on the RDD class? Could you instead 
>> make a helper object or class to expose whatever functionality you need?
>> 
>> On Mon, Dec 5, 2016 at 6:06 PM long <longteng...@gmail.com 
>> <mailto:longteng...@gmail.com>> wrote:
>> Thank you very much! But why can’t I just add new methods in to the source 
>> code of RDD?
>> 
>> 
>>> On Dec 5, 2016, at 3:15 AM, Michal Šenkýř [via Apache Spark Developers 
>>> List] <[hidden email] <http://user/SendEmail.jtp?type=node=20107=0>> 
>>> wrote:
>>> 
>> 
>>> A simple Scala example of implicit classes:
>>> 
>>> implicit class EnhancedString(str: String) {
>>>   def prefix(prefix: String) = prefix + str
>>> }
>>> 
>>> println("World".prefix("Hello "))
>>> As Tarun said, you have to import it if it's not in the same class where 
>>> you use it.
>>> 
>>> Hope this makes it clearer,
>>> 
>>> Michal Senkyr
>>> 
>>> 
>>> On 5.12.2016 07:43, Tarun Kumar wrote:
>> 
>>>> Not sure if that's documented in terms of Spark, but this is a fairly 
>>>> common pattern in Scala known as the "pimp my library" pattern; you can 
>>>> easily find many generic examples of it.
>>>> 
>>>> If you want, I can quickly cook up a short complete example with an RDD 
>>>> (although there is nothing really more to it than my example in the earlier mail)?
>>>> 
>>>> Thanks 
>>>> Tarun Kumar
>>>> 
>> 
>>>> On Mon, 5 Dec 2016 at 7:15 AM, long <[hidden email]> wrote:
>> 
>>>> So is there documentation of this I can refer to? 
>>>> 
>>>>> On Dec 5, 2016, at 1:07 AM, Tarun Kumar [via Apache Spark Developers 
>>>>> List] <[hidden email] 
>>>>> <http://user/SendEmail.jtp?type=node=20104=0>> wrote:
>>>>> 
>>>> 
>>>>> Hi Tenglong,
>>>>> 
>>>>> In addition to trsell's reply, you can add any method to an rdd without 
>>>>> making changes to spark code.
>>>>> 
>>>>> This can be achieved by using implicit class in your own client code:
>>>>> 
>>>>> implicit class extendRDD[T](rdd: RDD[T]){
>>>>> 
>>>>>  def foo()
>>>>> 
>>>>> }
>>>>> 
>>>>> Then you basically need to import this implicit class in the scope where 
>>>>> you want to use the new foo method.
>>>>> 
>>>>> Thanks
>>>>> Tarun Kumar 
>>>>> 
>> 
>>>>> On Mon, 5 Dec 2016 at 6:59 AM, <[hidden email]> wrote:
>>>> 
>>>>> How does your application fetch the spark dependency? Perhaps list your 
>>>>> project dependencies and check it's using your dev build.

Re: Can I add a new method to RDD class?

2016-12-05 Thread Teng Long
Thank you for providing another answer, Holden.

So I did what Tarun and Michal suggested, and it didn’t work out, because I 
want to add a new transformation method to the RDD class and need to use that 
RDD’s spark context, which is private. So I guess the only thing I can do now 
is sbt publishLocal?

> On Dec 5, 2016, at 9:19 AM, Holden Karau <hol...@pigscanfly.ca> wrote:
> 
> Doing that requires publishing a custom version of Spark, you can edit the 
> version number and do a publishLocal - but maintaining that change is going to 
> be difficult. The other approaches suggested are probably better, but also 
> does your method need to be defined on the RDD class? Could you instead make 
> a helper object or class to expose whatever functionality you need?
> 
> On Mon, Dec 5, 2016 at 6:06 PM long <longteng...@gmail.com 
> <mailto:longteng...@gmail.com>> wrote:
> Thank you very much! But why can’t I just add new methods into the source 
> code of RDD?
> 
> 
>> On Dec 5, 2016, at 3:15 AM, Michal Šenkýř [via Apache Spark Developers List] 
>> <[hidden email] <http://user/SendEmail.jtp?type=node=20107=0>> wrote:
>> 
> 
>> A simple Scala example of implicit classes:
>> 
>> implicit class EnhancedString(str: String) {
>>   def prefix(prefix: String) = prefix + str
>> }
>> 
>> println("World".prefix("Hello "))
>> As Tarun said, you have to import it if it's not in the same class where you 
>> use it.
>> 
>> Hope this makes it clearer,
>> 
>> Michal Senkyr
>> 
>> 
>> On 5.12.2016 07:43, Tarun Kumar wrote:
> 
>>> Not sure if that's documented in terms of Spark, but this is a fairly 
>>> common pattern in Scala known as the "pimp my library" pattern; you can 
>>> easily find many generic examples of it.
>>> 
>>> If you want, I can quickly cook up a short complete example with an RDD 
>>> (although there is nothing really more to it than my example in the earlier mail)?
>>> 
>>> Thanks 
>>> Tarun Kumar
>>> 
> 
>>> On Mon, 5 Dec 2016 at 7:15 AM, long <[hidden email]> wrote:
> 
>>> So is there documentation of this I can refer to? 
>>> 
>>>> On Dec 5, 2016, at 1:07 AM, Tarun Kumar [via Apache Spark Developers List] 
>>>> <[hidden email] <http://user/SendEmail.jtp?type=node=20104=0>> 
>>>> wrote:
>>>> 
>>> 
>>>> Hi Tenglong,
>>>> 
>>>> In addition to trsell's reply, you can add any method to an rdd without 
>>>> making changes to spark code.
>>>> 
>>>> This can be achieved by using implicit class in your own client code:
>>>> 
>>>> implicit class extendRDD[T](rdd: RDD[T]){
>>>> 
>>>>  def foo()
>>>> 
>>>> }
>>>> 
>>>> Then you basically need to import this implicit class in the scope where 
>>>> you want to use the new foo method.
>>>> 
>>>> Thanks
>>>> Tarun Kumar 
>>>> 
> 
>>>> On Mon, 5 Dec 2016 at 6:59 AM, <[hidden email]> wrote:
>>> 
>>>> How does your application fetch the spark dependency? Perhaps list your 
>>>> project dependencies and check it's using your dev build.
>>>> 
>>>> 
> 
>>>> On Mon, 5 Dec 2016, 08:47 tenglong, <[hidden email]> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> Apparently, I've already tried adding a new method to RDD,
>>>> 
>>>> for example,
>>>> 
>>>> class RDD {
>>>>   def foo() // this is the one I added
>>>> 
>>>>   def map()
>>>> 
>>>>   def collect()
>>>> }
>>>> 
>>>> I can build Spark successfully, but I can't compile my application code
>>>> which calls rdd.foo(), and the error message says
>>>> 
>>>> value foo is not a member of org.apache.spark.rdd.RDD[String]

Re: Can I add a new method to RDD class?

2016-12-05 Thread Shixiong(Ryan) Zhu
RDD.sparkContext is public:
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.RDD@sparkContext:org.apache.spark.SparkContext
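
Since RDD.sparkContext is public, an implicit class can use it to build new
RDDs from inside an added method, without touching Spark's source. Below is a
minimal, self-contained sketch of that pattern. The tiny SparkContext/RDD
stand-ins and the foo() transformation are hypothetical placeholders so the
snippet runs without a Spark dependency; in a real project the implicit class
would wrap org.apache.spark.rdd.RDD instead.

```scala
// Stand-in types so the pattern is runnable without Spark on the classpath;
// in real code these would be org.apache.spark.SparkContext and
// org.apache.spark.rdd.RDD. (Script/Scala 3 style; wrap the definitions in
// an object for compiled Scala 2.)
class SparkContext {
  def parallelize[T](xs: Seq[T]): RDD[T] = new RDD(xs, this)
}

class RDD[T](private val data: Seq[T], val sparkContext: SparkContext) {
  def collect(): Seq[T] = data
}

// "Pimp my library": foo() becomes available on every RDD wherever this
// implicit class is in scope, and it can reach the RDD's public
// sparkContext to build a new RDD.
implicit class ExtendedRDD[T](rdd: RDD[T]) {
  def foo(): RDD[T] = rdd.sparkContext.parallelize(rdd.collect().reverse)
}

val sc  = new SparkContext
val rdd = sc.parallelize(Seq(1, 2, 3))
println(rdd.foo().collect())  // List(3, 2, 1)
```

The same shape carries over to real Spark: an implicit class wrapping
org.apache.spark.rdd.RDD can call rdd.sparkContext in its methods once it is
imported at the use site.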

On Mon, Dec 5, 2016 at 1:04 PM, Teng Long <longteng...@gmail.com> wrote:

> Thank you for providing another answer, Holden.
>
> So I did what Tarun and Michal suggested, and it didn’t work out, because I
> want to add a new transformation method to the RDD class and need to use
> that RDD’s spark context, which is private. So I guess the only thing I can
> do now is sbt publishLocal?
>
> On Dec 5, 2016, at 9:19 AM, Holden Karau <hol...@pigscanfly.ca> wrote:
>
> Doing that requires publishing a custom version of Spark, you can edit the
> version number and do a publishLocal - but maintaining that change is going
> to be difficult. The other approaches suggested are probably better, but
> also does your method need to be defined on the RDD class? Could you
> instead make a helper object or class to expose whatever functionality you
> need?
>
> On Mon, Dec 5, 2016 at 6:06 PM long <longteng...@gmail.com> wrote:
>
>> Thank you very much! But why can’t I just add new methods into the
>> source code of RDD?
>>
>> On Dec 5, 2016, at 3:15 AM, Michal Šenkýř [via Apache Spark Developers
>> List] <[hidden email]
>> <http://user/SendEmail.jtp?type=node=20107=0>> wrote:
>>
>> A simple Scala example of implicit classes:
>>
>> implicit class EnhancedString(str: String) {
>>   def prefix(prefix: String) = prefix + str
>> }
>>
>> println("World".prefix("Hello "))
>>
>> As Tarun said, you have to import it if it's not in the same class where
>> you use it.
>>
>> Hope this makes it clearer,
>>
>> Michal Senkyr
>>
>> On 5.12.2016 07:43, Tarun Kumar wrote:
>>
>> Not sure if that's documented in terms of Spark, but this is a fairly
>> common pattern in Scala known as the "pimp my library" pattern; you can
>> easily find many generic examples of it.
>>
>> If you want, I can quickly cook up a short complete example with an RDD
>> (although there is nothing really more to it than my example in the
>> earlier mail)?
>>
>> Thanks
>> Tarun Kumar
>>
>> On Mon, 5 Dec 2016 at 7:15 AM, long <[hidden email]> wrote:
>>
>> So is there documentation of this I can refer to?
>>
>> On Dec 5, 2016, at 1:07 AM, Tarun Kumar [via Apache Spark Developers
>> List] <[hidden email]
>> <http://user/SendEmail.jtp?type=node=20104=0>> wrote:
>>
>> Hi Tenglong,
>>
>> In addition to trsell's reply, you can add any method to an RDD without
>> making changes to Spark code. This can be achieved by using an implicit
>> class in your own client code:
>>
>> implicit class extendRDD[T](rdd: RDD[T]) { def foo() }
>>
>> Then you basically need to import this implicit class in the scope where
>> you want to use the new foo method.
>>
>> Thanks
>> Tarun Kumar
>>
>> On Mon, 5 Dec 2016 at 6:59 AM, <[hidden email]> wrote:
>>
>> How does your application fetch the spark dependency? Perhaps list your
>> project dependencies and check it's using your dev build.
>>
>> On Mon, 5 Dec 2016, 08:47 tenglong, <[hidden email]> wrote:
>>
>> Hi,
>>
>> Apparently, I've already tried adding a new method to RDD,
>>
>> for example,
>>
>> class RDD {
>>   def foo() // this is the one I added
>>
>>   def map()
>>
>>   def collect()
>> }
>>
>> I can build Spark successfully, but I can't compile my application code
>> which calls rdd.foo(), and the error message says
>>
>> value foo is not a member of org.apache.spark.rdd.RDD[String]
>>
>> So I am wondering if there is any mechanism that prevents me from doing
>> this, or if I'm doing something wrong?
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100.html
>> Sent from the Apache Spark Developers List mailing list archive at
>> Nabble.com.

Re: Can I add a new method to RDD class?

2016-12-05 Thread Holden Karau
Doing that requires publishing a custom version of Spark, you can edit the
version number and do a publishLocal - but maintaining that change is going
to be difficult. The other approaches suggested are probably better, but
also does your method need to be defined on the RDD class? Could you
instead make a helper object or class to expose whatever functionality you
need?

On Mon, Dec 5, 2016 at 6:06 PM long <longteng...@gmail.com> wrote:

> Thank you very much! But why can’t I just add new methods into the source
> code of RDD?
>
> On Dec 5, 2016, at 3:15 AM, Michal Šenkýř [via Apache Spark Developers
> List] <[hidden email]
> <http:///user/SendEmail.jtp?type=node=20107=0>> wrote:
>
> A simple Scala example of implicit classes:
>
> implicit class EnhancedString(str: String) {
>   def prefix(prefix: String) = prefix + str
> }
>
> println("World".prefix("Hello "))
>
> As Tarun said, you have to import it if it's not in the same class where
> you use it.
>
> Hope this makes it clearer,
>
> Michal Senkyr
>
> On 5.12.2016 07:43, Tarun Kumar wrote:
>
> Not sure if that's documented in terms of Spark, but this is a fairly
> common pattern in Scala known as the "pimp my library" pattern; you can
> easily find many generic examples of it.
>
> If you want, I can quickly cook up a short complete example with an RDD
> (although there is nothing really more to it than my example in the
> earlier mail)?
>
> Thanks
> Tarun Kumar
>
> On Mon, 5 Dec 2016 at 7:15 AM, long <[hidden email]> wrote:
>
> So is there documentation of this I can refer to?
>
> On Dec 5, 2016, at 1:07 AM, Tarun Kumar [via Apache Spark Developers List]
> <[hidden email] <http://user/SendEmail.jtp?type=node=20104=0>>
> wrote:
>
> Hi Tenglong,
>
> In addition to trsell's reply, you can add any method to an RDD without
> making changes to Spark code. This can be achieved by using an implicit
> class in your own client code:
>
> implicit class extendRDD[T](rdd: RDD[T]) { def foo() }
>
> Then you basically need to import this implicit class in the scope where
> you want to use the new foo method.
>
> Thanks
> Tarun Kumar
>
> On Mon, 5 Dec 2016 at 6:59 AM, <[hidden email]> wrote:
>
> How does your application fetch the spark dependency? Perhaps list your
> project dependencies and check it's using your dev build.
>
> On Mon, 5 Dec 2016, 08:47 tenglong, <[hidden email]> wrote:
>
> Hi,
>
> Apparently, I've already tried adding a new method to RDD,
>
> for example,
>
> class RDD {
>   def foo() // this is the one I added
>
>   def map()
>
>   def collect()
> }
>
> I can build Spark successfully, but I can't compile my application code
> which calls rdd.foo(), and the error message says
>
> value foo is not a member of org.apache.spark.rdd.RDD[String]
>
> So I am wondering if there is any mechanism that prevents me from doing
> this, or if I'm doing something wrong?
>
>
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com <http://nabble.com/>.
>
> ---------
>
> To unsubscribe e-mail: [hidden email]
>
>
>
> --
> If you reply to this email, your message will be added to the discussion
> below:
>
> http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100p20102.html
> To unsubscribe from Can I add a new method to RDD class?, click here.
> NAML
> <http://apache-spark-developers-list.1001551.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer=instant_html%21nabble%3Aemail.naml=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>

Re: Can I add a new method to RDD class?

2016-12-05 Thread long
Thank you very much! But why can’t I just add new methods into the source code 
of RDD?

> On Dec 5, 2016, at 3:15 AM, Michal Šenkýř [via Apache Spark Developers List] 
> <ml-node+s1001551n20106...@n3.nabble.com> wrote:
> 
> A simple Scala example of implicit classes:
> 
> implicit class EnhancedString(str: String) {
>   def prefix(prefix: String) = prefix + str
> }
> 
> println("World".prefix("Hello "))
> As Tarun said, you have to import it if it's not in the same class where you 
> use it.
> 
> Hope this makes it clearer,
> 
> Michal Senkyr
> 
> 
> On 5.12.2016 07:43, Tarun Kumar wrote:
>> Not sure if that's documented in terms of Spark, but this is a fairly common 
>> pattern in Scala known as the "pimp my library" pattern; you can easily find 
>> many generic examples of it.
>> 
>> If you want, I can quickly cook up a short complete example with an RDD 
>> (although there is nothing really more to it than my example in the earlier mail)?
>> 
>> Thanks 
>> Tarun Kumar
>> 
>> On Mon, 5 Dec 2016 at 7:15 AM, long <[hidden email] 
>> > wrote:
>> So is there documentation of this I can refer to? 
>> 
>>> On Dec 5, 2016, at 1:07 AM, Tarun Kumar [via Apache Spark Developers List] 
>>> <[hidden email] <http://user/SendEmail.jtp?type=node=20104=0>> wrote:
>>> 
>> 
>>> Hi Tenglong,
>>> 
>>> In addition to trsell's reply, you can add any method to an rdd without 
>>> making changes to spark code.
>>> 
>>> This can be achieved by using implicit class in your own client code:
>>> 
>>> implicit class extendRDD[T](rdd: RDD[T]){
>>> 
>>>  def foo()
>>> 
>>> }
>>> 
>>> Then you basically need to import this implicit class in the scope where 
>>> you want to use the new foo method.
>>> 
>>> Thanks
>>> Tarun Kumar 
>>> 
>> 
>>> On Mon, 5 Dec 2016 at 6:59 AM, <[hidden email]> wrote:
>> 
>>> How does your application fetch the spark dependency? Perhaps list your 
>>> project dependencies and check it's using your dev build.
>>> 
>>> 
>> 
>>> On Mon, 5 Dec 2016, 08:47 tenglong, <[hidden email]> wrote:
>> 
>>> Hi,
>>> 
>>> Apparently, I've already tried adding a new method to RDD,
>>> 
>>> for example,
>>> 
>>> class RDD {
>>>   def foo() // this is the one I added
>>> 
>>>   def map()
>>> 
>>>   def collect()
>>> }
>>> 
>>> I can build Spark successfully, but I can't compile my application code
>>> which calls rdd.foo(), and the error message says
>>> 
>>> value foo is not a member of org.apache.spark.rdd.RDD[String]
>>> 
>>> So I am wondering if there is any mechanism that prevents me from doing
>>> this, or if I'm doing something wrong?
>>> 
>>> 
>>> 
>>> 
>>> --
>>> View this message in context: 
>>> http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100.html
>>>  
>>> <http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100.html>
>>> Sent from the Apache Spark Developers List mailing list archive at 
>>> Nabble.com <http://nabble.com/>.
>>> 
>>> -
>> 
>>> To unsubscribe e-mail: [hidden email]
>>> 
>>> 
>>> 
>>> If you reply to this email, your message will be added to the discussion 
>>> below:
>>> http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100p20102.html
>>>  
>>> <http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100p20102.html>
>>> To unsubscribe from Can I add a new method to RDD class?, click here.

Re: Can I add a new method to RDD class?

2016-12-05 Thread Michal Šenkýř

A simple Scala example of implicit classes:

implicit class EnhancedString(str: String) {
  def prefix(prefix: String) = prefix + str
}

println("World".prefix("Hello "))

As Tarun said, you have to import it if it's not in the same class where 
you use it.


Hope this makes it clearer,

Michal Senkyr
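
For instance, the implicit class above can live in a separate object, in which
case the import Michal mentions is required at the use site. A small
self-contained sketch (the object names StringExtensions and Demo are made up
for illustration):

```scala
object StringExtensions {
  implicit class EnhancedString(str: String) {
    def prefix(prefix: String): String = prefix + str
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // The implicit class is defined in another object, so it must be
    // imported before "World".prefix(...) will resolve.
    import StringExtensions._
    println("World".prefix("Hello "))  // prints: Hello World
  }
}

Demo.main(Array.empty)
```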


On 5.12.2016 07:43, Tarun Kumar wrote:
Not sure if that's documented in terms of Spark, but this is a fairly 
common pattern in Scala known as the "pimp my library" pattern; you can 
easily find many generic examples of it.

If you want, I can quickly cook up a short complete example with an RDD 
(although there is nothing really more to it than my example in the 
earlier mail)?

Thanks
Tarun Kumar


On Mon, 5 Dec 2016 at 7:15 AM, long <longteng...@gmail.com 
<mailto:longteng...@gmail.com>> wrote:


So is there documentation of this I can refer to?


On Dec 5, 2016, at 1:07 AM, Tarun Kumar [via Apache Spark
Developers List] <[hidden email]
<http:///user/SendEmail.jtp?type=node=20104=0>> wrote:

Hi Tenglong,

In addition to trsell's reply, you can add any method to an RDD without
making changes to Spark code. This can be achieved by using an implicit
class in your own client code:

implicit class extendRDD[T](rdd: RDD[T]) { def foo() }

Then you basically need to import this implicit class in the scope where
you want to use the new foo method.

Thanks
Tarun Kumar

On Mon, 5 Dec 2016 at 6:59 AM, <[hidden email]> wrote:

How does your application fetch the spark dependency? Perhaps
list your project dependencies and check it's using your dev
build.


On Mon, 5 Dec 2016, 08:47 tenglong, <[hidden email]> wrote:

Hi,

Apparently, I've already tried adding a new method to RDD,

for example,

class RDD {
  def foo() // this is the one I added

  def map()

  def collect()
}

I can build Spark successfully, but I can't compile my
application code
which calls rdd.foo(), and the error message says

value foo is not a member of org.apache.spark.rdd.RDD[String]

So I am wondering if there is any mechanism that prevents me
from doing this, or if I'm doing something wrong?




--
View this message in context:
        
http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100.html
Sent from the Apache Spark Developers List mailing list
archive at Nabble.com <http://Nabble.com>.


-

To unsubscribe e-mail: [hidden email]




If you reply to this email, your message will be added to the
discussion below:
    
http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100p20102.html

To unsubscribe from Can I add a new method to RDD class?, click here.
NAML

<http://apache-spark-developers-list.1001551.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer=instant_html%21nabble%3Aemail.naml=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>




------------------------
View this message in context: Re: Can I add a new method to RDD class?
<http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100p20104.html>
Sent from the Apache Spark Developers List mailing list archive
<http://apache-spark-developers-list.1001551.n3.nabble.com/> at Nabble.com.





Re: Can I add a new method to RDD class?

2016-12-04 Thread Tarun Kumar
Not sure if that's documented in terms of Spark, but this is a fairly common
pattern in Scala known as the "pimp my library" pattern; you can easily find
many generic examples of it.

If you want, I can quickly cook up a short complete example with an RDD
(although there is nothing really more to it than my example in the earlier mail)?

Thanks
Tarun Kumar

On Mon, 5 Dec 2016 at 7:15 AM, long <longteng...@gmail.com> wrote:

> So is there documentation of this I can refer to?
>
> On Dec 5, 2016, at 1:07 AM, Tarun Kumar [via Apache Spark Developers List]
> <[hidden email] <http:///user/SendEmail.jtp?type=node=20104=0>>
> wrote:
>
> Hi Tenglong,
>
> In addition to trsell's reply, you can add any method to an rdd without
> making changes to spark code.
>
> This can be achieved by using implicit class in your own client code:
>
> implicit class extendRDD[T](rdd: RDD[T]){
>
> def foo()
>
> }
>
> Then you basically need to import this implicit class in the scope where you
> want to use the new foo method.
>
> Thanks
> Tarun Kumar
>
> On Mon, 5 Dec 2016 at 6:59 AM, <[hidden email]> wrote:
>
> How does your application fetch the spark dependency? Perhaps list your
> project dependencies and check it's using your dev build.
>
> On Mon, 5 Dec 2016, 08:47 tenglong, <[hidden email]> wrote:
>
> Hi,
>
> Apparently, I've already tried adding a new method to RDD,
>
> for example,
>
> class RDD {
>   def foo() // this is the one I added
>
>   def map()
>
>   def collect()
> }
>
> I can build Spark successfully, but I can't compile my application code
> which calls rdd.foo(), and the error message says
>
> value foo is not a member of org.apache.spark.rdd.RDD[String]
>
> So I am wondering if there is any mechanism that prevents me from doing
> this, or if I'm doing something wrong?
>
>
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
> -
>
> To unsubscribe e-mail: [hidden email]
>
>
>
> --
> If you reply to this email, your message will be added to the discussion
> below:
>
> http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100p20102.html
> To unsubscribe from Can I add a new method to RDD class?, click here.
> NAML
> <http://apache-spark-developers-list.1001551.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer=instant_html%21nabble%3Aemail.naml=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>
>
>
> --
> View this message in context: Re: Can I add a new method to RDD class?
> <http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100p20104.html>
> Sent from the Apache Spark Developers List mailing list archive
> <http://apache-spark-developers-list.1001551.n3.nabble.com/> at
> Nabble.com.
>


Re: Can I add a new method to RDD class?

2016-12-04 Thread long
So is there documentation of this I can refer to? 

> On Dec 5, 2016, at 1:07 AM, Tarun Kumar [via Apache Spark Developers List] 
> <ml-node+s1001551n20102...@n3.nabble.com> wrote:
> 
> Hi Tenglong,
> 
> In addition to trsell's reply, you can add any method to an rdd without 
> making changes to spark code.
> 
> This can be achieved by using implicit class in your own client code:
> 
> implicit class extendRDD[T](rdd: RDD[T]){
> 
>  def foo()
> 
> }
> 
> Then you basically need to import this implicit class in the scope where 
> you want to use the new foo method.
> 
> Thanks
> Tarun Kumar 
> 
> On Mon, 5 Dec 2016 at 6:59 AM, <[hidden email] 
> > wrote:
> How does your application fetch the spark dependency? Perhaps list your 
> project dependencies and check it's using your dev build.
> 
> 
> On Mon, 5 Dec 2016, 08:47 tenglong, <[hidden email] 
> > wrote:
> Hi,
> 
> Apparently, I've already tried adding a new method to RDD,
> 
> for example,
> 
> class RDD {
>   def foo() // this is the one I added
> 
>   def map()
> 
>   def collect()
> }
> 
> I can build Spark successfully, but I can't compile my application code
> which calls rdd.foo(), and the error message says
> 
> value foo is not a member of org.apache.spark.rdd.RDD[String]
> 
> So I am wondering if there is any mechanism that prevents me from doing
> this, or if I'm doing something wrong?
> 
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100.html
>  
> <http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100.html>
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
> 
> -----------------
> To unsubscribe e-mail: [hidden email] 
> 
> 
> 
> 
> If you reply to this email, your message will be added to the discussion 
> below:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100p20102.html
>  
> <http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100p20102.html>
> To unsubscribe from Can I add a new method to RDD class?, click here 
> <http://apache-spark-developers-list.1001551.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code=20100=bG9uZ3RlbmcuY3FAZ21haWwuY29tfDIwMTAwfC0xNzQ1MzUzNzE=>.
> NAML 
> <http://apache-spark-developers-list.1001551.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer=instant_html%21nabble%3Aemail.naml=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>




--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100p20104.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

Re: Can I add a new method to RDD class?

2016-12-04 Thread long
So in my sbt build script, I have the same line as instructed in the 
quick start guide <http://spark.apache.org/docs/latest/quick-start.html>,

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.2"

And since I was able to see all the other logs I added to the Spark source
code, I'm pretty sure the application is using the one I just built.

Thanks!
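
One way to be certain the application resolves the locally built Spark, rather
than the stock 2.0.2 artifact from Maven Central, is to publish the dev build
under a distinct version and depend on that version. A build.sbt sketch; the
2.0.2-SNAPSHOT version string and the publishLocal step are assumptions for
illustration, not something prescribed in the thread:

```scala
// build.sbt (sketch). Assumes Spark was first published locally, e.g. with
//   build/sbt publishLocal
// after changing Spark's version to the hypothetical 2.0.2-SNAPSHOT.
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.2-SNAPSHOT"

// publishLocal writes to the local Ivy repository (~/.ivy2/local);
// Resolver.defaultLocal makes sure it is consulted.
resolvers += Resolver.defaultLocal
```

Using a version string that cannot exist upstream avoids silently picking up a
cached Maven Central artifact, which was the symptom described in this thread.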



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100p20103.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Can I add a new method to RDD class?

2016-12-04 Thread Tarun Kumar
Hi Tenglong,

In addition to trsell's reply, you can add any method to an rdd without
making changes to spark code.

This can be achieved by using implicit class in your own client code:

implicit class extendRDD[T](rdd: RDD[T]){

def foo()

}

Then you basically need to import this implicit class in the scope where you
want to use the new foo method.

Thanks
Tarun Kumar

On Mon, 5 Dec 2016 at 6:59 AM, <trs...@gmail.com> wrote:

> How does your application fetch the spark dependency? Perhaps list your
> project dependencies and check it's using your dev build.
>
> On Mon, 5 Dec 2016, 08:47 tenglong, <longteng...@gmail.com> wrote:
>
> Hi,
>
> Apparently, I've already tried adding a new method to RDD,
>
> for example,
>
> class RDD {
>   def foo() // this is the one I added
>
>   def map()
>
>   def collect()
> }
>
> I can build Spark successfully, but I can't compile my application code
> which calls rdd.foo(), and the error message says
>
> value foo is not a member of org.apache.spark.rdd.RDD[String]
>
> So I am wondering if there is any mechanism that prevents me from doing
> this, or if I'm doing something wrong?
>
>
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: Can I add a new method to RDD class?

2016-12-04 Thread trsell
How does your application fetch the spark dependency? Perhaps list your
project dependencies and check it's using your dev build.

On Mon, 5 Dec 2016, 08:47 tenglong, <longteng...@gmail.com> wrote:

> Hi,
>
> Apparently, I've already tried adding a new method to RDD,
>
> for example,
>
> class RDD {
>   def foo() // this is the one I added
>
>   def map()
>
>   def collect()
> }
>
> I can build Spark successfully, but I can't compile my application code
> which calls rdd.foo(), and the error message says
>
> value foo is not a member of org.apache.spark.rdd.RDD[String]
>
> So I am wondering if there is any mechanism that prevents me from doing
> this, or if I'm doing something wrong?
>
>
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Can I add a new method to RDD class?

2016-12-04 Thread tenglong
Hi,

Apparently, I've already tried adding a new method to RDD,

for example,

class RDD {
  def foo() // this is the one I added

  def map()

  def collect()
}

I can build Spark successfully, but I can't compile my application code
which calls rdd.foo(), and the error message says

value foo is not a member of org.apache.spark.rdd.RDD[String]

So I am wondering if there is any mechanism that prevents me from doing this,
or if I'm doing something wrong?




--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-add-a-new-method-to-RDD-class-tp20100.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org