RE: SparkR for accumulo
Could you submit a JIRA for this? We can discuss it there.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
Re: SparkR for accumulo
Hi,

Is there any other way to create an RRDD from a source RDD other than a text RDD? Or to use any other format of data stored in HDFS in SparkR? Also, please elaborate on the step that is missing in SparkR for this.

Thanks and Regards
Madhvi Gupta
Re: SparkR for accumulo
Ohk. Thanks.

Thanks and Regards
Madhvi Gupta
RE: SparkR for accumulo
No.

It is possible to create a helper function that builds Accumulo data RDDs in Scala or Java (perhaps put such code in a JAR and add it with --jars <jar> on the command line when starting SparkR), and in SparkR you can use private functions like callJMethod to call it; the created RDD objects can then be referenced on the R side.

However, there is a critical step missing in SparkR now: support for converting a source RDD (other than a text file RDD) to an RRDD. If you can't convert a source RDD from the JVM to an RRDD, you can't apply further SparkR RDD API transformations to it.
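The helper-in-a-JAR approach suggested above might look roughly like this. This is only an untested sketch: the object and method names (AccumuloHelper, makeRDD) are made up, and the Accumulo client configuration that must be set on the Hadoop Configuration is elided.

```scala
import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat
import org.apache.accumulo.core.data.{Key, Value}
import org.apache.hadoop.conf.Configuration
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper object, compiled into a JAR and passed to SparkR
// via --jars, so that SparkR's private JVM bridge can reach it.
object AccumuloHelper {
  // Builds an RDD of (row, value) string pairs from an Accumulo table.
  // `conf` is assumed to already carry the Accumulo connector/table settings.
  def makeRDD(sc: SparkContext, conf: Configuration): RDD[(String, String)] = {
    sc.newAPIHadoopRDD(conf, classOf[AccumuloInputFormat],
                       classOf[Key], classOf[Value])
      .map { case (k, v) => (k.getRow.toString, v.toString) }
  }
}
```

On the R side one would then call something like `SparkR:::callJStatic("AccumuloHelper", "makeRDD", ...)` through SparkR's private API; but, as noted above, the resulting JVM-side RDD reference cannot currently be converted to an RRDD, so SparkR's RDD transformations cannot be applied to it.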
Re: SparkR for accumulo
Hi Rui,

Can't we use the Accumulo data RDD created from Java in Spark, in SparkR?

Thanks and Regards
Madhvi Gupta

On Tuesday 22 September 2015 04:42 PM, Sun, Rui wrote:
> I am afraid that there is no support for Accumulo in SparkR now, because:
>
> 1. It seems that there is no data source support for Accumulo, so we can't create a SparkR DataFrame on Accumulo.
> 2. It is possible to create an RDD from Accumulo via AccumuloInputFormat in Scala. But unfortunately, SparkR does not support creating RDDs from Hadoop files other than text files.
>
> -----Original Message-----
> From: madhvi.gupta [mailto:madhvi.gu...@orkash.com]
> Sent: Tuesday, September 22, 2015 6:25 PM
> To: user
> Subject: SparkR for accumulo
>
> Hi,
>
> I want to process Accumulo data in R through SparkR. Can anyone help me and let me know how to get Accumulo data in Spark to be used in R?
>
> --
> Thanks and Regards
> Madhvi Gupta
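For reference, creating an RDD from Accumulo via AccumuloInputFormat in Scala (point 2 in Sun Rui's reply) might look roughly as follows in spark-shell. This is a sketch against the Accumulo 1.x client API; the table name and credentials are placeholders, the instance/ZooKeeper settings are omitted, and the exact configuration setters vary between Accumulo versions.

```scala
import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat
import org.apache.accumulo.core.client.security.tokens.PasswordToken
import org.apache.accumulo.core.data.{Key, Value}
import org.apache.hadoop.mapreduce.Job

// Configure the input format on a Hadoop Job (placeholder names/credentials).
val job = Job.getInstance()
AccumuloInputFormat.setConnectorInfo(job, "user", new PasswordToken("pass"))
AccumuloInputFormat.setInputTableName(job, "mytable")
// Instance/ZooKeeper configuration omitted; it differs across versions.

// `sc` is the SparkContext provided by spark-shell.
val rdd = sc.newAPIHadoopRDD(job.getConfiguration,
                             classOf[AccumuloInputFormat],
                             classOf[Key], classOf[Value])
```

This gives a Scala/Java-side RDD[(Key, Value)]; as discussed above, such an RDD cannot currently be handed over to SparkR's RDD API.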