[SparkR] Compare datetime with Sys.time() throws error in R (>= 4.2.0)

2023-01-03 Thread Vivek Atal
…character vector (R: Object Classes); hence this type of check itself was not a good idea in the first place. t <- Sys.time(); sdf <- SparkR::createDataFrame(data.frame(xx = t + c(-1,1,-1,1,-1))); SparkR::collect(SparkR::filter(sdf, SparkR::column("xx") > t)) The sugge
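A possible workaround (not from the thread itself): class(Sys.time()) is the length-2 vector c("POSIXct", "POSIXt"), which is what trips length-one class checks under R >= 4.2.0, so passing the timestamp as a single string and letting Spark cast it avoids the failing branch. A minimal sketch:

    t <- Sys.time()
    sdf <- SparkR::createDataFrame(data.frame(xx = t + c(-1, 1, -1, 1, -1)))
    # format() yields a plain character scalar; Spark casts it back to a timestamp
    SparkR::collect(SparkR::filter(
      sdf,
      SparkR::column("xx") > format(t, "%Y-%m-%d %H:%M:%OS6")
    ))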

Re: [R] SparkR on conda-forge

2021-12-19 Thread Hyukjin Kwon
Awesome! On Mon, 20 Dec 2021 at 09:43, yonghua wrote: > Nice release. thanks for sharing. > > On 2021/12/20 3:55, Maciej wrote: > > FYI ‒ thanks to good folks from conda-forge we have now these:

Re: [R] SparkR on conda-forge

2021-12-19 Thread yonghua
Nice release. thanks for sharing. On 2021/12/20 3:55, Maciej wrote: FYI ‒ thanks to good folks from conda-forge we have now these:

[R] SparkR on conda-forge

2021-12-19 Thread Maciej
Hi everyone, FYI ‒ thanks to good folks from conda-forge we have now these: * https://github.com/conda-forge/r-sparkr-feedstock * https://anaconda.org/conda-forge/r-sparkr -- Best regards, Maciej Szymkiewicz Web: https://zero323.net PGP: A30CEF0C31A501EC OpenPGP_signature Description

Re: [SparkR] gapply with strings with arrow

2020-10-10 Thread Hyukjin Kwon
> Source) > ... > > When I looked at the source code there - it is all stubs. > > Is there a proper way to use arrow in gapply in SparkR? > > BR, > > Jacel

[SparkR] gapply with strings with arrow

2020-10-07 Thread Jacek Pliszka
pply(Unknown Source) ... When I looked at the source code there - it is all stubs. Is there a proper way to use arrow in gapply in SparkR? BR, Jacel
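For context, arrow-backed gapply needs two things: the arrow R package installed with its C++ libraries (stub functions in a traceback usually mean a libs-less install, which arrow::install_arrow() can repair) and the SparkR arrow flag turned on. A minimal sketch, assuming Spark 3.x:

    library(SparkR)
    # install.packages("arrow"); arrow::install_arrow()  # rebuilds with C++ libs if stubbed
    sparkR.session(sparkConfig = list(
      spark.sql.execution.arrow.sparkr.enabled = "true"
    ))
    df <- createDataFrame(data.frame(key = c("a", "a", "b"), val = c(1, 2, 3)))
    schema <- structType(structField("key", "string"), structField("total", "double"))
    # Each group arrives in the R worker as a local data.frame
    collect(gapply(df, "key",
                   function(key, x) data.frame(key, sum(x$val), stringsAsFactors = FALSE),
                   schema))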

Re: Fail to use SparkR of 3.0 preview 2

2020-01-07 Thread Xiao Li
-env-quot-S3methods-quot-td4755490.html > ). > Yes, seems we should make sure we build SparkR in an old version. > Since that support for R prior to version 3.4 is deprecated as of Spark > 3.0.0, we could use either R 3.4 or matching to Jenkins's (R 3.1 IIRC) for > Spark 3.0 release.

Re: Fail to use SparkR of 3.0 preview 2

2019-12-26 Thread Hyukjin Kwon
I was randomly googling out of curiosity, and seems indeed that's the problem ( https://r.789695.n4.nabble.com/Error-in-rbind-info-getNamespaceInfo-env-quot-S3methods-quot-td4755490.html ). Yes, seems we should make sure we build SparkR in an old version. Since that support for R prior to version

Re: Fail to use SparkR of 3.0 preview 2

2019-12-26 Thread Jeff Zhang
Yes, I guess so. But R 3.6.2 was just released this month; I think we should use an older version to build SparkR. Felix Cheung wrote on Fri, 27 Dec 2019 at 10:43 AM: > Maybe it’s the reverse - the package is built to run in latest but not > compatible with slightly older (3.5.2 was De

Re: Fail to use SparkR of 3.0 preview 2

2019-12-26 Thread Felix Cheung
Maybe it’s the reverse - the package is built to run in latest but not compatible with slightly older (3.5.2 was Dec 2018) From: Jeff Zhang Sent: Thursday, December 26, 2019 5:36:50 PM To: Felix Cheung Cc: user.spark Subject: Re: Fail to use SparkR of 3.0

Re: Fail to use SparkR of 3.0 preview 2

2019-12-26 Thread Jeff Zhang
AM > *To:* user.spark > *Subject:* Fail to use SparkR of 3.0 preview 2 > > I tried SparkR of spark 3.0 preview 2, but hit the following issue. > > Error in rbind(info, getNamespaceInfo(env, "S3methods")) : > number of columns of matrices must match (see arg 2) > Error:

Re: Fail to use SparkR of 3.0 preview 2

2019-12-26 Thread Felix Cheung
It looks like a change in the method signature in R base packages. Which version of R are you running on? From: Jeff Zhang Sent: Thursday, December 26, 2019 12:46:12 AM To: user.spark Subject: Fail to use SparkR of 3.0 preview 2 I tried SparkR of spark 3.0

Fail to use SparkR of 3.0 preview 2

2019-12-26 Thread Jeff Zhang
I tried SparkR of spark 3.0 preview 2, but hit the following issue. Error in rbind(info, getNamespaceInfo(env, "S3methods")) : number of columns of matrices must match (see arg 2) Error: package or namespace load failed for ‘SparkR’ in rbind(info, getNamespaceInfo(env, "S3meth

Re: SparkR integration with Hive 3 spark-r

2019-11-24 Thread Felix Cheung
I think you will get more answers if you ask without SparkR; your question is independent of SparkR. Spark support for Hive 3.x (3.1.2) was added here https://github.com/apache/spark/commit/1b404b9b9928144e9f527ac7b1caa15f932c2649 You should be able to connect Spark to Hive metastore
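A sketch of what that connection could look like from SparkR, assuming a Spark build new enough to include the commit above; the version and jar-resolution values are illustrative:

    library(SparkR)
    sparkR.session(
      enableHiveSupport = TRUE,
      sparkConfig = list(
        "spark.sql.hive.metastore.version" = "3.1.2",
        # "maven" pulls matching metastore client jars; a local classpath also works
        "spark.sql.hive.metastore.jars" = "maven"
      ))
    head(sql("SHOW DATABASES"))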

Re: SparkR integration with Hive 3 spark-r

2019-11-22 Thread Alfredo Marquez
e metastore 2.3.5 - no mention of hive 3 metastore. I made several >> tests on this in the past[1] and it seems to handle any hive metastore >> version. >> >> However spark cannot read hive managed table AKA transactional tables. >> So I would say you should be able to read any

Re: SparkR integration with Hive 3 spark-r

2019-11-18 Thread Alfredo Marquez
o I would say you should be able to read any hive 3 regular table with > any of spark, pyspark or sparkR. > > > [1] > https://parisni.frama.io/posts/playing-with-hive-spark-metastore-versions/ > > On Mon, Nov 18, 2019 at 11:23:50AM -0600, Alfredo Marquez wrote: > >

Re: SparkR integration with Hive 3 spark-r

2019-11-18 Thread Nicolas Paris
transactional tables. So I would say you should be able to read any hive 3 regular table with any of spark, pyspark or sparkR. [1] https://parisni.frama.io/posts/playing-with-hive-spark-metastore-versions/ On Mon, Nov 18, 2019 at 11:23:50AM -0600, Alfredo Marquez wrote: > Hello, > > Ou

SparkR integration with Hive 3 spark-r

2019-11-18 Thread Alfredo Marquez
Hello, Our company is moving to Hive 3, and they are saying that there is no SparkR implementation in Spark 2.3.x + that will connect to Hive 3. Is this true? If it is true, will this be addressed in the Spark 3 release? I don't use python, so losing SparkR to get work done on Hadoop is a huge

Re: [PySpark] [SparkR] Is it possible to invoke a PySpark function with a SparkR DataFrame?

2019-07-16 Thread Felix Cheung
: Monday, July 15, 2019 6:58:32 AM To: user@spark.apache.org Subject: [PySpark] [SparkR] Is it possible to invoke a PySpark function with a SparkR DataFrame? Hi all, Forgive this naïveté, I’m looking for reassurance from some experts! In the past we created a tailored Spark library for our

[PySpark] [SparkR] Is it possible to invoke a PySpark function with a SparkR DataFrame?

2019-07-15 Thread Fiske, Danny
ally write our functions with PySpark and potentially create a SparkR "wrapper" over the top, leading to the question: Given a function written with PySpark that accepts a DataFrame parameter, is there a way to invoke this function using a SparkR DataFrame? Is there any reason to pur

Re: sparksql in sparkR?

2019-06-07 Thread Felix Cheung
This seem to be more a question of spark-sql shell? I may suggest you change the email title to get more attention. From: ya Sent: Wednesday, June 5, 2019 11:48:17 PM To: user@spark.apache.org Subject: sparksql in sparkR? Dear list, I am trying to use sparksql

sparksql in sparkR?

2019-06-06 Thread ya
Dear list, I am trying to use Spark SQL within R and I have the following questions; could you give me some advice please? Thank you very much. 1. I connect my R and Spark using the library SparkR; probably some of the members here are also R users? Do I understand correctly that SparkSQL
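On the first question: once a SparkR session is up, SQL can be run directly from R with sql(); no separate spark-sql shell is required. A minimal sketch:

    library(SparkR)
    sparkR.session()
    df <- createDataFrame(faithful)
    createOrReplaceTempView(df, "faithful_tbl")
    head(sql("SELECT waiting, count(*) AS n FROM faithful_tbl GROUP BY waiting"))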

Re: SparkR + binary type + how to get value

2019-02-19 Thread Felix Cheung
from the second image it looks like there is protocol mismatch. I’d check if the SparkR package running there on Livy machine matches the Spark java release. But in any case this seems more an issue with Livy config. I’d suggest checking with the community

Re: SparkR + binary type + how to get value

2019-02-19 Thread Thijs Haarhuis
for it at: https://jira.apache.org/jira/browse/LIVY-558 When I call the spark.lapply function it reports that SparkR is not initialized. I have looked into the spark.lapply function and it seems there is no spark context. Any idea how I can debug this? I hope you can help. Regards, Thijs

Re: SparkR + binary type + how to get value

2019-02-17 Thread Felix Cheung
: Thijs Haarhuis Sent: Thursday, February 14, 2019 4:01 AM To: Felix Cheung; user@spark.apache.org Subject: Re: SparkR + binary type + how to get value Hi Felix, Sure.. I have the following code: printSchema(results) cat("\n\n\n") firstRow <- first(results

Re: SparkR + binary type + how to get value

2019-02-14 Thread Thijs Haarhuis
Any idea how to get the actual value, or how to process the individual bytes? Thanks Thijs From: Felix Cheung Sent: Thursday, February 14, 2019 5:31 AM To: Thijs Haarhuis; user@spark.apache.org Subject: Re: SparkR + binary type + how to get value Please share

Re: SparkR + binary type + how to get value

2019-02-13 Thread Felix Cheung
Please share your code From: Thijs Haarhuis Sent: Wednesday, February 13, 2019 6:09 AM To: user@spark.apache.org Subject: SparkR + binary type + how to get value Hi all, Does anybody have any experience in accessing the data from a column which has a binary

SparkR + binary type + how to get value

2019-02-13 Thread Thijs Haarhuis
Hi all, Does anybody have any experience in accessing the data from a column which has a binary type in a Spark Data Frame in R? I have a Spark Data Frame which has a column which is of a binary type. I want to access this data and process it. In my case I collect the spark data frame to a R
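For what it's worth, a binary column should collect to R as a list of raw vectors, so the bytes are reachable after collect(). A sketch, with the column name payload as a stand-in:

    library(SparkR)
    local_df <- collect(sdf)        # sdf: a SparkDataFrame with binary column "payload"
    bytes <- local_df$payload[[1]]  # one cell: an R raw vector
    as.integer(bytes)               # inspect individual byte values
    rawToChar(bytes)                # only sensible if the bytes encode text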

Re: SparkR issue

2018-10-14 Thread Felix Cheung
1. Seems like it's spending a lot of time in R (slicing the data, I guess?) and not with Spark. 2. Could you write it into a csv file locally and then read it from Spark? From: ayan guha Sent: Monday, October 8, 2018 11:21 PM To: user Subject: SparkR issue Hi We
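A sketch of the workaround in point 2, assuming a path both R and the Spark session can read:

    library(SparkR)
    # rdf: the 600K-row local R data.frame
    write.csv(rdf, "/tmp/rdf.csv", row.names = FALSE)
    sdf <- read.df("/tmp/rdf.csv", source = "csv",
                   header = "true", inferSchema = "true")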

SparkR issue

2018-10-09 Thread ayan guha
Hi We are seeing some weird behaviour in Spark R. We created a R Dataframe with 600K records and 29 columns. Then we tried to convert R DF to SparkDF using df <- SparkR::createDataFrame(rdf) from RStudio. It hung; we had to kill the process after 1-2 hours. We also tried following:

Any good book recommendations for SparkR

2018-04-30 Thread @Nandan@
Hi Team, Any good book recommendations for get in-depth knowledge from zero to production. Let me know. Thanks.

package reload in dapply SparkR

2018-04-10 Thread Deepansh Goyal
I have a native R model and doing structured streaming on it. Data comes from Kafka and goes into dapply method where my model does prediction and data is written to sink. Problem:- My model requires caret package. Inside dapply function for every stream job, caret package is loaded again which

Re: SparkR test script issue: unable to run run-tests.h on spark 2.2

2018-02-14 Thread chandan prakash
an earlier version with devtools? will > follow up for a fix. > > _ > From: Hyukjin Kwon <gurwls...@gmail.com> > Sent: Wednesday, February 14, 2018 6:49 PM > Subject: Re: SparkR test script issue: unable to run run-tests.h on spark > 2.2 > To: chand

Re: SparkR test script issue: unable to run run-tests.h on spark 2.2

2018-02-14 Thread Felix Cheung
Yes, it is an issue with the newer release of testthat. To work around it, could you install an earlier version with devtools? Will follow up for a fix. _ From: Hyukjin Kwon <gurwls...@gmail.com> Sent: Wednesday, February 14, 2018 6:49 PM Subject: Re: SparkR test script
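The suggested pin, concretely (1.0.2 here is just one pre-2.0 testthat release):

    # Install an older testthat so run-tests.sh's expectations still hold
    devtools::install_version("testthat", version = "1.0.2",
                              repos = "https://cloud.r-project.org")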

Re: SparkR test script issue: unable to run run-tests.h on spark 2.2

2018-02-14 Thread Hyukjin Kwon
>From a very quick look, I think testthat version issue with SparkR. I had to fix that version to 1.x before in AppVeyor. There are few details in https://github.com/apache/spark/pull/20003 Can you check and lower testthat version? On 14 Feb 2018 6:09 pm, "chandan prakash"

SparkR test script issue: unable to run run-tests.h on spark 2.2

2018-02-14 Thread chandan prakash
Hi All, I am trying to run test script of R under ./R/run-tests.sh but hitting same ERROR everytime. I tried running on mac as well as centos machine, same issue coming up. I am using spark 2.2 (branch-2.2) I followed from apache doc and followed the steps: 1. installed R 2. installed packages

Re: sparkR 3rd library

2017-09-05 Thread Yanbo Liang
I guess you didn't install the R package `genalg` on all worker nodes. This is not a built-in package for base R, so you need to install it on all worker nodes manually or run `install.packages` inside your SparkR UDF. As regards how to download third-party packages and install them inside
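A sketch of the second option, installing the package lazily inside the UDF (assumes the executor user has a writable library path):

    library(SparkR)
    results <- spark.lapply(1:4, function(i) {
      if (!requireNamespace("genalg", quietly = TRUE)) {
        install.packages("genalg", repos = "https://cloud.r-project.org")
      }
      library(genalg)
      # ... call rbga(...) here as in the original script ...
      i
    })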

Re: sparkR 3rd library

2017-09-04 Thread Felix Cheung
Can you include the code you call spark.lapply? From: patcharee <patcharee.thong...@uni.no> Sent: Sunday, September 3, 2017 11:46:40 PM To: spar >> user@spark.apache.org Subject: sparkR 3rd library Hi, I am using spark.lapply to execute an exist

sparkR 3rd library

2017-09-04 Thread patcharee
Hi, I am using spark.lapply to execute an existing R script in standalone mode. This script calls a function 'rbga' from a 3rd library 'genalg'. This rbga function works fine in sparkR env when I call it directly, but when I apply this to spark.lapply I get the error could not find function

Re: Update MySQL table via Spark/SparkR?

2017-08-22 Thread Pierce Lamb
...@gmail.com> > *Date: *Monday, August 21, 2017 at 6:44 PM > *To: *Jake Russ <jr...@bloomintelligence.com> > *Cc: *"user@spark.apache.org" <user@spark.apache.org> > *Subject: *Re: Update MySQL table via Spark/SparkR? > > > > Hi Jake, > > This is an

Re: Update MySQL table via Spark/SparkR?

2017-08-22 Thread Jake Russ
:44 PM To: Jake Russ <jr...@bloomintelligence.com> Cc: "user@spark.apache.org" <user@spark.apache.org> Subject: Re: Update MySQL table via Spark/SparkR? Hi Jake, This is an issue across all RDBMs including Oracle etc. When you are updating you have to commit or roll back in RDB

Re: Update MySQL table via Spark/SparkR?

2017-08-21 Thread ayan guha
On 21 August 2017 at 15:50, Jake Russ <jr...@bloomintelligence.com> wrote: > >> Hi everyone, >>

Re: Update MySQL table via Spark/SparkR?

2017-08-21 Thread Mich Talebzadeh
On 21 August 2017 at 15:50, Jake Russ <jr...@bloomintelligence.com> wrote: > Hi everyone, > > > > I’m currently using SparkR to read data from a MyS

Update MySQL table via Spark/SparkR?

2017-08-21 Thread Jake Russ
Hi everyone, I’m currently using SparkR to read data from a MySQL database, perform some calculations, and then write the results back to MySQL. Is it still true that Spark does not support UPDATE queries via JDBC? I’ve seen many posts on the internet that Spark’s DataFrameWriter does
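It is still the case in Spark 2.x that the JDBC writer only appends or overwrites; there is no UPDATE mode. A common pattern is to write results to a staging table and run the UPDATE inside MySQL itself. A sketch with placeholder connection details:

    library(SparkR)
    # results: the SparkDataFrame holding the computed values
    write.jdbc(results, "jdbc:mysql://dbhost:3306/mydb", "results_staging",
               mode = "overwrite", user = "dbuser", password = "secret")
    # ...then issue the UPDATE ... JOIN results_staging ... statement in MySQL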

Re: [sparkR] [MLlib] : Is word2vec implemented in SparkR MLlib ?

2017-04-21 Thread Felix Cheung
Not currently - how are you planning to use the output from word2vec? From: Radhwane Chebaane <r.cheba...@mindlytix.com> Sent: Thursday, April 20, 2017 4:30:14 AM To: user@spark.apache.org Subject: [sparkR] [MLlib] : Is word2vec implemented in SparkR MLlib

[sparkR] [MLlib] : Is word2vec implemented in SparkR MLlib ?

2017-04-20 Thread Radhwane Chebaane
Hi, I've been experimenting with the Spark *Word2vec* implementation in the MLLib package with Scala and it was very nice. I need to use the same algorithm in R leveraging the power of spark distribution with SparkR. I have been looking on the mailing list and Stackoverflow for any *Word2vec* use

Re: Issue with SparkR setup on RStudio

2017-01-04 Thread Md. Rezaul Karim
Rezaul Karim <rezaul.ka...@insight-centre.org> > Sent: Monday, January 2, 2017 7:58 AM > Subject: Re: Issue with SparkR setup on RStudio > To: Felix Cheung <felixcheun...@hotmail.com> > Cc: spark users <user@spark.apache.org> > > > Hello Cheung, > > Happy

RBackendHandler Error while running ML algorithms with SparkR on RStudio

2017-01-03 Thread Md. Rezaul Karim
, ...) : java.io.IOException: Class not found Here's my source code: Sys.setenv(SPARK_HOME = "spark-2.1.0-bin-hadoop2.7") .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths())) library(SparkR) sparkR.session(appName = "SparkR

Re: Issue with SparkR setup on RStudio

2017-01-02 Thread Felix Cheung
is not set in the Windows tests. _ From: Md. Rezaul Karim <rezaul.ka...@insight-centre.org> Sent: Monday, January 2, 2017 7:58 AM Subject: Re: Issue with SparkR setup on RStudio To: Felix Cheung <felixcheun...@hotm

Re: Issue with SparkR setup on RStudio

2017-01-02 Thread Md. Rezaul Karim
:57 AM > *To:* spark users > *Subject:* Issue with SparkR setup on RStudio > > > Dear Spark users, > > I am trying to setup SparkR on RStudio to perform some basic data > manipulations and ML modeling. However, I am a strange error while > creating SparkR session or

Re: Issue with SparkR setup on RStudio

2016-12-29 Thread Felix Cheung
Sent: Thursday, December 29, 2016 10:24:57 AM To: spark users Subject: Issue with SparkR setup on RStudio Dear Spark users, I am trying to set up SparkR on RStudio to perform some basic data manipulations and ML modeling. However, I get a strange error while creating SparkR session or DataFra

Issue with SparkR setup on RStudio

2016-12-29 Thread Md. Rezaul Karim
Dear Spark users, I am trying to set up SparkR on RStudio to perform some basic data manipulations and ML modeling. However, I get a strange error while creating a SparkR session or DataFrame that says: java.lang.IllegalArgumentException Error while instantiating
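For reference, the usual RStudio bootstrap looks like the sketch below (paths are placeholders); a wrong SPARK_HOME is a common cause of session-instantiation errors:

    Sys.setenv(SPARK_HOME = "/path/to/spark-2.1.0-bin-hadoop2.7")
    .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
    library(SparkR)
    sparkR.session(master = "local[*]",
                   sparkConfig = list(spark.driver.memory = "2g"))
    df <- createDataFrame(iris)
    head(df)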

Re: Does SparkR or SparkMLib support nonlinear optimization with non linear constraints

2016-11-25 Thread Robineast
reasing. > > If this feature is not available, do you plan to have it in your roadmap > anytime. > > TIA > Jyoti

Re: How to propagate R_LIBS to sparkr executors

2016-11-17 Thread Felix Cheung
agate R_LIBS to sparkr executors To: <user@spark.apache.org> I'm having an issue with a R module not getting picked up on the slave nodes in mesos. I have the following environment value R_LIBS set and for some reason this environment is only s

How to propagate R_LIBS to sparkr executors

2016-11-16 Thread Rodrick Brown
in sparkr? I’m using Mesos 1.0.1 and Spark 2.0.1 Thanks. -- <http://www.orchardplatform.com/> Rodrick Brown / Site Reliability Engineer +1 917 445 6839 / rodr...@orchardplatform.com <mailto:char...@orchardplatform.com> Orchard Platform 101 5th Avenue, 4th Floor, New York, NY

Re: Issue Running sparkR on YARN

2016-11-09 Thread Felix Cheung
It may be that the Spark executor is running as a different user and it can't see where Rscript is. You might want to try adding the Rscript path to PATH. Also please see this for the config property to set for the R command to use: https://spark.apache.org/docs/latest/configuration.html#sparkr
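The property in question is spark.r.command, the executable Spark uses to launch R workers; a sketch pinning it to an absolute path that exists on every YARN node:

    library(SparkR)
    sparkR.session(master = "yarn", sparkConfig = list(
      "spark.r.command" = "/usr/bin/Rscript"  # adjust to the nodes' R install
    ))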

Issue Running sparkR on YARN

2016-11-09 Thread Ian.Maloney
Hi, I’m trying to run sparkR (1.5.2) on YARN and I get: java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory This strikes me as odd, because I can go to each node and various users and type Rscript and it works. I’ve done this on each node and sp

Re: Substitute Certain Rows a data Frame using SparkR

2016-10-19 Thread Felix Cheung
your example could be something we support though. Please feel free to open a JIRA for that. _ From: shilp <tshi...@hotmail.com> Sent: Monday, October 17, 2016 7:38 AM Subject: Substitute Certain Rows a data Frame using Sp

Substitute Certain Rows a data Frame using SparkR

2016-10-17 Thread shilp
I have a SparkR data frame and I want to replace certain rows of a column which satisfy a certain condition with some value. If it were a simple R data frame, I would do something as follows: df$Column1[df$Column1 == "Value"] = "NewValue" How would I perform a similar operation on
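Row-wise assignment isn't available on a SparkDataFrame, but the same effect can be expressed column-wise; a sketch using SparkR's ifelse() (when()/otherwise() works similarly):

    library(SparkR)
    df <- createDataFrame(data.frame(Column1 = c("Value", "Other", "Value")))
    df$Column1 <- ifelse(df$Column1 == "Value", "NewValue", df$Column1)
    head(df)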

Re: SparkR execution hang on when handle a RDD which is converted from DataFrame

2016-10-14 Thread Lantao Jin
<- sparkRHive.init(sc) > sqlString <- > "SELECT > key_id, > rtl_week_beg_dt rawdate, > gmv_plan_rate_amt value > FROM > metrics_moveing_detection_cube > " > df <- sql(sqlString) > rdd <- SparkR:::toRDD(df) > > #hang on case one: take fro

SparkR execution hang on when handle a RDD which is converted from DataFrame

2016-10-13 Thread Lantao Jin
sqlContext <- sparkRHive.init(sc) sqlString<- "SELECT key_id, rtl_week_beg_dt rawdate, gmv_plan_rate_amt value FROM metrics_moveing_detection_cube " df <- sql(sqlString) rdd<-SparkR:::toRDD(df) #hang on case one: take from rdd #take(rdd,3) #hang on case two: convert

RE: as.Date can't be applied to Spark data frame in SparkR

2016-09-19 Thread xingye
Update: the job can finish, but takes a long time on 10M rows of data. Is there a better solution? From: xing_ma...@hotmail.com To: user@spark.apache.org Subject: as.Date can't be applied to Spark data frame in SparkR Date: Tue, 20 Sep 2016 10:22:17 +0800 Hi, all I've noticed that as.Date can't

as.Date can't be applied to Spark data frame in SparkR

2016-09-19 Thread xingye
Hi, all I've noticed that as.Date can't be applied to a Spark data frame. I've created the following UDF and used dapply to change an integer column "aa" to a date with origin 1960-01-01. change_date <- function(df){ df <- as.POSIXlt(as.Date(df$aa, origin = "1960-01-01", tz = "UTC")) }
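One way to make this work is to do the conversion inside dapply() and declare the new column in the output schema; a sketch, assuming Spark 2.0+:

    library(SparkR)
    df <- createDataFrame(data.frame(aa = c(0L, 366L, 20000L)))
    schema <- structType(structField("aa", "integer"),
                         structField("aa_date", "date"))
    out <- dapply(df, function(pdf) {
      pdf$aa_date <- as.Date(pdf$aa, origin = "1960-01-01")
      pdf   # return a data.frame matching the declared schema
    }, schema)
    head(out)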

Re: SparkR API problem with subsetting distributed data frame

2016-09-11 Thread Bene
? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-API-problem-with-subsetting-distributed-data-frame-tp27688p27692.html

Re: SparkR error: reference is ambiguous.

2016-09-10 Thread Felix Cheung
t:double] > head(c) speed dist 1 0 2 2 0 10 3 0 4 4 0 22 5 0 16 6 0 10 _ From: Bedrytski Aliaksandr <sp...@bedryt.ski> Sent: Friday, September 9, 2016 9:13 PM Subject: Re: SparkR error: reference is ambiguous. To: xingye <t

Re: SparkR API problem with subsetting distributed data frame

2016-09-10 Thread Felix Cheung
How are you calling dirs()? What would be x? Is dat a SparkDataFrame? With SparkR, i in dat[i, 4] should be a logical expression for row, eg. df[df$age %in% c(19, 30), 1:2] On Sat, Sep 10, 2016 at 11:02 AM -0700, "Bene" <benedikt.haeu...@outlook.com>

Re: Assign values to existing column in SparkR

2016-09-10 Thread Felix Cheung
2:29 PM Subject: Re: Assign values to existing column in SparkR To: xingye <xing_ma...@hotmail.com> Cc: <user@spark.apache.org> Data frames are immutable in nature, so I don't think you can directly assign or change

Re: SparkR API problem with subsetting distributed data frame

2016-09-10 Thread Bene
essage in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-API-problem-with-subsetting-distributed-data-frame-tp27688p27691.html

Re: SparkR API problem with subsetting distributed data frame

2016-09-10 Thread Felix Cheung
Could you include code snippets you are running? On Sat, Sep 10, 2016 at 1:44 AM -0700, "Bene" <benedikt.haeu...@outlook.com> wrote: Hi, I am having a problem with the SparkR API. I need to subset a distributed data so I can extr

SparkR API problem with subsetting distributed data frame

2016-09-10 Thread Bene
Hi, I am having a problem with the SparkR API. I need to subset a distributed data so I can extract single values from it on which I can then do calculations. Each row of my df has two integer values, I am creating a vector of new values calculated as a series of sin, cos, tan functions

Re: SparkR error: reference is ambiguous.

2016-09-09 Thread Bedrytski Aliaksandr
Hi, Can you use full-string queries in SparkR? Like (in Scala): df1.registerTempTable("df1") df2.registerTempTable("df2") val df3 = sparkContext.sql("SELECT * FROM df1 JOIN df2 ON df1.ra = df2.ra") explicitly mentioning table names in the query often solve

Re: Assign values to existing column in SparkR

2016-09-09 Thread Deepak Sharma
Data frames are immutable in nature , so i don't think you can directly assign or change values on the column. Thanks Deepak On Fri, Sep 9, 2016 at 10:59 PM, xingye wrote: > I have some questions about assign values to a spark dataframe. I want to > assign values to an

SparkR error: reference is ambiguous.

2016-09-09 Thread xingye
Not sure whether this is the right distribution list to ask questions on. If not, can someone suggest a list where I can find help? I kept getting the error "reference is ambiguous" when implementing some SparkR code. 1. When I tried to assign values to a column using

Assign values to existing column in SparkR

2016-09-09 Thread xingye
I have some questions about assigning values to a Spark dataframe. I want to assign values to an existing column of a Spark dataframe, but if I assign the value directly, I get the following error. df$c_mon <- 0 gives: Error: class(value) == "Column" || is.null(value) is not TRUE. Is there a way to solve this?
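The check quoted in the error hints at the fix: the right-hand side must be a Column, so wrapping the scalar in lit() satisfies it:

    library(SparkR)
    df$c_mon <- lit(0)  # lit() turns the scalar 0 into a Column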

Re: No SparkR on Mesos?

2016-09-08 Thread ray
Hi, Rodrick, Interesting. SparkR is expected not to work with Mesos due to lack of support for Mesos in some places, and it has not been tested yet. Have you modified the Spark source code yourself? Have you deployed the Spark binary distribution on all slave nodes, and set

Re: No SparkR on Mesos?

2016-09-07 Thread Rodrick Brown
We've been using SparkR on Mesos for quite sometime with no issues. [fedora@prod-rstudio-1 ~]$ /opt/spark-1.6.1/bin/sparkR R version 3.3.0 (2016-05-03) -- "Supposedly Educational" Copyright (C) 2016 The R Foundation for Statistical Computing Platform: x86_64-redhat-linux-gnu (64-bit)

Re: No SparkR on Mesos?

2016-09-07 Thread Timothy Chen
le, which turned out to mean there was a silly conditional that would fail > the submission, even though all the support was there. Could be the same for > R. Can you submit a JIRA? > >> On Wed, Sep 7, 2016 at 5:02 AM, Peter Griessl <grie...@ihs.ac.at> wrote: >> Hello, >> &

Re: No SparkR on Mesos?

2016-09-07 Thread Felix Cheung
This is correct - SparkR is not quite working completely on Mesos. JIRAs and contributions welcome! On Wed, Sep 7, 2016 at 10:21 AM -0700, "Michael Gummelt" <mgumm...@mesosphere.io> wrote: Quite possibly. I've never used it. I know P

Re: No SparkR on Mesos?

2016-09-07 Thread Michael Gummelt
AM, Peter Griessl <grie...@ihs.ac.at> wrote: > Hello, > > > > does SparkR really not work (yet?) on Mesos (Spark 2.0 on Mesos 1.0)? > > > > $ /opt/spark/bin/sparkR > > > > R version 3.3.1 (2016-06-21) -- "Bug in Your Hair" > > Copyright (

No SparkR on Mesos?

2016-09-07 Thread Peter Griessl
Hello, does SparkR really not work (yet?) on Mesos (Spark 2.0 on Mesos 1.0)? $ /opt/spark/bin/sparkR R version 3.3.1 (2016-06-21) -- "Bug in Your Hair" Copyright (C) 2016 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit) Launching java with spark-subm

Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-25 Thread Felix Cheung
The reason your second example works is closure capture behavior. It should be OK for a small amount of data. You could also use SparkR:::broadcast, but please keep in mind that it is not a public API we actively support. Thank you for the information on formula - I will test that out
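A sketch of the closure-capture pattern being described: anything the function references from its enclosing environment is serialized and shipped with it, which stands in for a broadcast when the data is small ('stack' and score() below are stand-ins from the thread):

    library(SparkR)
    sparkR.session()
    stack <- data.frame(x = runif(100))           # small dataset captured by the closure
    score <- function(data, i) mean(data$x) + i   # stand-in for the real scoring function
    results <- spark.lapply(1:10, function(i) {
      score(stack, i)  # 'stack' and 'score' travel with the serialized closure
    })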

RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-25 Thread Cinquegrana, Piero
I tested both in local and cluster mode and the '<<-' seemed to work at least for small data. Or am I missing something? Is there a way for me to test? If that does not work, can I use something like this? sc <- SparkR:::getSparkContext() bcStack <- SparkR:::broadcast(sc,stack)

Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-25 Thread Felix Cheung
Cinquegrana, Piero <piero.cinquegr...@neustar.biz> Sent: Wednesday, August 24, 2016 10:37 AM Subject: RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big") To: Cinquegrana, Piero <piero.cinquegr...@neustar.biz>

RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-24 Thread Cinquegrana, Piero
day, August 23, 2016 2:39 PM To: Felix Cheung <felixcheun...@hotmail.com>; user@spark.apache.org Subject: RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big") The output from score() is very small, just a float. The input, however, could be as big as several hundre

RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-23 Thread Cinquegrana, Piero
, Piero <piero.cinquegr...@neustar.biz>; user@spark.apache.org Subject: Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big") How big is the output from score()? Also could you elaborate on what you want to broadcast? On Mon, Aug 22, 2016 at 11:58 AM -0700,

Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-22 Thread Felix Cheung
How big is the output from score()? Also could you elaborate on what you want to broadcast? On Mon, Aug 22, 2016 at 11:58 AM -0700, "Cinquegrana, Piero" <piero.cinquegr...@neustar.biz> wrote: Hello, I am using the new R API

spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-22 Thread Cinquegrana, Piero
Hello, I am using the new R API in SparkR spark.lapply (spark 2.0). I am defining a complex function to be run across executors and I have to send the entire dataset, but there is not (that I could find) a way to broadcast the variable in SparkR. I am thus reading the dataset in each executor

Re: Disable logger in SparkR

2016-08-22 Thread Felix Cheung
nformy...@gmail.com>> Sent: Monday, August 22, 2016 6:12 AM Subject: Disable logger in SparkR To: user <user@spark.apache.org> Hi, Is there any way of disabling the logging on console in S

Disable logger in SparkR

2016-08-22 Thread Yogesh Vyas
Hi, Is there any way of disabling the logging on console in SparkR? Regards, Yogesh
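One session-level knob that covers the common case (launch-time chatter still needs a log4j.properties in SPARK_HOME/conf):

    library(SparkR)
    sparkR.session()
    setLogLevel("ERROR")  # silences INFO/WARN console output from the JVM side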

Re: UDF in SparkR

2016-08-17 Thread Yann-Aël Le Borgne
e the API > doc: > https://spark.apache.org/docs/2.0.0/api/R/ > > Feedback welcome and appreciated! > > > _ > From: Yogesh Vyas <informy...@gmail.com> > Sent: Tuesday, August 16, 2016 11:39 PM > Subject: UDF in SparkR > To: user <use

Re: UDF in SparkR

2016-08-17 Thread Felix Cheung
016 11:39 PM Subject: UDF in SparkR To: user <user@spark.apache.org> Hi, Is there any way of using UDF in SparkR? Regards, Yogesh

UDF in SparkR

2016-08-17 Thread Yogesh Vyas
Hi, Is there any way of using UDF in SparkR? Regards, Yogesh

Re: SparkR error when repartition is called

2016-08-09 Thread Felix Cheung
nvalid>> Sent: Tuesday, August 9, 2016 12:19 AM Subject: Re: SparkR error when repartition is called To: Sun Rui <sunrise_...@163.com> Cc: User <user@spark.apache.org> Sun, I am using spark in yarn client mode i

Re: SparkR error when repartition is called

2016-08-09 Thread Shane Lee
e. Could you give more environment information? On Aug 9, 2016, at 11:35, Shane Lee <shane_y_...@yahoo.com.INVALID> wrote: Hi All, I am trying out SparkR 2.0 and have run into an issue with repartition. Here is the R code (essentially a port of the pi-calculating scala example in the s

Re: SparkR error when repartition is called

2016-08-09 Thread Sun Rui
I can’t reproduce your issue with len=1 in local mode. Could you give more environment information? > On Aug 9, 2016, at 11:35, Shane Lee <shane_y_...@yahoo.com.INVALID> wrote: > > Hi All, > > I am trying out SparkR 2.0 and have run into an issue with repartition.

SparkR error when repartition is called

2016-08-08 Thread Shane Lee
Hi All, I am trying out SparkR 2.0 and have run into an issue with repartition.  Here is the R code (essentially a port of the pi-calculating scala example in the spark package) that can reproduce the behavior: schema <- structType(structField("input", "integer"), 

Re: How to partition a SparkDataFrame using all distinct column values in sparkR

2016-08-03 Thread Sun Rui
SparkDataFrame.repartition() uses hash partitioning; it can guarantee that all rows with the same column value go to the same partition, but it does not guarantee that each partition contains only a single column value. Fortunately, Spark 2.0 comes with gapply() in SparkR. You can apply an R
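A sketch of the gapply() route: each distinct key's rows arrive together in one local data.frame, which gives the one-group-per-value behavior the question asks for, even though the physical partitioning underneath remains hash-based:

    library(SparkR)
    sdf <- createDataFrame(data.frame(key = c("a", "b", "a"), v = c(1, 2, 3)))
    schema <- structType(structField("key", "string"), structField("n", "integer"))
    res <- gapply(sdf, "key",
                  function(key, x) data.frame(key, nrow(x), stringsAsFactors = FALSE),
                  schema)
    head(res)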

How to partition a SparkDataFrame using all distinct column values in sparkR

2016-07-25 Thread Neil Chang
Hi, This is a question regarding SparkR in spark 2.0. Given that I have a SparkDataFrame and I want to partition it using one column's values. Each value corresponds to a partition, all rows that having the same column value shall go to the same partition, no more no less. Seems

Re: XLConnect in SparkR

2016-07-21 Thread Marco Mistroni
tion only > supports reading from local file path, so you might need a way to call HDFS > command to get the file from HDFS first. > > SparkR currently does not support this - you could read it in as a text > file (I don't think .xlsx is a text format though), collect to get all

Re: XLConnect in SparkR

2016-07-20 Thread Felix Cheung
From looking at the XLConnect package, its loadWorkbook() function only supports reading from a local file path, so you might need a way to call an HDFS command to get the file from HDFS first. SparkR currently does not support this - you could read it in as a text file (I don't th
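A sketch of that two-step approach, with hypothetical paths: copy the workbook out of HDFS with the hdfs CLI, read it locally with XLConnect, then hand the result to Spark:

    system2("hdfs", c("dfs", "-get", "/data/report.xlsx", "/tmp/report.xlsx"))
    wb <- XLConnect::loadWorkbook("/tmp/report.xlsx")
    local_df <- XLConnect::readWorksheet(wb, sheet = 1)
    sdf <- SparkR::createDataFrame(local_df)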
