Re: SparkR API problem with subsetting distributed data frame
I am calling dirs(x, dat) with a number for x and a distributed data frame for dat, e.g. dirs(3, df).

With your logical expression, Felix, I would get another data frame, right? That is not what I need: I need to extract a single value from a specific cell for my calculations. Is that somehow possible?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-API-problem-with-subsetting-distributed-data-frame-tp27688p27692.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe e-mail: user-unsubscr...@spark.apache.org
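One way to get at a single cell, sketched under assumptions: a SparkDataFrame has no positional row index, so the rows are first ordered by a column and then brought to the driver with collect(), which returns an ordinary R data.frame that can be indexed by position. The column name ts and the small stand-in data set are hypothetical, and collect() is only sensible when the result fits in driver memory.

```r
library(SparkR)
sparkR.session()  # Spark 2.x entry point; assumes a local Spark installation

# Hypothetical stand-in for the data discussed in this thread
local_dat <- data.frame(
  ts        = c("2015-02-09 07:18:33", "2015-02-09 07:18:34", "2015-02-09 07:18:35"),
  latitude  = c(52.35234, 52.35233, 52.35232),
  longitude = c(9.881965, 9.881970, 9.881975),
  stringsAsFactors = FALSE
)
df <- createDataFrame(local_dat)

# Order explicitly, then pull the rows to the driver as a plain data.frame
rows <- collect(arrange(df, df$ts))

# Positional indexing now works as in base R
cell <- rows[3, "latitude"]
```

If only the first few rows are needed, head(arrange(df, df$ts), n) also returns a local data.frame and avoids collecting everything.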
Re: SparkR API problem with subsetting distributed data frame
How are you calling dirs()? What would x be? Is dat a SparkDataFrame?

With SparkR, the i in dat[i, 4] should be a logical expression that selects rows, e.g. df[df$age %in% c(19, 30), 1:2]

On Sat, Sep 10, 2016 at 11:02 AM -0700, "Bene" wrote:
> Here are a few code snippets: [...]
Re: SparkR API problem with subsetting distributed data frame
Here are a few code snippets.

The data frame looks like this:

      kfzzeit datum latitude longitude
    1 # 2015-02-09 07:18:33 2015-02-09 52.35234 9.881965
    2 # 2015-02-09 07:18:34 2015-02-09 52.35233 9.881970
    3 # 2015-02-09 07:18:35 2015-02-09 52.35232 9.881975
    4 # 2015-02-09 07:18:36 2015-02-09 52.35232 9.881972
    5 # 2015-02-09 07:18:37 2015-02-09 52.35231 9.881973
    6 # 2015-02-09 07:18:38 2015-02-09 52.35231 9.881978

I call this function with a number (a position in the data frame) and a data frame:

    dirs <- function(x, dat) {
      direction(startLat = dat[x, 4], endLat = dat[x + 1, 4],
                startLon = dat[x, 5], endLon = dat[x + 1, 5])
    }

This is where I get the error about the S4 class not being subsettable. The function calls another function, which does the actual calculation:

    direction <- function(startLat, endLat, startLon, endLon) {
      # All inputs are in degrees; work in radians internally
      startLat <- degrees.to.radians(startLat)
      startLon <- degrees.to.radians(startLon)
      endLat <- degrees.to.radians(endLat)
      endLon <- degrees.to.radians(endLon)
      dLon <- endLon - startLon
      dPhi <- log(tan(endLat / 2 + pi / 4) / tan(startLat / 2 + pi / 4))
      # Take the shorter way around when the longitude difference exceeds 180 degrees
      if (abs(dLon) > pi) {
        if (dLon > 0) {
          dLon <- -(2 * pi - dLon)
        } else {
          dLon <- (2 * pi + dLon)
        }
      }
      # Convert to degrees first, then normalise into [0, 360)
      bearing <- (radians.to.degrees(atan2(dLon, dPhi)) + 360) %% 360
      return(bearing)
    }

Anything more you need?
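The bearing calculation in this message relies on two helpers, degrees.to.radians and radians.to.degrees, that the thread never shows. A minimal sketch, assuming they are plain unit conversions (an assumption, not the poster's definitions), run on an ordinary local data.frame where positional indexing does work. The names bearing.deg and local_dat are hypothetical. The result of atan2 is converted to degrees before 360 is added, so the offset and the modulus are both in degrees.

```r
# Assumed definitions: the thread does not show these two helpers.
degrees.to.radians <- function(deg) deg * pi / 180
radians.to.degrees <- function(rad) rad * 180 / pi

# Compact form of the bearing calculation (hypothetical name).
# Inputs are in degrees; the result is a compass bearing in [0, 360).
bearing.deg <- function(startLat, endLat, startLon, endLon) {
  startLat <- degrees.to.radians(startLat)
  endLat   <- degrees.to.radians(endLat)
  dLon     <- degrees.to.radians(endLon - startLon)
  # Take the shorter way around when the difference exceeds 180 degrees
  if (abs(dLon) > pi) {
    dLon <- if (dLon > 0) -(2 * pi - dLon) else 2 * pi + dLon
  }
  dPhi <- log(tan(endLat / 2 + pi / 4) / tan(startLat / 2 + pi / 4))
  (radians.to.degrees(atan2(dLon, dPhi)) + 360) %% 360
}

# First two sample rows as a local data.frame (not a SparkDataFrame)
local_dat <- data.frame(
  latitude  = c(52.35234, 52.35233),
  longitude = c(9.881965, 9.881970)
)

b <- bearing.deg(local_dat[1, "latitude"],  local_dat[2, "latitude"],
                 local_dat[1, "longitude"], local_dat[2, "longitude"])
```

Two quick sanity checks for a function like this: heading due north along a meridian should give 0, and heading due east along the equator should give 90.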
Re: SparkR API problem with subsetting distributed data frame
Could you include the code snippets you are running?

On Sat, Sep 10, 2016 at 1:44 AM -0700, "Bene" wrote:
> Hi,
>
> I am having a problem with the SparkR API. I need to subset a distributed data frame so I can extract single values from it, on which I can then do calculations.
>
> Each row of my df has two integer values, and I am creating a vector of new values calculated as a series of sin, cos, tan functions on these two values.
>
> Does anyone have an idea how to do this in SparkR?
>
> So far I have tried subsetting with [], [[]] and subset(), but mostly I get the error
>
>     object of type 'S4' is not subsettable
>
> Is there any way to do such a thing in SparkR? Any help would be greatly appreciated! Also let me know if you need more information, code etc.
>
> Thanks!
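An alternative to pulling single cells out of the SparkDataFrame is to express the whole calculation on Column objects, so that Spark computes the new values for every row. The sketch below is one possible approach, not something proposed in the thread: it pairs each row with the next one using a window ordered by a hypothetical ts column, then builds the bearing from SparkR column functions. The stand-in data and all variable names are assumptions. A window without a partition column makes Spark move all rows into one partition for that step.

```r
library(SparkR)
sparkR.session()  # Spark 2.x entry point; assumes a local Spark installation

# Hypothetical stand-in for the data discussed in this thread
local_dat <- data.frame(
  ts        = c("2015-02-09 07:18:33", "2015-02-09 07:18:34", "2015-02-09 07:18:35"),
  latitude  = c(52.35234, 52.35233, 52.35232),
  longitude = c(9.881965, 9.881970, 9.881975),
  stringsAsFactors = FALSE
)
df <- createDataFrame(local_dat)

# Attach the next row's coordinates to each row
ws  <- windowOrderBy("ts")
df2 <- withColumn(df,  "endLat", over(lead("latitude", 1), ws))
df2 <- withColumn(df2, "endLon", over(lead("longitude", 1), ws))

# Column expressions; the Column is kept on the left of each operator
rad  <- pi / 180
lat1 <- df2$latitude * rad
lat2 <- df2$endLat * rad
dLon <- (df2$endLon - df2$longitude) * rad

# Take the shorter way around when the difference exceeds 180 degrees
dLon <- ifelse(abs(dLon) > pi,
               ifelse(dLon > 0, dLon - 2 * pi, dLon + 2 * pi),
               dLon)

dPhi    <- log(tan(lat2 / 2 + pi / 4) / tan(lat1 / 2 + pi / 4))
bearing <- (atan2(dLon, dPhi) * (180 / pi) + 360) %% 360

result   <- withColumn(df2, "bearing", bearing)
bearings <- collect(arrange(select(result, "ts", "bearing"), "ts"))
```

The last row has no successor, so its bearing comes back as NA.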