Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/column_datetime_functions.html ============================================================================== --- dev/spark/v2.3.0-rc1-docs/_site/api/R/column_datetime_functions.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/column_datetime_functions.html Sat Jan 13 10:29:47 2018 @@ -0,0 +1,402 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Date time functions for Column operations</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" href="R.css" /> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for column_datetime_functions {SparkR}"><tr><td>column_datetime_functions {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + +<h2>Date time functions for Column operations</h2> + +<h3>Description</h3> + +<p>Date time functions defined for <code>Column</code>. +</p> + + +<h3>Usage</h3> + +<pre> +current_date(x = "missing") + +current_timestamp(x = "missing") + +date_trunc(format, x) + +dayofmonth(x) + +dayofweek(x) + +dayofyear(x) + +from_unixtime(x, ...) + +hour(x) + +last_day(x) + +minute(x) + +month(x) + +quarter(x) + +second(x) + +to_date(x, format) + +to_timestamp(x, format) + +unix_timestamp(x, format) + +weekofyear(x) + +window(x, ...) 
+ +year(x) + +## S4 method for signature 'Column' +dayofmonth(x) + +## S4 method for signature 'Column' +dayofweek(x) + +## S4 method for signature 'Column' +dayofyear(x) + +## S4 method for signature 'Column' +hour(x) + +## S4 method for signature 'Column' +last_day(x) + +## S4 method for signature 'Column' +minute(x) + +## S4 method for signature 'Column' +month(x) + +## S4 method for signature 'Column' +quarter(x) + +## S4 method for signature 'Column' +second(x) + +## S4 method for signature 'Column,missing' +to_date(x, format) + +## S4 method for signature 'Column,character' +to_date(x, format) + +## S4 method for signature 'Column,missing' +to_timestamp(x, format) + +## S4 method for signature 'Column,character' +to_timestamp(x, format) + +## S4 method for signature 'Column' +weekofyear(x) + +## S4 method for signature 'Column' +year(x) + +## S4 method for signature 'Column' +from_unixtime(x, format = "yyyy-MM-dd HH:mm:ss") + +## S4 method for signature 'Column' +window(x, windowDuration, slideDuration = NULL, + startTime = NULL) + +## S4 method for signature 'missing,missing' +unix_timestamp(x, format) + +## S4 method for signature 'Column,missing' +unix_timestamp(x, format) + +## S4 method for signature 'Column,character' +unix_timestamp(x, format = "yyyy-MM-dd HH:mm:ss") + +## S4 method for signature 'Column' +trunc(x, format) + +## S4 method for signature 'character,Column' +date_trunc(format, x) + +## S4 method for signature 'missing' +current_date() + +## S4 method for signature 'missing' +current_timestamp() +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>x</code></td> +<td> +<p>Column to compute on. In <code>window</code>, it must be a time Column of +<code>TimestampType</code>. This is not used with <code>current_date</code> and +<code>current_timestamp</code></p> +</td></tr> +<tr valign="top"><td><code>format</code></td> +<td> +<p>The format for the given dates or timestamps in Column <code>x</code>. 
See the +format used in the following methods: +</p> + +<ul> +<li> <p><code>to_date</code> and <code>to_timestamp</code>: it is the string to use to parse +Column <code>x</code> to DateType or TimestampType. +</p> +</li> +<li> <p><code>trunc</code>: it is the string to use to specify the truncation method. +For example, "year", "yyyy", "yy" to truncate by year, or "month", "mon", +"mm" to truncate by month. +</p> +</li> +<li> <p><code>date_trunc</code>: it is similar to the format for <code>trunc</code> but additionally +supports "day", "dd", "second", "minute", "hour", "week" and "quarter". +</p> +</li></ul> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional argument(s).</p> +</td></tr> +<tr valign="top"><td><code>windowDuration</code></td> +<td> +<p>a string specifying the width of the window, e.g. '1 second', +'1 day 12 hours', '2 minutes'. Valid interval strings are 'week', +'day', 'hour', 'minute', 'second', 'millisecond', 'microsecond'. Note that +the duration is a fixed length of time, and does not vary over time +according to a calendar. For example, '1 day' always means 86,400,000 +milliseconds, not a calendar day.</p> +</td></tr> +<tr valign="top"><td><code>slideDuration</code></td> +<td> +<p>a string specifying the sliding interval of the window. Same format as +<code>windowDuration</code>. A new window will be generated every +<code>slideDuration</code>. Must be less than or equal to +the <code>windowDuration</code>. This duration is likewise absolute, and does not +vary according to a calendar.</p> +</td></tr> +<tr valign="top"><td><code>startTime</code></td> +<td> +<p>the offset with respect to 1970-01-01 00:00:00 UTC with which to start +window intervals. For example, in order to have hourly tumbling windows +that start 15 minutes past the hour, e.g. 12:15-13:15, 13:15-14:15... 
provide +<code>startTime</code> as <code>"15 minutes"</code>.</p> +</td></tr> +</table> + + +<h3>Details</h3> + +<p><code>dayofmonth</code>: Extracts the day of the month as an integer from a +given date/timestamp/string. +</p> +<p><code>dayofweek</code>: Extracts the day of the week as an integer from a +given date/timestamp/string. +</p> +<p><code>dayofyear</code>: Extracts the day of the year as an integer from a +given date/timestamp/string. +</p> +<p><code>hour</code>: Extracts the hour as an integer from a given date/timestamp/string. +</p> +<p><code>last_day</code>: Given a date column, returns the last day of the month which the +given date belongs to. For example, input "2015-07-27" returns "2015-07-31" since +July 31 is the last day of the month in July 2015. +</p> +<p><code>minute</code>: Extracts the minute as an integer from a given date/timestamp/string. +</p> +<p><code>month</code>: Extracts the month as an integer from a given date/timestamp/string. +</p> +<p><code>quarter</code>: Extracts the quarter as an integer from a given date/timestamp/string. +</p> +<p><code>second</code>: Extracts the second as an integer from a given date/timestamp/string. +</p> +<p><code>to_date</code>: Converts the column into a DateType. You may optionally specify +a format according to the rules in: +<a href="http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html">http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html</a>. +If the string cannot be parsed according to the specified format (or default), +the value of the column will be null. +By default, it follows casting rules to a DateType if the format is omitted +(equivalent to <code>cast(df$x, "date")</code>). +</p> +<p><code>to_timestamp</code>: Converts the column into a TimestampType. 
You may optionally specify +a format according to the rules in: +<a href="http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html">http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html</a>. +If the string cannot be parsed according to the specified format (or default), +the value of the column will be null. +By default, it follows casting rules to a TimestampType if the format is omitted +(equivalent to <code>cast(df$x, "timestamp")</code>). +</p> +<p><code>weekofyear</code>: Extracts the week number as an integer from a given date/timestamp/string. +</p> +<p><code>year</code>: Extracts the year as an integer from a given date/timestamp/string. +</p> +<p><code>from_unixtime</code>: Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) +to a string representing the timestamp of that moment in the current system time zone in the JVM +in the given format. +See <a href="http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html"> +Customizing Formats</a> for available options. +</p> +<p><code>window</code>: Bucketizes rows into one or more time windows given a timestamp specifying column. +Window starts are inclusive but the window ends are exclusive, e.g. 12:05 will be in the window +[12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows on +the order of months are not supported. It returns an output column of struct called 'window' +by default with the nested columns 'start' and 'end'. +</p> +<p><code>unix_timestamp</code>: Gets current Unix timestamp in seconds. +</p> +<p><code>trunc</code>: Returns date truncated to the unit specified by the format. +</p> +<p><code>date_trunc</code>: Returns timestamp truncated to the unit specified by the format. +</p> +<p><code>current_date</code>: Returns the current date as a date column. +</p> +<p><code>current_timestamp</code>: Returns the current timestamp as a timestamp column. 
+</p> + + +<h3>Note</h3> + +<p>dayofmonth since 1.5.0 +</p> +<p>dayofweek since 2.3.0 +</p> +<p>dayofyear since 1.5.0 +</p> +<p>hour since 1.5.0 +</p> +<p>last_day since 1.5.0 +</p> +<p>minute since 1.5.0 +</p> +<p>month since 1.5.0 +</p> +<p>quarter since 1.5.0 +</p> +<p>second since 1.5.0 +</p> +<p>to_date(Column) since 1.5.0 +</p> +<p>to_date(Column, character) since 2.2.0 +</p> +<p>to_timestamp(Column) since 2.2.0 +</p> +<p>to_timestamp(Column, character) since 2.2.0 +</p> +<p>weekofyear since 1.5.0 +</p> +<p>year since 1.5.0 +</p> +<p>from_unixtime since 1.5.0 +</p> +<p>window since 2.0.0 +</p> +<p>unix_timestamp since 1.5.0 +</p> +<p>unix_timestamp(Column) since 1.5.0 +</p> +<p>unix_timestamp(Column, character) since 1.5.0 +</p> +<p>trunc since 2.3.0 +</p> +<p>date_trunc since 2.3.0 +</p> +<p>current_date since 2.3.0 +</p> +<p>current_timestamp since 2.3.0 +</p> + + +<h3>See Also</h3> + +<p>Other date time functions: <code><a href="column_datetime_diff_functions.html">column_datetime_diff_functions</a></code> +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D dts <- c("2005-01-02 18:47:22", +##D "2005-12-24 16:30:58", +##D "2005-10-28 07:30:05", +##D "2005-12-28 07:01:05", +##D "2006-01-24 00:01:10") +##D y <- c(2.0, 2.2, 3.4, 2.5, 1.8) +##D df <- createDataFrame(data.frame(time = as.POSIXct(dts), y = y)) +## End(Not run) + +## Not run: +##D head(select(df, df$time, year(df$time), quarter(df$time), month(df$time), +##D dayofmonth(df$time), dayofweek(df$time), dayofyear(df$time), weekofyear(df$time))) +##D head(agg(groupBy(df, year(df$time)), count(df$y), avg(df$y))) +##D head(agg(groupBy(df, month(df$time)), avg(df$y))) +## End(Not run) + +## Not run: +##D head(select(df, hour(df$time), minute(df$time), second(df$time))) +##D head(agg(groupBy(df, dayofmonth(df$time)), avg(df$y))) +##D head(agg(groupBy(df, hour(df$time)), avg(df$y))) +##D head(agg(groupBy(df, minute(df$time)), avg(df$y))) +## End(Not run) + +## Not run: +##D head(select(df, 
df$time, last_day(df$time), month(df$time))) +## End(Not run) + +## Not run: +##D tmp <- createDataFrame(data.frame(time_string = dts)) +##D tmp2 <- mutate(tmp, date1 = to_date(tmp$time_string), +##D date2 = to_date(tmp$time_string, "yyyy-MM-dd"), +##D date3 = date_format(tmp$time_string, "MM/dd/yyy"), +##D time1 = to_timestamp(tmp$time_string), +##D time2 = to_timestamp(tmp$time_string, "yyyy-MM-dd")) +##D head(tmp2) +## End(Not run) + +## Not run: +##D tmp <- mutate(df, to_unix = unix_timestamp(df$time), +##D to_unix2 = unix_timestamp(df$time, 'yyyy-MM-dd HH'), +##D from_unix = from_unixtime(unix_timestamp(df$time)), +##D from_unix2 = from_unixtime(unix_timestamp(df$time), 'yyyy-MM-dd HH:mm')) +##D head(tmp) +## End(Not run) + +## Not run: +##D # One minute windows every 15 seconds 10 seconds after the minute, e.g. 09:00:10-09:01:10, +##D # 09:00:25-09:01:25, 09:00:40-09:01:40, ... +##D window(df$time, "1 minute", "15 seconds", "10 seconds") +##D +##D # One minute tumbling windows 15 seconds after the minute, e.g. 09:00:15-09:01:15, +##D # 09:01:15-09:02:15... +##D window(df$time, "1 minute", startTime = "15 seconds") +##D +##D # Thirty-second windows every 10 seconds, e.g. 09:00:00-09:00:30, 09:00:10-09:00:40, ... +##D window(df$time, "30 seconds", "10 seconds") +## End(Not run) + +## Not run: +##D head(select(df, df$time, trunc(df$time, "year"), trunc(df$time, "yy"), +##D trunc(df$time, "month"), trunc(df$time, "mon"))) +## End(Not run) + +## Not run: +##D head(select(df, df$time, date_trunc("hour", df$time), date_trunc("minute", df$time), +##D date_trunc("week", df$time), date_trunc("quarter", df$time))) +## End(Not run) +## Not run: +##D head(select(df, current_date(), current_timestamp())) +## End(Not run) +</code></pre> + + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html>
Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/column_math_functions.html ============================================================================== --- dev/spark/v2.3.0-rc1-docs/_site/api/R/column_math_functions.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/column_math_functions.html Sat Jan 13 10:29:47 2018 @@ -0,0 +1,407 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Math functions for Column operations</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" href="R.css" /> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for column_math_functions {SparkR}"><tr><td>column_math_functions {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + +<h2>Math functions for Column operations</h2> + +<h3>Description</h3> + +<p>Math functions defined for <code>Column</code>. +</p> + + +<h3>Usage</h3> + +<pre> +bin(x) + +bround(x, ...) 
+ +cbrt(x) + +ceil(x) + +conv(x, fromBase, toBase) + +hex(x) + +hypot(y, x) + +pmod(y, x) + +rint(x) + +shiftLeft(y, x) + +shiftRight(y, x) + +shiftRightUnsigned(y, x) + +signum(x) + +toDegrees(x) + +toRadians(x) + +unhex(x) + +## S4 method for signature 'Column' +abs(x) + +## S4 method for signature 'Column' +acos(x) + +## S4 method for signature 'Column' +asin(x) + +## S4 method for signature 'Column' +atan(x) + +## S4 method for signature 'Column' +bin(x) + +## S4 method for signature 'Column' +cbrt(x) + +## S4 method for signature 'Column' +ceil(x) + +## S4 method for signature 'Column' +ceiling(x) + +## S4 method for signature 'Column' +cos(x) + +## S4 method for signature 'Column' +cosh(x) + +## S4 method for signature 'Column' +exp(x) + +## S4 method for signature 'Column' +expm1(x) + +## S4 method for signature 'Column' +factorial(x) + +## S4 method for signature 'Column' +floor(x) + +## S4 method for signature 'Column' +hex(x) + +## S4 method for signature 'Column' +log(x) + +## S4 method for signature 'Column' +log10(x) + +## S4 method for signature 'Column' +log1p(x) + +## S4 method for signature 'Column' +log2(x) + +## S4 method for signature 'Column' +rint(x) + +## S4 method for signature 'Column' +round(x) + +## S4 method for signature 'Column' +bround(x, scale = 0) + +## S4 method for signature 'Column' +signum(x) + +## S4 method for signature 'Column' +sign(x) + +## S4 method for signature 'Column' +sin(x) + +## S4 method for signature 'Column' +sinh(x) + +## S4 method for signature 'Column' +sqrt(x) + +## S4 method for signature 'Column' +tan(x) + +## S4 method for signature 'Column' +tanh(x) + +## S4 method for signature 'Column' +toDegrees(x) + +## S4 method for signature 'Column' +toRadians(x) + +## S4 method for signature 'Column' +unhex(x) + +## S4 method for signature 'Column' +atan2(y, x) + +## S4 method for signature 'Column' +hypot(y, x) + +## S4 method for signature 'Column' +pmod(y, x) + +## S4 method for signature 'Column,numeric' 
+shiftLeft(y, x) + +## S4 method for signature 'Column,numeric' +shiftRight(y, x) + +## S4 method for signature 'Column,numeric' +shiftRightUnsigned(y, x) + +## S4 method for signature 'Column,numeric,numeric' +conv(x, fromBase, toBase) +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>x</code></td> +<td> +<p>Column to compute on. In <code>shiftLeft</code>, <code>shiftRight</code> and +<code>shiftRightUnsigned</code>, this is the number of bits to shift.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional argument(s).</p> +</td></tr> +<tr valign="top"><td><code>fromBase</code></td> +<td> +<p>base to convert from.</p> +</td></tr> +<tr valign="top"><td><code>toBase</code></td> +<td> +<p>base to convert to.</p> +</td></tr> +<tr valign="top"><td><code>y</code></td> +<td> +<p>Column to compute on.</p> +</td></tr> +<tr valign="top"><td><code>scale</code></td> +<td> +<p>round to <code>scale</code> digits to the right of the decimal point when +<code>scale</code> > 0, the nearest even number when <code>scale</code> = 0, and <code>scale</code> digits +to the left of the decimal point when <code>scale</code> < 0.</p> +</td></tr> +</table> + + +<h3>Details</h3> + +<p><code>abs</code>: Computes the absolute value. +</p> +<p><code>acos</code>: Computes the cosine inverse of the given value; the returned angle is in +the range 0.0 through pi. +</p> +<p><code>asin</code>: Computes the sine inverse of the given value; the returned angle is in +the range -pi/2 through pi/2. +</p> +<p><code>atan</code>: Computes the tangent inverse of the given value; the returned angle is in the range +-pi/2 through pi/2. +</p> +<p><code>bin</code>: Returns the string representation of the binary value +of the given long column. For example, bin("12") returns "1100". +</p> +<p><code>cbrt</code>: Computes the cube-root of the given value. +</p> +<p><code>ceil</code>: Computes the ceiling of the given value. 
+</p> +<p><code>ceiling</code>: Alias for <code>ceil</code>. +</p> +<p><code>cos</code>: Computes the cosine of the given value. Units in radians. +</p> +<p><code>cosh</code>: Computes the hyperbolic cosine of the given value. +</p> +<p><code>exp</code>: Computes the exponential of the given value. +</p> +<p><code>expm1</code>: Computes the exponential of the given value minus one. +</p> +<p><code>factorial</code>: Computes the factorial of the given value. +</p> +<p><code>floor</code>: Computes the floor of the given value. +</p> +<p><code>hex</code>: Computes the hex value of the given column. +</p> +<p><code>log</code>: Computes the natural logarithm of the given value. +</p> +<p><code>log10</code>: Computes the logarithm of the given value in base 10. +</p> +<p><code>log1p</code>: Computes the natural logarithm of the given value plus one. +</p> +<p><code>log2</code>: Computes the logarithm of the given column in base 2. +</p> +<p><code>rint</code>: Returns the double value that is closest in value to the argument and +is equal to a mathematical integer. +</p> +<p><code>round</code>: Returns the value of the column rounded to 0 decimal places +using HALF_UP rounding mode. +</p> +<p><code>bround</code>: Returns the value of the column <code>x</code> rounded to <code>scale</code> decimal places +using HALF_EVEN rounding mode if <code>scale</code> >= 0 or at integer part when <code>scale</code> < 0. +Also known as Gaussian rounding or bankers' rounding that rounds to the nearest even number. +bround(2.5, 0) = 2, bround(3.5, 0) = 4. +</p> +<p><code>signum</code>: Computes the signum of the given value. +</p> +<p><code>sign</code>: Alias for <code>signum</code>. +</p> +<p><code>sin</code>: Computes the sine of the given value. Units in radians. +</p> +<p><code>sinh</code>: Computes the hyperbolic sine of the given value. +</p> +<p><code>sqrt</code>: Computes the square root of the specified float value. 
+</p> +<p><code>tan</code>: Computes the tangent of the given value. Units in radians. +</p> +<p><code>tanh</code>: Computes the hyperbolic tangent of the given value. +</p> +<p><code>toDegrees</code>: Converts an angle measured in radians to an approximately equivalent angle +measured in degrees. +</p> +<p><code>toRadians</code>: Converts an angle measured in degrees to an approximately equivalent angle +measured in radians. +</p> +<p><code>unhex</code>: Inverse of hex. Interprets each pair of characters as a hexadecimal number +and converts to the byte representation of the number. +</p> +<p><code>atan2</code>: Returns the angle theta from the conversion of rectangular coordinates +(x, y) to polar coordinates (r, theta). Units in radians. +</p> +<p><code>hypot</code>: Computes "sqrt(a^2 + b^2)" without intermediate overflow or underflow. +</p> +<p><code>pmod</code>: Returns the positive value of dividend mod divisor. +Column <code>x</code> is the divisor column, and column <code>y</code> is the dividend column. +</p> +<p><code>shiftLeft</code>: Shifts the given value numBits left. If the given value is a long value, +this function will return a long value; otherwise it will return an integer value. +</p> +<p><code>shiftRight</code>: (Signed) shifts the given value numBits right. If the given value is a long +value, it will return a long value; otherwise it will return an integer value. +</p> +<p><code>shiftRightUnsigned</code>: (Unsigned) shifts the given value numBits right. If the given value is +a long value, it will return a long value; otherwise it will return an integer value. +</p> +<p><code>conv</code>: Converts a number in a string column from one base to another. 
+</p> + + +<h3>Note</h3> + +<p>abs since 1.5.0 +</p> +<p>acos since 1.5.0 +</p> +<p>asin since 1.5.0 +</p> +<p>atan since 1.5.0 +</p> +<p>bin since 1.5.0 +</p> +<p>cbrt since 1.4.0 +</p> +<p>ceil since 1.5.0 +</p> +<p>ceiling since 1.5.0 +</p> +<p>cos since 1.5.0 +</p> +<p>cosh since 1.5.0 +</p> +<p>exp since 1.5.0 +</p> +<p>expm1 since 1.5.0 +</p> +<p>factorial since 1.5.0 +</p> +<p>floor since 1.5.0 +</p> +<p>hex since 1.5.0 +</p> +<p>log since 1.5.0 +</p> +<p>log10 since 1.5.0 +</p> +<p>log1p since 1.5.0 +</p> +<p>log2 since 1.5.0 +</p> +<p>rint since 1.5.0 +</p> +<p>round since 1.5.0 +</p> +<p>bround since 2.0.0 +</p> +<p>signum since 1.5.0 +</p> +<p>sign since 1.5.0 +</p> +<p>sin since 1.5.0 +</p> +<p>sinh since 1.5.0 +</p> +<p>sqrt since 1.5.0 +</p> +<p>tan since 1.5.0 +</p> +<p>tanh since 1.5.0 +</p> +<p>toDegrees since 1.4.0 +</p> +<p>toRadians since 1.4.0 +</p> +<p>unhex since 1.5.0 +</p> +<p>atan2 since 1.5.0 +</p> +<p>hypot since 1.4.0 +</p> +<p>pmod since 1.5.0 +</p> +<p>shiftLeft since 1.5.0 +</p> +<p>shiftRight since 1.5.0 +</p> +<p>shiftRightUnsigned since 1.5.0 +</p> +<p>conv since 1.5.0 +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D # Dataframe used throughout this doc +##D df <- createDataFrame(cbind(model = rownames(mtcars), mtcars)) +##D tmp <- mutate(df, v1 = log(df$mpg), v2 = cbrt(df$disp), +##D v3 = bround(df$wt, 1), v4 = bin(df$cyl), +##D v5 = hex(df$wt), v6 = toDegrees(df$gear), +##D v7 = atan2(df$cyl, df$am), v8 = hypot(df$cyl, df$am), +##D v9 = pmod(df$hp, df$cyl), v10 = shiftLeft(df$disp, 1), +##D v11 = conv(df$hp, 10, 16), v12 = sign(df$vs - 0.5), +##D v13 = sqrt(df$disp), v14 = ceil(df$wt)) +##D head(tmp) +## End(Not run) +</code></pre> + + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html> Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/column_misc_functions.html 
============================================================================== --- dev/spark/v2.3.0-rc1-docs/_site/api/R/column_misc_functions.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/column_misc_functions.html Sat Jan 13 10:29:47 2018 @@ -0,0 +1,117 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Miscellaneous functions for Column operations</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" href="R.css" /> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for column_misc_functions {SparkR}"><tr><td>column_misc_functions {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + +<h2>Miscellaneous functions for Column operations</h2> + +<h3>Description</h3> + +<p>Miscellaneous functions defined for <code>Column</code>. +</p> + + +<h3>Usage</h3> + +<pre> +crc32(x) + +hash(x, ...) + +md5(x) + +sha1(x) + +sha2(y, x) + +## S4 method for signature 'Column' +crc32(x) + +## S4 method for signature 'Column' +hash(x, ...) + +## S4 method for signature 'Column' +md5(x) + +## S4 method for signature 'Column' +sha1(x) + +## S4 method for signature 'Column,numeric' +sha2(y, x) +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>x</code></td> +<td> +<p>Column to compute on. 
In <code>sha2</code>, it is one of 224, 256, 384, or 512.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional Columns.</p> +</td></tr> +<tr valign="top"><td><code>y</code></td> +<td> +<p>Column to compute on.</p> +</td></tr> +</table> + + +<h3>Details</h3> + +<p><code>crc32</code>: Calculates the cyclic redundancy check value (CRC32) of a binary column +and returns the value as a bigint. +</p> +<p><code>hash</code>: Calculates the hash code of given columns, and returns the result +as an int column. +</p> +<p><code>md5</code>: Calculates the MD5 digest of a binary column and returns the value +as a 32 character hex string. +</p> +<p><code>sha1</code>: Calculates the SHA-1 digest of a binary column and returns the value +as a 40 character hex string. +</p> +<p><code>sha2</code>: Calculates the SHA-2 family of hash functions of a binary column and +returns the value as a hex string. The second argument <code>x</code> specifies the number +of bits, and is one of 224, 256, 384, or 512. 
+</p> + + +<h3>Note</h3> + +<p>crc32 since 1.5.0 +</p> +<p>hash since 2.0.0 +</p> +<p>md5 since 1.5.0 +</p> +<p>sha1 since 1.5.0 +</p> +<p>sha2 since 1.5.0 +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D # Dataframe used throughout this doc +##D df <- createDataFrame(cbind(model = rownames(mtcars), mtcars)[, 1:2]) +##D tmp <- mutate(df, v1 = crc32(df$model), v2 = hash(df$model), +##D v3 = hash(df$model, df$mpg), v4 = md5(df$model), +##D v5 = sha1(df$model), v6 = sha2(df$model, 256)) +##D head(tmp) +## End(Not run) +</code></pre> + + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html> Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/column_nonaggregate_functions.html ============================================================================== --- dev/spark/v2.3.0-rc1-docs/_site/api/R/column_nonaggregate_functions.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/column_nonaggregate_functions.html Sat Jan 13 10:29:47 2018 @@ -0,0 +1,348 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Non-aggregate functions for Column operations</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" href="R.css" /> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for column_nonaggregate_functions {SparkR}"><tr><td>column_nonaggregate_functions {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + 
+<h2>Non-aggregate functions for Column operations</h2> + +<h3>Description</h3> + +<p>Non-aggregate functions defined for <code>Column</code>. +</p> + + +<h3>Usage</h3> + +<pre> +when(condition, value) + +bitwiseNOT(x) + +create_array(x, ...) + +create_map(x, ...) + +expr(x) + +greatest(x, ...) + +input_file_name(x = "missing") + +isnan(x) + +least(x, ...) + +lit(x) + +monotonically_increasing_id(x = "missing") + +nanvl(y, x) + +negate(x) + +rand(seed) + +randn(seed) + +spark_partition_id(x = "missing") + +struct(x, ...) + +## S4 method for signature 'ANY' +lit(x) + +## S4 method for signature 'Column' +bitwiseNOT(x) + +## S4 method for signature 'Column' +coalesce(x, ...) + +## S4 method for signature 'Column' +isnan(x) + +## S4 method for signature 'Column' +is.nan(x) + +## S4 method for signature 'missing' +monotonically_increasing_id() + +## S4 method for signature 'Column' +negate(x) + +## S4 method for signature 'missing' +spark_partition_id() + +## S4 method for signature 'characterOrColumn' +struct(x, ...) + +## S4 method for signature 'Column' +nanvl(y, x) + +## S4 method for signature 'Column' +greatest(x, ...) + +## S4 method for signature 'Column' +least(x, ...) + +## S4 method for signature 'character' +expr(x) + +## S4 method for signature 'missing' +rand(seed) + +## S4 method for signature 'numeric' +rand(seed) + +## S4 method for signature 'missing' +randn(seed) + +## S4 method for signature 'numeric' +randn(seed) + +## S4 method for signature 'Column' +when(condition, value) + +## S4 method for signature 'Column' +ifelse(test, yes, no) + +## S4 method for signature 'Column' +create_array(x, ...) + +## S4 method for signature 'Column' +create_map(x, ...) + +## S4 method for signature 'missing' +input_file_name() +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>condition</code></td> +<td> +<p>the condition to test on. 
Must be a Column expression.</p> +</td></tr> +<tr valign="top"><td><code>value</code></td> +<td> +<p>result expression.</p> +</td></tr> +<tr valign="top"><td><code>x</code></td> +<td> +<p>Column to compute on. In <code>lit</code>, it is a literal value or a Column. +In <code>expr</code>, it contains an expression character object to be parsed.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional Columns.</p> +</td></tr> +<tr valign="top"><td><code>y</code></td> +<td> +<p>Column to compute on.</p> +</td></tr> +<tr valign="top"><td><code>seed</code></td> +<td> +<p>a random seed. Can be missing.</p> +</td></tr> +<tr valign="top"><td><code>test</code></td> +<td> +<p>a Column expression that describes the condition.</p> +</td></tr> +<tr valign="top"><td><code>yes</code></td> +<td> +<p>return values for <code>TRUE</code> elements of test.</p> +</td></tr> +<tr valign="top"><td><code>no</code></td> +<td> +<p>return values for <code>FALSE</code> elements of test.</p> +</td></tr> +</table> + + +<h3>Details</h3> + +<p><code>lit</code>: A new Column is created to represent the literal value. +If the parameter is a Column, it is returned unchanged. +</p> +<p><code>bitwiseNOT</code>: Computes bitwise NOT. +</p> +<p><code>coalesce</code>: Returns the first column that is not NA, or NA if all inputs are. +</p> +<p><code>isnan</code>: Returns true if the column is NaN. +</p> +<p><code>is.nan</code>: Alias for <a href="column_nonaggregate_functions.html">isnan</a>. +</p> +<p><code>monotonically_increasing_id</code>: Returns a column that generates monotonically increasing +64-bit integers. The generated ID is guaranteed to be monotonically increasing and unique, +but not consecutive. The current implementation puts the partition ID in the upper 31 bits, +and the record number within each partition in the lower 33 bits. The assumption is that the +SparkDataFrame has less than 1 billion partitions, and each partition has less than 8 billion +records. 
As an example, consider a SparkDataFrame with two partitions, each with 3 records. +This expression would return the following IDs: +0, 1, 2, 8589934592 (1L << 33), 8589934593, 8589934594. +This is equivalent to the MONOTONICALLY_INCREASING_ID function in SQL. +The method should be used with no argument. +</p> +<p><code>negate</code>: Unary minus, i.e. negate the expression. +</p> +<p><code>spark_partition_id</code>: Returns the partition ID as a SparkDataFrame column. +Note that this is nondeterministic because it depends on data partitioning and +task scheduling. +This is equivalent to the <code>SPARK_PARTITION_ID</code> function in SQL. +</p> +<p><code>struct</code>: Creates a new struct column that composes multiple input columns. +</p> +<p><code>nanvl</code>: Returns the first column (<code>y</code>) if it is not NaN, or the second column +(<code>x</code>) if the first column is NaN. Both inputs should be floating point columns +(DoubleType or FloatType). +</p> +<p><code>greatest</code>: Returns the greatest value of the list of column names, skipping null values. +This function takes at least 2 parameters. It will return null if all parameters are null. +</p> +<p><code>least</code>: Returns the least value of the list of column names, skipping null values. +This function takes at least 2 parameters. It will return null if all parameters are null. +</p> +<p><code>expr</code>: Parses the expression string into the column that it represents, similar to +<code>SparkDataFrame.selectExpr</code> +</p> +<p><code>rand</code>: Generates a random column with independent and identically distributed (i.i.d.) +samples from U[0.0, 1.0]. +</p> +<p><code>randn</code>: Generates a column with independent and identically distributed (i.i.d.) samples +from the standard normal distribution. +</p> +<p><code>when</code>: Evaluates a list of conditions and returns one of multiple possible result +expressions. For unmatched expressions null is returned. 
+</p> +<p><code>ifelse</code>: Evaluates a list of conditions and returns <code>yes</code> if the conditions are +satisfied. Otherwise <code>no</code> is returned for unmatched conditions. +</p> +<p><code>create_array</code>: Creates a new array column. The input columns must all have the same data +type. +</p> +<p><code>create_map</code>: Creates a new map column. The input columns must be grouped as key-value +pairs, e.g. (key1, value1, key2, value2, ...). +The key columns must all have the same data type, and can't be null. +The value columns must all have the same data type. +</p> +<p><code>input_file_name</code>: Creates a string column with the input file name for a given row. +The method should be used with no argument. +</p> + + +<h3>Note</h3> + +<p>lit since 1.5.0 +</p> +<p>bitwiseNOT since 1.5.0 +</p> +<p>coalesce(Column) since 2.1.1 +</p> +<p>isnan since 2.0.0 +</p> +<p>is.nan since 2.0.0 +</p> +<p>negate since 1.5.0 +</p> +<p>spark_partition_id since 2.0.0 +</p> +<p>struct since 1.6.0 +</p> +<p>nanvl since 1.5.0 +</p> +<p>greatest since 1.5.0 +</p> +<p>least since 1.5.0 +</p> +<p>expr since 1.5.0 +</p> +<p>rand since 1.5.0 +</p> +<p>rand(numeric) since 1.5.0 +</p> +<p>randn since 1.5.0 +</p> +<p>randn(numeric) since 1.5.0 +</p> +<p>when since 1.5.0 +</p> +<p>ifelse since 1.5.0 +</p> +<p>create_array since 2.3.0 +</p> +<p>create_map since 2.3.0 +</p> +<p>input_file_name since 2.3.0 +</p> + + +<h3>See Also</h3> + +<p>coalesce,SparkDataFrame-method +</p> +<p>Other non-aggregate functions: <code><a href="column.html">column</a></code>, +<code><a href="not.html">not</a></code> +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D # Dataframe used throughout this doc +##D df <- createDataFrame(cbind(model = rownames(mtcars), mtcars)) +## End(Not run) + +## Not run: +##D tmp <- mutate(df, v1 = lit(df$mpg), v2 = lit("x"), v3 = lit("2015-01-01"), +##D v4 = negate(df$mpg), v5 = expr('length(model)'), +##D v6 = greatest(df$vs, df$am), v7 = 
least(df$vs, df$am), +##D v8 = column("mpg")) +##D head(tmp) +## End(Not run) + +## Not run: +##D head(select(df, bitwiseNOT(cast(df$vs, "int")))) +## End(Not run) + +## Not run: head(select(df, monotonically_increasing_id())) + +## Not run: head(select(df, spark_partition_id())) + +## Not run: +##D tmp <- mutate(df, v1 = struct(df$mpg, df$cyl), v2 = struct("hp", "wt", "vs"), +##D v3 = create_array(df$mpg, df$cyl, df$hp), +##D v4 = create_map(lit("x"), lit(1.0), lit("y"), lit(-1.0))) +##D head(tmp) +## End(Not run) + +## Not run: +##D tmp <- mutate(df, r1 = rand(), r2 = rand(10), r3 = randn(), r4 = randn(10)) +##D head(tmp) +## End(Not run) + +## Not run: +##D tmp <- mutate(df, mpg_na = otherwise(when(df$mpg > 20, df$mpg), lit(NaN)), +##D mpg2 = ifelse(df$mpg > 20 & df$am > 0, 0, 1), +##D mpg3 = ifelse(df$mpg > 20, df$mpg, 20.0)) +##D head(tmp) +##D tmp <- mutate(tmp, ind_na1 = is.nan(tmp$mpg_na), ind_na2 = isnan(tmp$mpg_na)) +##D head(select(tmp, coalesce(tmp$mpg_na, tmp$mpg))) +##D head(select(tmp, nanvl(tmp$mpg_na, tmp$hp))) +## End(Not run) + +## Not run: +##D tmp <- read.text("README.md") +##D head(select(tmp, input_file_name())) +## End(Not run) +</code></pre> + + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html> Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/column_string_functions.html ============================================================================== --- dev/spark/v2.3.0-rc1-docs/_site/api/R/column_string_functions.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/column_string_functions.html Sat Jan 13 10:29:47 2018 @@ -0,0 +1,541 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: String functions for Column operations</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" 
href="R.css" /> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for column_string_functions {SparkR}"><tr><td>column_string_functions {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + +<h2>String functions for Column operations</h2> + +<h3>Description</h3> + +<p>String functions defined for <code>Column</code>. +</p> + + +<h3>Usage</h3> + +<pre> +ascii(x) + +base64(x) + +concat(x, ...) + +concat_ws(sep, x, ...) + +decode(x, charset) + +encode(x, charset) + +format_number(y, x) + +format_string(format, x, ...) + +initcap(x) + +instr(y, x) + +levenshtein(y, x) + +locate(substr, str, ...) + +lower(x) + +lpad(x, len, pad) + +ltrim(x, trimString) + +regexp_extract(x, pattern, idx) + +regexp_replace(x, pattern, replacement) + +repeat_string(x, n) + +reverse(x) + +rpad(x, len, pad) + +rtrim(x, trimString) + +split_string(x, pattern) + +soundex(x) + +substring_index(x, delim, count) + +translate(x, matchingString, replaceString) + +trim(x, trimString) + +unbase64(x) + +upper(x) + +## S4 method for signature 'Column' +ascii(x) + +## S4 method for signature 'Column' +base64(x) + +## S4 method for signature 'Column,character' +decode(x, charset) + +## S4 method for signature 'Column,character' +encode(x, charset) + +## S4 method for signature 'Column' +initcap(x) + +## S4 method for signature 'Column' +length(x) + +## S4 method for signature 'Column' +lower(x) + +## S4 method for signature 'Column,missing' +ltrim(x, trimString) + +## S4 method for signature 'Column,character' +ltrim(x, trimString) + +## S4 method for signature 'Column' +reverse(x) + +## S4 method for signature 
'Column,missing' +rtrim(x, trimString) + +## S4 method for signature 'Column,character' +rtrim(x, trimString) + +## S4 method for signature 'Column' +soundex(x) + +## S4 method for signature 'Column,missing' +trim(x, trimString) + +## S4 method for signature 'Column,character' +trim(x, trimString) + +## S4 method for signature 'Column' +unbase64(x) + +## S4 method for signature 'Column' +upper(x) + +## S4 method for signature 'Column' +levenshtein(y, x) + +## S4 method for signature 'Column' +concat(x, ...) + +## S4 method for signature 'Column,character' +instr(y, x) + +## S4 method for signature 'Column,numeric' +format_number(y, x) + +## S4 method for signature 'character,Column' +concat_ws(sep, x, ...) + +## S4 method for signature 'character,Column' +format_string(format, x, ...) + +## S4 method for signature 'character,Column' +locate(substr, str, pos = 1) + +## S4 method for signature 'Column,numeric,character' +lpad(x, len, pad) + +## S4 method for signature 'Column,character,numeric' +regexp_extract(x, pattern, idx) + +## S4 method for signature 'Column,character,character' +regexp_replace(x, pattern, replacement) + +## S4 method for signature 'Column,numeric,character' +rpad(x, len, pad) + +## S4 method for signature 'Column,character,numeric' +substring_index(x, delim, count) + +## S4 method for signature 'Column,character,character' +translate(x, matchingString, + replaceString) + +## S4 method for signature 'Column,character' +split_string(x, pattern) + +## S4 method for signature 'Column,numeric' +repeat_string(x, n) +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>x</code></td> +<td> +<p>Column to compute on except in the following methods: +</p> + +<ul> +<li> <p><code>instr</code>: <code>character</code>, the substring to check. See 'Details'. +</p> +</li> +<li> <p><code>format_number</code>: <code>numeric</code>, the number of decimal place to +format to. See 'Details'. 
+</p> +</li></ul> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional Columns.</p> +</td></tr> +<tr valign="top"><td><code>sep</code></td> +<td> +<p>separator to use.</p> +</td></tr> +<tr valign="top"><td><code>charset</code></td> +<td> +<p>character set to use (one of "US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", +"UTF-16LE", "UTF-16").</p> +</td></tr> +<tr valign="top"><td><code>y</code></td> +<td> +<p>Column to compute on.</p> +</td></tr> +<tr valign="top"><td><code>format</code></td> +<td> +<p>a character object of format strings.</p> +</td></tr> +<tr valign="top"><td><code>substr</code></td> +<td> +<p>a character string to be matched.</p> +</td></tr> +<tr valign="top"><td><code>str</code></td> +<td> +<p>a Column where matches are sought for each entry.</p> +</td></tr> +<tr valign="top"><td><code>len</code></td> +<td> +<p>maximum length of each output result.</p> +</td></tr> +<tr valign="top"><td><code>pad</code></td> +<td> +<p>a character string to be padded with.</p> +</td></tr> +<tr valign="top"><td><code>trimString</code></td> +<td> +<p>a character string to trim with</p> +</td></tr> +<tr valign="top"><td><code>pattern</code></td> +<td> +<p>a regular expression.</p> +</td></tr> +<tr valign="top"><td><code>idx</code></td> +<td> +<p>a group index.</p> +</td></tr> +<tr valign="top"><td><code>replacement</code></td> +<td> +<p>a character string that a matched <code>pattern</code> is replaced with.</p> +</td></tr> +<tr valign="top"><td><code>n</code></td> +<td> +<p>number of repetitions.</p> +</td></tr> +<tr valign="top"><td><code>delim</code></td> +<td> +<p>a delimiter string.</p> +</td></tr> +<tr valign="top"><td><code>count</code></td> +<td> +<p>number of occurrences of <code>delim</code> before the substring is returned. 
+A positive number means counting from the left, while negative means +counting from the right.</p> +</td></tr> +<tr valign="top"><td><code>matchingString</code></td> +<td> +<p>a source string where each character will be translated.</p> +</td></tr> +<tr valign="top"><td><code>replaceString</code></td> +<td> +<p>a target string where each <code>matchingString</code> character will +be replaced by the character in <code>replaceString</code> +at the same location, if any.</p> +</td></tr> +<tr valign="top"><td><code>pos</code></td> +<td> +<p>start position of search.</p> +</td></tr> +</table> + + +<h3>Details</h3> + +<p><code>ascii</code>: Computes the numeric value of the first character of the string column, +and returns the result as an int column. +</p> +<p><code>base64</code>: Computes the BASE64 encoding of a binary column and returns it as +a string column. This is the reverse of unbase64. +</p> +<p><code>decode</code>: Computes the first argument into a string from a binary using the provided +character set. +</p> +<p><code>encode</code>: Computes the first argument into a binary from a string using the provided +character set. +</p> +<p><code>initcap</code>: Returns a new string column by converting the first letter of +each word to uppercase. Words are delimited by whitespace. For example, "hello world" +will become "Hello World". +</p> +<p><code>length</code>: Computes the length of a given string or binary column. +</p> +<p><code>lower</code>: Converts a string column to lower case. +</p> +<p><code>ltrim</code>: Trims the spaces from left end for the specified string value. Optionally a +<code>trimString</code> can be specified. +</p> +<p><code>reverse</code>: Reverses the string column and returns it as a new string column. +</p> +<p><code>rtrim</code>: Trims the spaces from right end for the specified string value. Optionally a +<code>trimString</code> can be specified. 
+</p> +<p><code>soundex</code>: Returns the soundex code for the specified expression. +</p> +<p><code>trim</code>: Trims the spaces from both ends for the specified string column. Optionally a +<code>trimString</code> can be specified. +</p> +<p><code>unbase64</code>: Decodes a BASE64 encoded string column and returns it as a binary column. +This is the reverse of base64. +</p> +<p><code>upper</code>: Converts a string column to upper case. +</p> +<p><code>levenshtein</code>: Computes the Levenshtein distance of the two given string columns. +</p> +<p><code>concat</code>: Concatenates multiple input columns together into a single column. +If all inputs are binary, concat returns the output as binary. Otherwise, it returns a string. +</p> +<p><code>instr</code>: Locates the position of the first occurrence of a substring (<code>x</code>) +in the given string column (<code>y</code>). Returns null if either of the arguments is null. +Note: The position is 1-based, not zero-based. Returns 0 if the substring +could not be found in the string column. +</p> +<p><code>format_number</code>: Formats numeric column <code>y</code> to a format like '#,###,###.##', +rounded to <code>x</code> decimal places with HALF_EVEN round mode, and returns the result +as a string column. +If <code>x</code> is 0, the result has no decimal point or fractional part. +If <code>x</code> is negative, the result will be null. +</p> +<p><code>concat_ws</code>: Concatenates multiple input string columns together into a single +string column, using the given separator. +</p> +<p><code>format_string</code>: Formats the arguments in printf-style and returns the result +as a string column. +</p> +<p><code>locate</code>: Locates the position of the first occurrence of <code>substr</code> in <code>str</code>. +Note: The position is 1-based, not zero-based. Returns 0 if <code>substr</code> +could not be found in <code>str</code>. +</p> +<p><code>lpad</code>: Left-pads the string column with <code>pad</code> to a length of <code>len</code>. 
+</p> +<p><code>regexp_extract</code>: Extracts a specific <code>idx</code> group identified by a Java regex, +from the specified string column. If the regex did not match, or the specified group did +not match, an empty string is returned. +</p> +<p><code>regexp_replace</code>: Replaces all substrings of the specified string value that +match <code>pattern</code> with <code>replacement</code>. +</p> +<p><code>rpad</code>: Right-pads the string column with <code>pad</code> to a length of <code>len</code>. +</p> +<p><code>substring_index</code>: Returns the substring from string (<code>x</code>) before <code>count</code> +occurrences of the delimiter (<code>delim</code>). If <code>count</code> is positive, everything to the left of +the final delimiter (counting from the left) is returned. If <code>count</code> is negative, everything to the +right of the final delimiter (counting from the right) is returned. <code>substring_index</code> +performs a case-sensitive match when searching for the delimiter. +</p> +<p><code>translate</code>: Translates each character of the string column <code>x</code> that appears in +<code>matchingString</code> into the character at the same position in <code>replaceString</code>. +Characters of <code>x</code> that do not appear in <code>matchingString</code> are left unchanged. +</p> +<p><code>split_string</code>: Splits string on regular expression. +Equivalent to <code>split</code> SQL function. +</p> +<p><code>repeat_string</code>: Repeats string n times. +Equivalent to <code>repeat</code> SQL function. 
+</p> + + +<h3>Note</h3> + +<p>ascii since 1.5.0 +</p> +<p>base64 since 1.5.0 +</p> +<p>decode since 1.6.0 +</p> +<p>encode since 1.6.0 +</p> +<p>initcap since 1.5.0 +</p> +<p>length since 1.5.0 +</p> +<p>lower since 1.4.0 +</p> +<p>ltrim since 1.5.0 +</p> +<p>ltrim(Column, character) since 2.3.0 +</p> +<p>reverse since 1.5.0 +</p> +<p>rtrim since 1.5.0 +</p> +<p>rtrim(Column, character) since 2.3.0 +</p> +<p>soundex since 1.5.0 +</p> +<p>trim since 1.5.0 +</p> +<p>trim(Column, character) since 2.3.0 +</p> +<p>unbase64 since 1.5.0 +</p> +<p>upper since 1.4.0 +</p> +<p>levenshtein since 1.5.0 +</p> +<p>concat since 1.5.0 +</p> +<p>instr since 1.5.0 +</p> +<p>format_number since 1.5.0 +</p> +<p>concat_ws since 1.5.0 +</p> +<p>format_string since 1.5.0 +</p> +<p>locate since 1.5.0 +</p> +<p>lpad since 1.5.0 +</p> +<p>regexp_extract since 1.5.0 +</p> +<p>regexp_replace since 1.5.0 +</p> +<p>rpad since 1.5.0 +</p> +<p>substring_index since 1.5.0 +</p> +<p>translate since 1.5.0 +</p> +<p>split_string since 2.3.0 +</p> +<p>repeat_string since 2.3.0 +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D # Dataframe used throughout this doc +##D df <- createDataFrame(as.data.frame(Titanic, stringsAsFactors = FALSE)) +## End(Not run) + +## Not run: +##D head(select(df, ascii(df$Class), ascii(df$Sex))) +## End(Not run) + +## Not run: +##D tmp <- mutate(df, s1 = encode(df$Class, "UTF-8")) +##D str(tmp) +##D tmp2 <- mutate(tmp, s2 = base64(tmp$s1), s3 = decode(tmp$s1, "UTF-8"), +##D s4 = soundex(tmp$Sex)) +##D head(tmp2) +##D head(select(tmp2, unbase64(tmp2$s2))) +## End(Not run) + +## Not run: +##D tmp <- mutate(df, sex_lower = lower(df$Sex), age_upper = upper(df$Age), +##D sex_age = concat_ws(" ", lower(df$Sex), lower(df$Age))) +##D head(tmp) +##D tmp2 <- mutate(tmp, s1 = initcap(tmp$sex_lower), s2 = initcap(tmp$sex_age), +##D s3 = reverse(df$Sex)) +##D head(tmp2) +## End(Not run) + +## Not run: +##D tmp <- mutate(df, SexLpad = lpad(df$Sex, 6, " "), SexRpad = 
rpad(df$Sex, 7, " ")) +##D head(select(tmp, length(tmp$Sex), length(tmp$SexLpad), length(tmp$SexRpad))) +##D tmp2 <- mutate(tmp, SexLtrim = ltrim(tmp$SexLpad), SexRtrim = rtrim(tmp$SexRpad), +##D SexTrim = trim(tmp$SexLpad)) +##D head(select(tmp2, length(tmp2$Sex), length(tmp2$SexLtrim), +##D length(tmp2$SexRtrim), length(tmp2$SexTrim))) +##D +##D tmp <- mutate(df, SexLpad = lpad(df$Sex, 6, "xx"), SexRpad = rpad(df$Sex, 7, "xx")) +##D head(tmp) +## End(Not run) + +## Not run: +##D tmp <- mutate(df, d1 = levenshtein(df$Class, df$Sex), +##D d2 = levenshtein(df$Age, df$Sex), +##D d3 = levenshtein(df$Age, df$Age)) +##D head(tmp) +## End(Not run) + +## Not run: +##D # concatenate strings +##D tmp <- mutate(df, s1 = concat(df$Class, df$Sex), +##D s2 = concat(df$Class, df$Sex, df$Age), +##D s3 = concat(df$Class, df$Sex, df$Age, df$Class), +##D s4 = concat_ws("_", df$Class, df$Sex), +##D s5 = concat_ws("+", df$Class, df$Sex, df$Age, df$Survived)) +##D head(tmp) +## End(Not run) + +## Not run: +##D tmp <- mutate(df, s1 = instr(df$Sex, "m"), s2 = instr(df$Sex, "M"), +##D s3 = locate("m", df$Sex), s4 = locate("m", df$Sex, pos = 4)) +##D head(tmp) +## End(Not run) + +## Not run: +##D tmp <- mutate(df, v1 = df$Freq/3) +##D head(select(tmp, format_number(tmp$v1, 0), format_number(tmp$v1, 2), +##D format_string("%4.2f %s", tmp$v1, tmp$Sex)), 10) +## End(Not run) + +## Not run: +##D tmp <- mutate(df, s1 = regexp_extract(df$Class, "(\\d+)\\w+", 1), +##D s2 = regexp_extract(df$Sex, "^(\\w)\\w+", 1), +##D s3 = regexp_replace(df$Class, "\\D+", ""), +##D s4 = substring_index(df$Sex, "a", 1), +##D s5 = substring_index(df$Sex, "a", -1), +##D s6 = translate(df$Sex, "ale", ""), +##D s7 = translate(df$Sex, "a", "-")) +##D head(tmp) +## End(Not run) + +## Not run: +##D head(select(df, split_string(df$Sex, "a"))) +##D head(select(df, split_string(df$Class, "\\d"))) +##D # This is equivalent to the following SQL expression +##D head(selectExpr(df, "split(Class, '\\\\d')")) +## End(Not run) + 
+## Not run: +##D head(select(df, repeat_string(df$Class, 3))) +##D # This is equivalent to the following SQL expression +##D head(selectExpr(df, "repeat(Class, 3)")) +## End(Not run) +</code></pre> + + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html> Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/column_window_functions.html ============================================================================== --- dev/spark/v2.3.0-rc1-docs/_site/api/R/column_window_functions.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/column_window_functions.html Sat Jan 13 10:29:47 2018 @@ -0,0 +1,187 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Window functions for Column operations</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" href="R.css" /> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for column_window_functions {SparkR}"><tr><td>column_window_functions {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + +<h2>Window functions for Column operations</h2> + +<h3>Description</h3> + +<p>Window functions defined for <code>Column</code>. +</p> + + +<h3>Usage</h3> + +<pre> +cume_dist(x = "missing") + +dense_rank(x = "missing") + +lag(x, ...) + +lead(x, offset, defaultValue = NULL) + +ntile(x) + +percent_rank(x = "missing") + +rank(x, ...) 
+ +row_number(x = "missing") + +## S4 method for signature 'missing' +cume_dist() + +## S4 method for signature 'missing' +dense_rank() + +## S4 method for signature 'characterOrColumn' +lag(x, offset = 1, defaultValue = NULL) + +## S4 method for signature 'characterOrColumn,numeric' +lead(x, offset = 1, + defaultValue = NULL) + +## S4 method for signature 'numeric' +ntile(x) + +## S4 method for signature 'missing' +percent_rank() + +## S4 method for signature 'missing' +rank() + +## S4 method for signature 'ANY' +rank(x, ...) + +## S4 method for signature 'missing' +row_number() +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>x</code></td> +<td> +<p>In <code>lag</code> and <code>lead</code>, it is the column as a character string or a Column +to compute on. In <code>ntile</code>, it is the number of ntile groups.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional argument(s).</p> +</td></tr> +<tr valign="top"><td><code>offset</code></td> +<td> +<p>In <code>lag</code>, the number of rows back from the current row from which to obtain +a value. In <code>lead</code>, the number of rows after the current row from which to +obtain a value. If not specified, the default is 1.</p> +</td></tr> +<tr valign="top"><td><code>defaultValue</code></td> +<td> +<p>(optional) default to use when the offset row does not exist.</p> +</td></tr> +</table> + + +<h3>Details</h3> + +<p><code>cume_dist</code>: Returns the cumulative distribution of values within a window partition, +i.e. the fraction of rows that are below the current row: +(number of values before and including x) / (total number of rows in the partition). +This is equivalent to the <code>CUME_DIST</code> function in SQL. +The method should be used with no argument. +</p> +<p><code>dense_rank</code>: Returns the rank of rows within a window partition, without any gaps. 
+The difference between rank and dense_rank is that dense_rank leaves no gaps in the ranking +sequence when there are ties. That is, if you were ranking a competition using dense_rank +and had three people tie for second place, you would say that all three were in second +place and that the next person came in third. rank, by contrast, leaves gaps after ties, so +the person who came in third place (after the ties) would register as coming in fifth. +This is equivalent to the <code>DENSE_RANK</code> function in SQL. +The method should be used with no argument. +</p> +<p><code>lag</code>: Returns the value that is <code>offset</code> rows before the current row, and +<code>defaultValue</code> if there are fewer than <code>offset</code> rows before the current row. For example, +an <code>offset</code> of one will return the previous row at any given point in the window partition. +This is equivalent to the <code>LAG</code> function in SQL. +</p> +<p><code>lead</code>: Returns the value that is <code>offset</code> rows after the current row, and +<code>defaultValue</code> if there are fewer than <code>offset</code> rows after the current row. +For example, an <code>offset</code> of one will return the next row at any given point +in the window partition. +This is equivalent to the <code>LEAD</code> function in SQL. +</p> +<p><code>ntile</code>: Returns the ntile group id (from 1 to n inclusive) in an ordered window +partition. For example, if n is 4, the first quarter of the rows will get value 1, the second +quarter will get 2, the third quarter will get 3, and the last quarter will get 4. +This is equivalent to the <code>NTILE</code> function in SQL. +</p> +<p><code>percent_rank</code>: Returns the relative rank (i.e. percentile) of rows within a window +partition. +This is computed by: (rank of row in its partition - 1) / (number of rows in the partition - 1). +This is equivalent to the <code>PERCENT_RANK</code> function in SQL. +The method should be used with no argument. 
+</p> +<p><code>rank</code>: Returns the rank of rows within a window partition. +The difference between rank and dense_rank is that dense_rank leaves no gaps in the ranking +sequence when there are ties. That is, if you were ranking a competition using dense_rank +and had three people tie for second place, you would say that all three were in second +place and that the next person came in third. rank, by contrast, leaves gaps after ties, so +the person who came in third place (after the ties) would register as coming in fifth. +This is equivalent to the <code>RANK</code> function in SQL. +The method should be used with no argument. +</p> +<p><code>row_number</code>: Returns a sequential number starting at 1 within a window partition. +This is equivalent to the <code>ROW_NUMBER</code> function in SQL. +The method should be used with no argument. +</p> + + +<h3>Note</h3> + +<p>cume_dist since 1.6.0 +</p> +<p>dense_rank since 1.6.0 +</p> +<p>lag since 1.6.0 +</p> +<p>lead since 1.6.0 +</p> +<p>ntile since 1.6.0 +</p> +<p>percent_rank since 1.6.0 +</p> +<p>rank since 1.6.0 +</p> +<p>row_number since 1.6.0 +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D # Dataframe used throughout this doc +##D df <- createDataFrame(cbind(model = rownames(mtcars), mtcars)) +##D ws <- orderBy(windowPartitionBy("am"), "hp") +##D tmp <- mutate(df, dist = over(cume_dist(), ws), dense_rank = over(dense_rank(), ws), +##D lag = over(lag(df$mpg), ws), lead = over(lead(df$mpg, 1), ws), +##D percent_rank = over(percent_rank(), ws), +##D rank = over(rank(), ws), row_number = over(row_number(), ws)) +##D # Get ntile group id (1-4) for hp +##D tmp <- mutate(tmp, ntile = over(ntile(4), ws)) +##D head(tmp) +## End(Not run) +</code></pre> + + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html> Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/columnfunctions.html 
============================================================================== --- dev/spark/v2.3.0-rc1-docs/_site/api/R/columnfunctions.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/columnfunctions.html Sat Jan 13 10:29:47 2018 @@ -0,0 +1,55 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: A set of operations working with SparkDataFrame columns</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" href="R.css" /> +</head><body> + +<table width="100%" summary="page for asc {SparkR}"><tr><td>asc {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + +<h2>A set of operations working with SparkDataFrame columns</h2> + +<h3>Description</h3> + +<p>A set of operations working with SparkDataFrame columns +</p> + + +<h3>Usage</h3> + +<pre> +asc(x) + +contains(x, ...) + +desc(x) + +getField(x, ...) + +getItem(x, ...) + +isNaN(x) + +isNull(x) + +isNotNull(x) + +like(x, ...) + +rlike(x, ...) 
+</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>x</code></td> +<td> +<p>a Column object.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional argument(s).</p> +</td></tr> +</table> + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html> Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/columns.html ============================================================================== --- dev/spark/v2.3.0-rc1-docs/_site/api/R/columns.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/columns.html Sat Jan 13 10:29:47 2018 @@ -0,0 +1,144 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Column Names of SparkDataFrame</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" href="R.css" /> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for colnames {SparkR}"><tr><td>colnames {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + +<h2>Column Names of SparkDataFrame</h2> + +<h3>Description</h3> + +<p>Return a vector of column names. 
+</p> + + +<h3>Usage</h3> + +<pre> +colnames(x, do.NULL = TRUE, prefix = "col") + +colnames(x) <- value + +columns(x) + +## S4 method for signature 'SparkDataFrame' +columns(x) + +## S4 method for signature 'SparkDataFrame' +names(x) + +## S4 replacement method for signature 'SparkDataFrame' +names(x) <- value + +## S4 method for signature 'SparkDataFrame' +colnames(x) + +## S4 replacement method for signature 'SparkDataFrame' +colnames(x) <- value +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>x</code></td> +<td> +<p>a SparkDataFrame.</p> +</td></tr> +<tr valign="top"><td><code>do.NULL</code></td> +<td> +<p>currently not used.</p> +</td></tr> +<tr valign="top"><td><code>prefix</code></td> +<td> +<p>currently not used.</p> +</td></tr> +<tr valign="top"><td><code>value</code></td> +<td> +<p>a character vector. Must have the same length as the number +of columns to be renamed.</p> +</td></tr> +</table> + + +<h3>Note</h3> + +<p>columns since 1.4.0 +</p> +<p>names since 1.5.0 +</p> +<p>names<- since 1.5.0 +</p> +<p>colnames since 1.6.0 +</p> +<p>colnames<- since 1.6.0 +</p> + + +<h3>See Also</h3> + +<p>Other SparkDataFrame functions: <code><a href="SparkDataFrame.html">SparkDataFrame-class</a></code>, +<code><a href="summarize.html">agg</a></code>, <code><a href="alias.html">alias</a></code>, +<code><a href="arrange.html">arrange</a></code>, <code><a href="as.data.frame.html">as.data.frame</a></code>, +<code><a href="attach.html">attach,SparkDataFrame-method</a></code>, +<code><a href="broadcast.html">broadcast</a></code>, <code><a href="cache.html">cache</a></code>, +<code><a href="checkpoint.html">checkpoint</a></code>, <code><a href="coalesce.html">coalesce</a></code>, +<code><a href="collect.html">collect</a></code>, <code><a href="coltypes.html">coltypes</a></code>, +<code><a href="createOrReplaceTempView.html">createOrReplaceTempView</a></code>, +<code><a href="crossJoin.html">crossJoin</a></code>, <code><a 
href="cube.html">cube</a></code>, +<code><a href="dapplyCollect.html">dapplyCollect</a></code>, <code><a href="dapply.html">dapply</a></code>, +<code><a href="describe.html">describe</a></code>, <code><a href="dim.html">dim</a></code>, +<code><a href="distinct.html">distinct</a></code>, <code><a href="dropDuplicates.html">dropDuplicates</a></code>, +<code><a href="nafunctions.html">dropna</a></code>, <code><a href="drop.html">drop</a></code>, +<code><a href="dtypes.html">dtypes</a></code>, <code><a href="except.html">except</a></code>, +<code><a href="explain.html">explain</a></code>, <code><a href="filter.html">filter</a></code>, +<code><a href="first.html">first</a></code>, <code><a href="gapplyCollect.html">gapplyCollect</a></code>, +<code><a href="gapply.html">gapply</a></code>, <code><a href="getNumPartitions.html">getNumPartitions</a></code>, +<code><a href="groupBy.html">group_by</a></code>, <code><a href="head.html">head</a></code>, +<code><a href="hint.html">hint</a></code>, <code><a href="histogram.html">histogram</a></code>, +<code><a href="insertInto.html">insertInto</a></code>, <code><a href="intersect.html">intersect</a></code>, +<code><a href="isLocal.html">isLocal</a></code>, <code><a href="isStreaming.html">isStreaming</a></code>, +<code><a href="join.html">join</a></code>, <code><a href="limit.html">limit</a></code>, +<code><a href="localCheckpoint.html">localCheckpoint</a></code>, <code><a href="merge.html">merge</a></code>, +<code><a href="mutate.html">mutate</a></code>, <code><a href="ncol.html">ncol</a></code>, +<code><a href="nrow.html">nrow</a></code>, <code><a href="persist.html">persist</a></code>, +<code><a href="printSchema.html">printSchema</a></code>, <code><a href="randomSplit.html">randomSplit</a></code>, +<code><a href="rbind.html">rbind</a></code>, <code><a href="registerTempTable-deprecated.html">registerTempTable</a></code>, +<code><a href="rename.html">rename</a></code>, <code><a href="repartition.html">repartition</a></code>, 
+<code><a href="rollup.html">rollup</a></code>, <code><a href="sample.html">sample</a></code>, +<code><a href="saveAsTable.html">saveAsTable</a></code>, <code><a href="schema.html">schema</a></code>, +<code><a href="selectExpr.html">selectExpr</a></code>, <code><a href="select.html">select</a></code>, +<code><a href="showDF.html">showDF</a></code>, <code><a href="show.html">show</a></code>, +<code><a href="storageLevel.html">storageLevel</a></code>, <code><a href="str.html">str</a></code>, +<code><a href="subset.html">subset</a></code>, <code><a href="summary.html">summary</a></code>, +<code><a href="take.html">take</a></code>, <code><a href="toJSON.html">toJSON</a></code>, +<code><a href="unionByName.html">unionByName</a></code>, <code><a href="union.html">union</a></code>, +<code><a href="unpersist.html">unpersist</a></code>, <code><a href="withColumn.html">withColumn</a></code>, +<code><a href="withWatermark.html">withWatermark</a></code>, <code><a href="with.html">with</a></code>, +<code><a href="write.df.html">write.df</a></code>, <code><a href="write.jdbc.html">write.jdbc</a></code>, +<code><a href="write.json.html">write.json</a></code>, <code><a href="write.orc.html">write.orc</a></code>, +<code><a href="write.parquet.html">write.parquet</a></code>, <code><a href="write.stream.html">write.stream</a></code>, +<code><a href="write.text.html">write.text</a></code> +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D sparkR.session() +##D path <- "path/to/file.json" +##D df <- read.json(path) +##D columns(df) +##D colnames(df) +## End(Not run) +</code></pre> + + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html> Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/corr.html ============================================================================== --- dev/spark/v2.3.0-rc1-docs/_site/api/R/corr.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/corr.html Sat Jan 
13 10:29:47 2018 @@ -0,0 +1,113 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: corr</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" href="R.css" /> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for corr {SparkR}"><tr><td>corr {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + +<h2>corr</h2> + +<h3>Description</h3> + +<p>Computes the Pearson Correlation Coefficient for two Columns. +</p> +<p>Calculates the correlation of two columns of a SparkDataFrame. +Currently only supports the Pearson Correlation Coefficient. +For Spearman Correlation, consider using RDD methods found in MLlib's Statistics. +</p> + + +<h3>Usage</h3> + +<pre> +corr(x, ...) + +## S4 method for signature 'Column' +corr(x, col2) + +## S4 method for signature 'SparkDataFrame' +corr(x, colName1, colName2, method = "pearson") +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>x</code></td> +<td> +<p>a Column or a SparkDataFrame.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional argument(s). If <code>x</code> is a Column, a Column +should be provided. 
If <code>x</code> is a SparkDataFrame, two column names should +be provided.</p> +</td></tr> +<tr valign="top"><td><code>col2</code></td> +<td> +<p>a (second) Column.</p> +</td></tr> +<tr valign="top"><td><code>colName1</code></td> +<td> +<p>the name of the first column</p> +</td></tr> +<tr valign="top"><td><code>colName2</code></td> +<td> +<p>the name of the second column</p> +</td></tr> +<tr valign="top"><td><code>method</code></td> +<td> +<p>Optional. A character specifying the method for calculating the correlation. +Only "pearson" is currently supported.</p> +</td></tr> +</table> + + +<h3>Value</h3> + +<p>The Pearson Correlation Coefficient as a Double. +</p> + + +<h3>Note</h3> + +<p>corr since 1.6.0 +</p> +<p>corr since 1.6.0 +</p> + + +<h3>See Also</h3> + +<p>Other aggregate functions: <code><a href="avg.html">avg</a></code>, +<code><a href="column_aggregate_functions.html">column_aggregate_functions</a></code>, +<code><a href="count.html">count</a></code>, <code><a href="cov.html">cov</a></code>, +<code><a href="first.html">first</a></code>, <code><a href="last.html">last</a></code> +</p> +<p>Other stat functions: <code><a href="approxQuantile.html">approxQuantile</a></code>, +<code><a href="cov.html">cov</a></code>, <code><a href="crosstab.html">crosstab</a></code>, +<code><a href="freqItems.html">freqItems</a></code>, <code><a href="sampleBy.html">sampleBy</a></code> +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D df <- createDataFrame(cbind(model = rownames(mtcars), mtcars)) +##D head(select(df, corr(df$mpg, df$hp))) +## End(Not run) + +## Not run: +##D corr(df, "mpg", "hp") +##D corr(df, "mpg", "hp", method = "pearson") +## End(Not run) +</code></pre> + + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html> Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/count.html ============================================================================== --- 
dev/spark/v2.3.0-rc1-docs/_site/api/R/count.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/count.html Sat Jan 13 10:29:47 2018 @@ -0,0 +1,89 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Count</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" href="R.css" /> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for count {SparkR}"><tr><td>count {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + +<h2>Count</h2> + +<h3>Description</h3> + +<p>Count the number of rows for each group when we have <code>GroupedData</code> input. +The resulting SparkDataFrame will also contain the grouping columns. +</p> +<p>This can be used as a column aggregate function with <code>Column</code> as input, +and returns the number of items in a group. +</p> + + +<h3>Usage</h3> + +<pre> +count(x) + +n(x) + +## S4 method for signature 'GroupedData' +count(x) + +## S4 method for signature 'Column' +count(x) + +## S4 method for signature 'Column' +n(x) +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>x</code></td> +<td> +<p>a GroupedData or Column.</p> +</td></tr> +</table> + + +<h3>Value</h3> + +<p>A SparkDataFrame. 
+</p> + + +<h3>Note</h3> + +<p>count since 1.4.0 +</p> +<p>count since 1.4.0 +</p> +<p>n since 1.4.0 +</p> + + +<h3>See Also</h3> + +<p>Other aggregate functions: <code><a href="avg.html">avg</a></code>, +<code><a href="column_aggregate_functions.html">column_aggregate_functions</a></code>, +<code><a href="corr.html">corr</a></code>, <code><a href="cov.html">cov</a></code>, +<code><a href="first.html">first</a></code>, <code><a href="last.html">last</a></code> +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D count(groupBy(df, "name")) +## End(Not run) +## Not run: count(df$c) +## Not run: n(df$c) +</code></pre> + + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html> Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/cov.html ============================================================================== --- dev/spark/v2.3.0-rc1-docs/_site/api/R/cov.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/cov.html Sat Jan 13 10:29:47 2018 @@ -0,0 +1,137 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: cov</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" href="R.css" /> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for cov {SparkR}"><tr><td>cov {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + +<h2>cov</h2> + +<h3>Description</h3> + +<p>Compute the covariance between two expressions. 
+</p> + + +<h3>Usage</h3> + +<pre> +cov(x, ...) + +covar_samp(col1, col2) + +covar_pop(col1, col2) + +## S4 method for signature 'characterOrColumn' +cov(x, col2) + +## S4 method for signature 'characterOrColumn,characterOrColumn' +covar_samp(col1, col2) + +## S4 method for signature 'characterOrColumn,characterOrColumn' +covar_pop(col1, col2) + +## S4 method for signature 'SparkDataFrame' +cov(x, colName1, colName2) +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>x</code></td> +<td> +<p>a Column or a SparkDataFrame.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional argument(s). If <code>x</code> is a Column, a Column +should be provided. If <code>x</code> is a SparkDataFrame, two column names should +be provided.</p> +</td></tr> +<tr valign="top"><td><code>col1</code></td> +<td> +<p>the first Column.</p> +</td></tr> +<tr valign="top"><td><code>col2</code></td> +<td> +<p>the second Column.</p> +</td></tr> +<tr valign="top"><td><code>colName1</code></td> +<td> +<p>the name of the first column</p> +</td></tr> +<tr valign="top"><td><code>colName2</code></td> +<td> +<p>the name of the second column</p> +</td></tr> +</table> + + +<h3>Details</h3> + +<p><code>cov</code>: Compute the sample covariance between two expressions. +</p> +<p><code>covar_samp</code>: Alias for <code>cov</code>. +</p> +<p><code>covar_pop</code>: Computes the population covariance between two expressions. +</p> +<p><code>cov</code>: When applied to SparkDataFrame, this calculates the sample covariance of two +numerical columns of <em>one</em> SparkDataFrame. +</p> + + +<h3>Value</h3> + +<p>The covariance of the two columns. 
+</p> + + +<h3>Note</h3> + +<p>cov since 1.6.0 +</p> +<p>covar_samp since 2.0.0 +</p> +<p>covar_pop since 2.0.0 +</p> +<p>cov since 1.6.0 +</p> + + +<h3>See Also</h3> + +<p>Other aggregate functions: <code><a href="avg.html">avg</a></code>, +<code><a href="column_aggregate_functions.html">column_aggregate_functions</a></code>, +<code><a href="corr.html">corr</a></code>, <code><a href="count.html">count</a></code>, +<code><a href="first.html">first</a></code>, <code><a href="last.html">last</a></code> +</p> +<p>Other stat functions: <code><a href="approxQuantile.html">approxQuantile</a></code>, +<code><a href="corr.html">corr</a></code>, <code><a href="crosstab.html">crosstab</a></code>, +<code><a href="freqItems.html">freqItems</a></code>, <code><a href="sampleBy.html">sampleBy</a></code> +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D df <- createDataFrame(cbind(model = rownames(mtcars), mtcars)) +##D head(select(df, cov(df$mpg, df$hp), cov("mpg", "hp"), +##D covar_samp(df$mpg, df$hp), covar_samp("mpg", "hp"), +##D covar_pop(df$mpg, df$hp), covar_pop("mpg", "hp"))) +## End(Not run) + +## Not run: +##D cov(df, "mpg", "hp") +##D cov(df, df$mpg, df$hp) +## End(Not run) +</code></pre> + + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html> Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/createDataFrame.html ============================================================================== --- dev/spark/v2.3.0-rc1-docs/_site/api/R/createDataFrame.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/createDataFrame.html Sat Jan 13 10:29:47 2018 @@ -0,0 +1,90 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Create a SparkDataFrame</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" 
href="R.css" /> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for createDataFrame {SparkR}"><tr><td>createDataFrame {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + +<h2>Create a SparkDataFrame</h2> + +<h3>Description</h3> + +<p>Converts R data.frame or list into SparkDataFrame. +</p> + + +<h3>Usage</h3> + +<pre> +## Default S3 method: +createDataFrame(data, schema = NULL, samplingRatio = 1, +  numPartitions = NULL) + +## Default S3 method: +as.DataFrame(data, schema = NULL, samplingRatio = 1, +  numPartitions = NULL) + +as.DataFrame(data, ...) +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>data</code></td> +<td> +<p>a list or data.frame.</p> +</td></tr> +<tr valign="top"><td><code>schema</code></td> +<td> +<p>a list of column names or named list (StructType), optional.</p> +</td></tr> +<tr valign="top"><td><code>samplingRatio</code></td> +<td> +<p>Currently not used.</p> +</td></tr> +<tr valign="top"><td><code>numPartitions</code></td> +<td> +<p>the number of partitions of the SparkDataFrame. Defaults to 1; this is +limited by the length of the list or the number of rows of the data.frame.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional argument(s).</p> +</td></tr> +</table> + + +<h3>Value</h3> + +<p>A SparkDataFrame. 
+</p> + + +<h3>Note</h3> + +<p>createDataFrame since 1.4.0 +</p> +<p>as.DataFrame since 1.6.0 +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D sparkR.session() +##D df1 <- as.DataFrame(iris) +##D df2 <- as.DataFrame(list(3,4,5,6)) +##D df3 <- createDataFrame(iris) +##D df4 <- createDataFrame(cars, numPartitions = 2) +## End(Not run) +</code></pre> + + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html> Added: dev/spark/v2.3.0-rc1-docs/_site/api/R/createExternalTable-deprecated.html ============================================================================== --- dev/spark/v2.3.0-rc1-docs/_site/api/R/createExternalTable-deprecated.html (added) +++ dev/spark/v2.3.0-rc1-docs/_site/api/R/createExternalTable-deprecated.html Sat Jan 13 10:29:47 2018 @@ -0,0 +1,93 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: (Deprecated) Create an external table</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> +<link rel="stylesheet" type="text/css" href="R.css" /> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for createExternalTable {SparkR}"><tr><td>createExternalTable {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> + +<h2>(Deprecated) Create an external table</h2> + +<h3>Description</h3> + +<p>Creates an external table based on the dataset in a data source. +Returns a SparkDataFrame associated with the external table. 
+</p> + + +<h3>Usage</h3> + +<pre> +## Default S3 method: +createExternalTable(tableName, path = NULL, source = NULL, +  schema = NULL, ...) +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>tableName</code></td> +<td> +<p>a name of the table.</p> +</td></tr> +<tr valign="top"><td><code>path</code></td> +<td> +<p>the path of files to load.</p> +</td></tr> +<tr valign="top"><td><code>source</code></td> +<td> +<p>the name of external data source.</p> +</td></tr> +<tr valign="top"><td><code>schema</code></td> +<td> +<p>the schema of the data required for some data sources.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional argument(s) passed to the method.</p> +</td></tr> +</table> + + +<h3>Details</h3> + +<p>The data source is specified by the <code>source</code> and a set of options(...). +If <code>source</code> is not specified, the default data source configured by +"spark.sql.sources.default" will be used. +</p> + + +<h3>Value</h3> + +<p>A SparkDataFrame. +</p> + + +<h3>Note</h3> + +<p>createExternalTable since 1.4.0 +</p> + + +<h3>See Also</h3> + +<p><a href="createTable.html">createTable</a> +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D sparkR.session() +##D df <- createExternalTable("myjson", path="path/to/json", source="json", schema) +## End(Not run) +</code></pre> + + +<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.0 <a href="00Index.html">Index</a>]</div> +</body></html>