[GitHub] spark issue #17308: [SPARK-19968][SPARK-20737][SS] Use a cached instance of ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17308 **[Test build #77227 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77227/testReport)** for PR 17308 at commit [`15dfc80`](https://github.com/apache/spark/commit/15dfc80a8a35208f5f9df150de7c4bd9a015e2d8). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18025: [WIP][SparkR] Update doc and examples for sql functions
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025

@felixcheung I think we may want to distinguish a few cases:

1. For methods that are mainly defined by only one class, e.g., most function methods for Column, it makes sense to group and document them together. For example, most aggregate functions of Column go into one single Rd, since they are not defined for other classes. In this case, `avg` will go to this doc since it is not used by other classes.
2. For methods that are defined by multiple classes, e.g., the `show` method defined for SparkDataFrame, GroupedData, Column and StreamingQuery, we can still document them in `show.Rd`. In this case, `show` will go to this doc, which shows the help for all classes that have defined a `show` method.
3. When it makes sense, we can also combine 1 & 2 above. For example, `gapply` and `gapplyCollect` are defined for both SparkDataFrame and GroupedData, but we can still document them together and create shared examples.

Let me know if this makes sense.
[GitHub] spark issue #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTree
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18067 **[Test build #77226 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77226/testReport)** for PR 18067 at commit [`f43ebe0`](https://github.com/apache/spark/commit/f43ebe03115b0b22ed01b76925312dfbc7a2c8c0).
[GitHub] spark pull request #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTre...
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/18067

[SPARK-20849][DOC][SPARKR] Document R DecisionTree

## What changes were proposed in this pull request?

1. Add an example for SparkR `decisionTree`.
2. Document it in the user guide.

## How was this patch tested?

Local submit.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zhengruifeng/spark dt_example

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18067.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #18067

commit 3d8172f98f0994fec9ff359dfca4e6fcddd85863
Author: Zheng RuiFeng
Date: 2017-05-23T03:56:20Z

    create pr

commit def3ef4635094955c20c7e9511ce681378794d34
Author: Zheng RuiFeng
Date: 2017-05-23T04:33:33Z

    update vignettes

commit f43ebe03115b0b22ed01b76925312dfbc7a2c8c0
Author: Zheng RuiFeng
Date: 2017-05-23T05:44:44Z

    update sparkr.md
[GitHub] spark issue #17308: [SPARK-19968][SPARK-20737][SS] Use a cached instance of ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17308 **[Test build #77225 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77225/testReport)** for PR 17308 at commit [`ef2d6cd`](https://github.com/apache/spark/commit/ef2d6cd4275d93518ec27d4b08916575a3e597d7).
[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18064 Merged build finished. Test FAILed.
[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18064 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77220/ Test FAILed.
[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18064

**[Test build #77220 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77220/testReport)** for PR 18064 at commit [`b355c6d`](https://github.com/apache/spark/commit/b355c6d034c6aefcf8f74757353afce870e9bf1d).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `trait Command extends LogicalPlan `
  * `case class ExecutedCommandExec(cmd: RunnableCommand, children: Seq[SparkPlan]) extends SparkPlan `
[GitHub] spark issue #16989: [WIP][SPARK-19659] Fetch big blocks to disk when shuffle...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16989 **[Test build #77224 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77224/testReport)** for PR 16989 at commit [`e022b6d`](https://github.com/apache/spark/commit/e022b6d4ccab0f7fc7b47a468b23046a11576311).
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17698 @10110346 Hi, you can use the command @gatorsmile mentioned above to generate the result file.
[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18066 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77222/ Test FAILed.
[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18066

**[Test build #77222 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77222/testReport)** for PR 18066 at commit [`6ed3d3f`](https://github.com/apache/spark/commit/6ed3d3fa51cd9b09e2f137bda87dcb16e5a9fb1a).

* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class GenerateColumnAccessor(useColumnarBatch: Boolean)`
  * `class GenerateColumnarBatch( `
  * ` class GeneratedColumnarBatchIterator extends $`
[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18066 Merged build finished. Test FAILed.
[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18064 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77219/ Test FAILed.
[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18064

**[Test build #77219 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77219/testReport)** for PR 18064 at commit [`9507f19`](https://github.com/apache/spark/commit/9507f1938f894b2884b024c8472084a3a531e20d).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `trait Command extends LogicalPlan `
  * `case class ExecutedCommandExec(cmd: RunnableCommand, children: Seq[SparkPlan]) extends SparkPlan `
[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18064 Merged build finished. Test FAILed.
[GitHub] spark issue #16989: [WIP][SPARK-19659] Fetch big blocks to disk when shuffle...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16989 **[Test build #77223 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77223/testReport)** for PR 16989 at commit [`9b733ec`](https://github.com/apache/spark/commit/9b733ec0fbc4bad8fc7f2413af1be5c6f718d9c1).
[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18066 **[Test build #77222 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77222/testReport)** for PR 18066 at commit [`6ed3d3f`](https://github.com/apache/spark/commit/6ed3d3fa51cd9b09e2f137bda87dcb16e5a9fb1a).
[GitHub] spark issue #14957: [SPARK-4502][SQL]Support parquet nested struct pruning a...
Github user Gauravshah commented on the issue: https://github.com/apache/spark/pull/14957 @saulshanabrook looks like #16578 is a superset, trying to invest in that pull request.
[GitHub] spark pull request #18066: [SPARK-20822][SQL] Generate code to build table c...
GitHub user kiszk opened a pull request: https://github.com/apache/spark/pull/18066

[SPARK-20822][SQL] Generate code to build table cache using ColumnarBatch and to get value from ColumnVector

## What changes were proposed in this pull request?

This PR generates the following Java code:

1. Build each in-memory table cache using `ColumnarBatch` with `ColumnVector` instead of using `CachedBatch` with `Array[Byte]`.
2. Get a value for a column in `ColumnVector` without using an iterator.

As the first step, for ease of review, I supported only integer and double data types with whole-stage codegen. Another PR will address an execution path without whole-stage codegen.

This PR implements the following:

1. Keep an in-memory table cache using `ColumnarBatch` with `ColumnVector`. To support both the new and the conventional cache data structures, this PR declares `CachedBatch` as a trait, and declares `CachedColumnarBatch` and `CachedBatchBytes` as actual implementations.
2. Generate Java code to build an in-memory table cache.
3. Generate Java code to directly get a value from `ColumnVector`.

This PR improves runtime performance by:

1. Building the in-memory table cache while eliminating lots of virtual calls and a complicated data path.
2. Eliminating the data copy from column-oriented storage to `InternalRow` in a `SpecificColumnarIterator` iterator.

**Options**

A ColumnVector for all primitive data types in ColumnarBatch can be compressed. Currently, there are two ways to enable compression:

1. Set the property `spark.sql.inMemoryColumnarStorage.compressed` to true (the default is true), or
2. Call `DataFrame.persist(st)`, where st is `MEMORY_ONLY_SER`, `MEMORY_ONLY_SER_2`, `MEMORY_AND_DISK_SER`, or `MEMORY_AND_DISK_SER_2`.
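The idea of reading values straight from column-oriented storage, rather than copying each row into a row object first, can be illustrated with a minimal Java sketch. It is not Spark code: `ColumnarCacheSketch`, `build`, and `filter` are hypothetical names, plain primitive arrays stand in for `ColumnVector`, and the data and predicate are assumed to match the example program in this PR description (`i` from 1 to 10, `d = i.toDouble`, filter `i < 8 and 4.0 < d`).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a columnar batch: one primitive array per column. */
final class ColumnarCacheSketch {
    // Columns "i" and "d" are stored separately (column-oriented layout).
    private final int[] intColumn;
    private final double[] doubleColumn;

    ColumnarCacheSketch(int[] intColumn, double[] doubleColumn) {
        if (intColumn.length != doubleColumn.length) {
            throw new IllegalArgumentException("columns must have the same number of rows");
        }
        this.intColumn = intColumn;
        this.doubleColumn = doubleColumn;
    }

    /** Builds a batch with i = 1..n and d = (double) i. */
    static ColumnarCacheSketch build(int n) {
        int[] is = new int[n];
        double[] ds = new double[n];
        for (int row = 0; row < n; row++) {
            is[row] = row + 1;
            ds[row] = (double) (row + 1);
        }
        return new ColumnarCacheSketch(is, ds);
    }

    /** Evaluates "i < 8 and 4.0 < d" by row index, without materializing row objects. */
    List<Integer> filter() {
        List<Integer> matches = new ArrayList<>();
        for (int rowIdx = 0; rowIdx < intColumn.length; rowIdx++) {
            int i = intColumn[rowIdx];       // direct read from the int column
            double d = doubleColumn[rowIdx]; // direct read from the double column
            if (i < 8 && 4.0 < d) {
                matches.add(i);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(ColumnarCacheSketch.build(10).filter()); // prints [5, 6, 7]
    }
}
```

The loop touches only the two primitive arrays, which is the kind of access pattern the description attributes to getting values directly from `ColumnVector`; compression and whole-stage codegen are outside the scope of this sketch.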
**An example program**

```scala
val df = sparkContext.parallelize((1 to 10), 1).map(i => (i, i.toDouble)).toDF("i", "d").cache
df.filter("i < 8 and 4.0 < d").show
```

**Generated code for building an in-memory table cache**

```
import scala.collection.Iterator;
import org.apache.spark.sql.types.DataType;
import org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder;
import org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter;
import org.apache.spark.sql.execution.columnar.MutableUnsafeRow;
import org.apache.spark.sql.execution.vectorized.ColumnVector;

public SpecificColumnarIterator generate(Object[] references) {
  return new SpecificColumnarIterator(references);
}

class SpecificColumnarIterator extends org.apache.spark.sql.execution.columnar.ColumnarIterator {
  private ColumnVector[] colInstances;
  private UnsafeRow unsafeRow = new UnsafeRow(0);
  private BufferHolder bufferHolder = new BufferHolder(unsafeRow);
  private UnsafeRowWriter rowWriter = new UnsafeRowWriter(bufferHolder, 0);
  private MutableUnsafeRow mutableRow = null;

  private int rowIdx = 0;
  private int numRowsInBatch = 0;

  private scala.collection.Iterator input = null;
  private DataType[] columnTypes = null;
  private int[] columnIndexes = null;

  public SpecificColumnarIterator(Object[] references) {
    this.mutableRow = new MutableUnsafeRow(rowWriter);
  }

  public void initialize(Iterator input, DataType[] columnTypes, int[] columnIndexes) {
    this.input = input;
    this.columnTypes = columnTypes;
    this.columnIndexes = columnIndexes;
  }

  public boolean hasNext() {
    if (rowIdx < numRowsInBatch) {
      return true;
    }
    if (!input.hasNext()) {
      return false;
    }

    org.apache.spark.sql.execution.columnar.CachedColumnarBatch cachedBatch =
      (org.apache.spark.sql.execution.columnar.CachedColumnarBatch) input.next();
    org.apache.spark.sql.execution.vectorized.ColumnarBatch batch = cachedBatch.columnarBatch();
    rowIdx = 0;
    numRowsInBatch = cachedBatch.getNumRows();
    colInstances = new ColumnVector[columnIndexes.length];
    for (int i = 0; i < columnIndexes.length; i++) {
      colInstances[i] = batch.column(columnIndexes[i]);
    }

    return hasNext();
  }

  public
```
[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18058 Jenkins, test this please.
[GitHub] spark issue #18051: [SPARK-18825][SPARKR][DOCS][WIP] Eliminate duplicate lin...
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18051 Maybe I'm missing something completely, but I still don't get the point why we are removing the `xx-method` link since we are defining methods as S4 using `setMethod`. Lots of packages have these entries in the index. Below is a snapshot from the `sp` package. You can find a lot more there. ![image](https://cloud.githubusercontent.com/assets/11082368/26338918/e8bdd65e-3f38-11e7-83ef-c3293bc267a0.png) Even for S3 methods, they tend to repeat as well. Below is a snapshot of the `gamm4` package. ![image](https://cloud.githubusercontent.com/assets/11082368/26338937/10432bac-3f39-11e7-9b91-5774e33ff7f8.png)
[GitHub] spark issue #18033: [SPARK-20807][SQL] Add compression/decompression of colu...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18033 @hvanhovell would it be possible to review this or let us know the appropriate persons for this review? cc @sameeragarwal
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17698 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77217/ Test FAILed.
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17698 Merged build finished. Test FAILed.
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17698

**[Test build #77217 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77217/testReport)** for PR 17698 at commit [`10be7eb`](https://github.com/apache/spark/commit/10be7eb586dcf992af2982ba94aa446408ad1e25).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #18040: [SPARK-20815] [SPARKR] NullPointerException in RP...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18040
[GitHub] spark issue #18040: [SPARK-20815] [SPARKR] NullPointerException in RPackageU...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/18040 merged to master/2.2, thanks!
[GitHub] spark issue #18051: [SPARK-18825][SPARKR][DOCS][WIP] Eliminate duplicate lin...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/18051 @actuaryzhang - we were just talking about this in the other PR. What do you think? @zero323 - right, I do agree `?abs-method` is kind of a big problem...
[GitHub] spark issue #18046: [SPARK-20749][SQL] Built-in SQL Function Support - all v...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18046 **[Test build #77221 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77221/testReport)** for PR 18046 at commit [`e9acb63`](https://github.com/apache/spark/commit/e9acb63e1e695ddab4d80ed74844f2244c3f0e05).
[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18064 **[Test build #77220 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77220/testReport)** for PR 18064 at commit [`b355c6d`](https://github.com/apache/spark/commit/b355c6d034c6aefcf8f74757353afce870e9bf1d).
[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18064 **[Test build #77219 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77219/testReport)** for PR 18064 at commit [`9507f19`](https://github.com/apache/spark/commit/9507f1938f894b2884b024c8472084a3a531e20d).
[GitHub] spark pull request #17967: [SPARK-14659][ML] RFormula consistent with R when...
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/17967#discussion_r117892629

--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala ---
@@ -37,6 +37,42 @@ import org.apache.spark.sql.types._
  */
 private[feature] trait RFormulaBase extends HasFeaturesCol with HasLabelCol {

+  /**
+   * Param for how to order categories of a string FEATURE column used by `StringIndexer`.
+   * The last category after ordering is dropped when encoding strings.
+   * Supported options: 'frequencyDesc', 'frequencyAsc', 'alphabetDesc', 'alphabetAsc'.
+   * The default value is 'frequencyDesc'. When the ordering is set to 'alphabetDesc', `RFormula`
+   * drops the same category as R when encoding strings.
+   *
+   * The options are explained using an example `'b', 'a', 'b', 'a', 'c', 'b'`:
+   * {{{
+   * +-+---+--+
--- End diff --

@HyukjinKwon Thanks for the clarification. I don't think `list` paints a clear picture here. Would rather keep the table structure.
[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/17967 @yanboliang I updated the example in the param doc. I hope it is clear now that it is `alphabetDesc` that drops the same category as R. That is, RFormula with `alphabetDesc` drops the first alphabetic category in string encoding.
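To make the four ordering options concrete, here is a minimal pure-Python sketch of the rule described in the param doc ("the last category after ordering is dropped"), applied to the doc's own example `'b', 'a', 'b', 'a', 'c', 'b'`. It does not call Spark; the helper names `order_categories` and `dropped_category` are hypothetical, and the alphabetical tie-break for equal frequencies is an assumption, not something stated in the PR.

```python
from collections import Counter


def order_categories(values, order_type):
    """Return the distinct categories in index order (index 0 first).

    Illustration only: this mimics the ordering options named in the
    param doc, it is not Spark's StringIndexer implementation.
    """
    counts = Counter(values)
    if order_type == "frequencyDesc":
        # Most frequent first; ties broken alphabetically (assumption).
        return sorted(counts, key=lambda c: (-counts[c], c))
    if order_type == "frequencyAsc":
        # Least frequent first; ties broken alphabetically (assumption).
        return sorted(counts, key=lambda c: (counts[c], c))
    if order_type == "alphabetDesc":
        return sorted(counts, reverse=True)
    if order_type == "alphabetAsc":
        return sorted(counts)
    raise ValueError(f"unsupported order type: {order_type}")


def dropped_category(values, order_type):
    """The last category after ordering is the one dropped when encoding."""
    return order_categories(values, order_type)[-1]


sample = ["b", "a", "b", "a", "c", "b"]
for option in ("frequencyDesc", "frequencyAsc", "alphabetDesc", "alphabetAsc"):
    ordered = order_categories(sample, option)
    print(option, ordered, "drops", dropped_category(sample, option))
```

With `alphabetDesc` the ordering is `c, b, a`, so the dropped category is `'a'`, the first alphabetical one, which is the case the comment above says matches R.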
[GitHub] spark issue #18048: [SPARK-20399][SQL][Follow-up] Add a config to fallback s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18048 **[Test build #77218 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77218/testReport)** for PR 18048 at commit [`9af9caf`](https://github.com/apache/spark/commit/9af9caf20f46674eabee2c0ece5ae828d2426a5d).
[GitHub] spark issue #18048: [SPARK-20399][SQL][Follow-up] Add a config to fallback s...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18048 retest this please
[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18064 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77215/ Test FAILed.
[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18064 Merged build finished. Test FAILed.
[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18064 **[Test build #77215 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77215/testReport)** for PR 18064 at commit [`5486950`](https://github.com/apache/spark/commit/5486950edada8ae87d2586f3f6d1e2d82027b015). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait Command extends LogicalPlan ` * `case class ExecutedCommandExec(cmd: RunnableCommand, children: Seq[SparkPlan]) extends SparkPlan `
[GitHub] spark pull request #17762: [SPARK-9103][WIP] Track Netty memory usage - take...
Github user jsoltren closed the pull request at: https://github.com/apache/spark/pull/17762
[GitHub] spark issue #17762: [SPARK-9103][WIP] Track Netty memory usage - take two
Github user jsoltren commented on the issue: https://github.com/apache/spark/pull/17762 To close the loop here: I'm going to rework these ideas into a new JIRA that I'll file, to track *total* memory usage in the UI.
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17698 **[Test build #77217 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77217/testReport)** for PR 17698 at commit [`10be7eb`](https://github.com/apache/spark/commit/10be7eb586dcf992af2982ba94aa446408ad1e25).
[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18058 ok to test
[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18058 add to whitelist
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17698 Merged build finished. Test FAILed.
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17698 **[Test build #77214 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77214/testReport)** for PR 17698 at commit [`7edfed5`](https://github.com/apache/spark/commit/7edfed5577e8610b4ba42f64979c4168fce829d5). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17698 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77214/ Test FAILed.
[GitHub] spark issue #18035: [MINOR][SPARKR][ML] Joint coefficients with intercept fo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18035 Merged build finished. Test PASSed.
[GitHub] spark issue #18035: [MINOR][SPARKR][ML] Joint coefficients with intercept fo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18035 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77216/ Test PASSed.
[GitHub] spark issue #18035: [MINOR][SPARKR][ML] Joint coefficients with intercept fo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18035 **[Test build #77216 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77216/testReport)** for PR 18035 at commit [`5d9afe0`](https://github.com/apache/spark/commit/5d9afe06b665464b06705d618a18a8032255fe1d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18058 Jenkins, test this please.
[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...
Github user facaiy commented on the issue: https://github.com/apache/spark/pull/18058 Thanks, @yanboliang. Do you have any suggestion about testing the parameter?
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17698 Merged build finished. Test FAILed.
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17698 **[Test build #77213 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77213/testReport)** for PR 17698 at commit [`6ce4220`](https://github.com/apache/spark/commit/6ce4220bf861f4a64f3126f1f14043dcb666a056). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17698 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77213/ Test FAILed.
[GitHub] spark issue #18048: [SPARK-20399][SQL][Follow-up] Add a config to fallback s...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18048 ping @cloud-fan
[GitHub] spark issue #18035: [MINOR][SPARKR][ML] Joint coefficients with intercept fo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18035 **[Test build #77216 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77216/testReport)** for PR 18035 at commit [`5d9afe0`](https://github.com/apache/spark/commit/5d9afe06b665464b06705d618a18a8032255fe1d).
[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18064 **[Test build #77215 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77215/testReport)** for PR 18064 at commit [`5486950`](https://github.com/apache/spark/commit/5486950edada8ae87d2586f3f6d1e2d82027b015).
[GitHub] spark issue #18035: [MINOR][SPARKR][ML] Joint coefficients with intercept fo...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18035 @felixcheung I'm OK with keeping it as it is; thanks for the clarification. What about the other changes in this PR?
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17698 **[Test build #77214 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77214/testReport)** for PR 17698 at commit [`7edfed5`](https://github.com/apache/spark/commit/7edfed5577e8610b4ba42f64979c4168fce829d5).
[GitHub] spark pull request #18054: [SPARK-20763][SQL][Backport-2.1] The function of ...
Github user 10110346 closed the pull request at: https://github.com/apache/spark/pull/18054
[GitHub] spark issue #18054: [SPARK-20763][SQL][Backport-2.1] The function of `month`...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18054 Could you close the PR manually?
[GitHub] spark pull request #17967: [SPARK-14659][ML] RFormula consistent with R when...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17967#discussion_r117878731 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -37,6 +37,42 @@ import org.apache.spark.sql.types._ */ private[feature] trait RFormulaBase extends HasFeaturesCol with HasLabelCol { + /** + * Param for how to order categories of a string FEATURE column used by `StringIndexer`. + * The last category after ordering is dropped when encoding strings. + * Supported options: 'frequencyDesc', 'frequencyAsc', 'alphabetDesc', 'alphabetAsc'. + * The default value is 'frequencyDesc'. When the ordering is set to 'alphabetDesc', `RFormula` + * drops the same category as R when encoding strings. + * + * The options are explained using an example `'b', 'a', 'b', 'a', 'c', 'b'`: + * {{{ + * +-+---+--+ --- End diff -- I guess I am not supposed to make the decision, though. Please let me know, @felixcheung and @yanboliang, if you have any preference.
[GitHub] spark issue #17941: [SPARK-20684][R] Expose createGlobalTempView and dropGlo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17941 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77211/ Test PASSed.
[GitHub] spark issue #18054: [SPARK-20763][SQL][Backport-2.1] The function of `month`...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18054 Thanks, merging to 2.1.
[GitHub] spark issue #17941: [SPARK-20684][R] Expose createGlobalTempView and dropGlo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17941 **[Test build #77211 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77211/testReport)** for PR 17941 at commit [`2e43c14`](https://github.com/apache/spark/commit/2e43c147468904633b3c372637af93e9e2282799). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17941: [SPARK-20684][R] Expose createGlobalTempView and dropGlo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17941 Merged build finished. Test PASSed.
[GitHub] spark issue #18054: [SPARK-20763][SQL][Backport-2.1] The function of `month`...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18054 LGTM
[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17698 **[Test build #77213 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77213/testReport)** for PR 17698 at commit [`6ce4220`](https://github.com/apache/spark/commit/6ce4220bf861f4a64f3126f1f14043dcb666a056).
[GitHub] spark issue #17984: [ SPARK-20739][CORE][TEST]Supplement the new unit tests ...
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/17984 OK, it does not add value; it is junk code. Closing this PR.
[GitHub] spark pull request #17984: [ SPARK-20739][CORE][TEST]Supplement the new unit...
Github user heary-cao closed the pull request at: https://github.com/apache/spark/pull/17984
[GitHub] spark pull request #17967: [SPARK-14659][ML] RFormula consistent with R when...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17967#discussion_r117877363 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -37,6 +37,42 @@ import org.apache.spark.sql.types._ */ private[feature] trait RFormulaBase extends HasFeaturesCol with HasLabelCol { + /** + * Param for how to order categories of a string FEATURE column used by `StringIndexer`. + * The last category after ordering is dropped when encoding strings. + * Supported options: 'frequencyDesc', 'frequencyAsc', 'alphabetDesc', 'alphabetAsc'. + * The default value is 'frequencyDesc'. When the ordering is set to 'alphabetDesc', `RFormula` + * drops the same category as R when encoding strings. + * + * The options are explained using an example `'b', 'a', 'b', 'a', 'c', 'b'`: + * {{{ + * +-+---+--+ --- End diff -- Ah, sure, I initially meant an HTML list like the one we are already using - https://github.com/apache/spark/blob/04901dd03a3f8062fd39ea38d585935ff71a9248/sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala#L304-L340 ...
```html
<ul>
  <li>abc</li>
  <li>abc</li>
</ul>
```
To double-check, I just tested a wiki-style list ( `-` ) - http://subnormalnumbers.blogspot.kr/2011/08/scaladoc-wiki-syntax.html. It does not render correctly, as shown below (but please go ahead if you know a way that is compatible with both Scaladoc and Javadoc):
```
 * 1. item one
 *
 * 1. item two
 *    - sublist
 *    - next item
 *
 * 1. now for broken sub-numbered list, the leading item must be one of
 * `-`, `1.`, `I.`, `i.`, `A.`, or `a.`. And it must be followed by a space.
 *    1. one
 *    2. two
 *    3. three
 *
 * 1. list types
 *    I. one
 *      i. one
 *      i. two
 *    I. two
 *      A. one
 *      A. two
 *    I. three
 *      a. one
 *      a. two
```
Scaladoc ![2017-05-23 9 52 51](https://cloud.githubusercontent.com/assets/6477701/26334123/b153378a-3f9d-11e7-8852-31b519ec9f21.png)
Javadoc ![2017-05-23 9 53 07](https://cloud.githubusercontent.com/assets/6477701/26334121/af2696d2-3f9d-11e7-9924-f0e4373b2cce.png)
My worry is that it draws attention by using a different format. I believe we have similar instances, but I wonder if it is worth changing only this one. I am not strongly against it, but `{{{ ... }}}` basically means code. If we can't find a better way to render this, I would write it as prose with a list.
[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17967 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77209/ Test PASSed.
[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17967 Merged build finished. Test PASSed.
[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17967 **[Test build #77209 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77209/testReport)** for PR 17967 at commit [`1a1e06c`](https://github.com/apache/spark/commit/1a1e06c9f1690e0654f78313f674c07da2b6b6f2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18060: [SPARK-20835][Core]It should exit directly when the --to...
Github user eatoncys commented on the issue: https://github.com/apache/spark/pull/18060 @srowen The other parameters are validated elsewhere. For example, the --executor-memory parameter is validated in org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory, and the app exits with the error "java.lang.NumberFormatException" if it is a negative number. But the --total-executor-cores parameter is not validated anywhere, so the app does not exit if it is a negative number.
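The kind of up-front check being argued for here could look roughly like the following. This is a Python sketch for illustration only, not Spark's launcher code: `parse_total_executor_cores` is a hypothetical helper, and rejecting zero as well as negative values is an assumption.

```python
def parse_total_executor_cores(raw):
    """Parse a --total-executor-cores value and fail fast if it is invalid.

    Hypothetical helper for illustration; it is not part of Spark.
    """
    try:
        cores = int(raw)
    except (TypeError, ValueError):
        # Not an integer at all: report the offending value and stop.
        raise ValueError(
            f"--total-executor-cores must be an integer, got {raw!r}"
        ) from None
    if cores <= 0:
        # Assumption: zero is rejected together with negative numbers.
        raise ValueError(
            f"--total-executor-cores must be a positive number, got {cores}"
        )
    return cores
```

A caller would run this while parsing arguments and exit with the error message, so that a negative value stops the submission immediately instead of leaving the app running.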
[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17723 **[Test build #77212 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77212/testReport)** for PR 17723 at commit [`bf758e6`](https://github.com/apache/spark/commit/bf758e64f699f3211e15767d0f1854cd47cecb29). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17723 Merged build finished. Test FAILed.
[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17723 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77212/ Test FAILed.
[GitHub] spark issue #18061: [SPARK-20836][LAUNCHER] DRIVER_EXTRA_JAVA_OPTIONS is nee...
Github user wujianping10043419 commented on the issue: https://github.com/apache/spark/pull/18061 @srowen But the check results in an abnormal exit because of an irrelevant parameter.
[GitHub] spark issue #18015: [SPARK-20785][WEB-UI][SQL]Spark should provide jump link...
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/18015 @srowen Please help merge this to master and update the status of https://issues.apache.org/jira/browse/SPARK-20785. Thank you.
[GitHub] spark issue #18063: [SPARK-20842] [SQL] Upgrade to 1.2.2 for Hive Metastore ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18063 cc @cloud-fan
[GitHub] spark issue #17941: [SPARK-20684][R] Expose createGlobalTempView and dropGlo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17941 **[Test build #77211 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77211/testReport)** for PR 17941 at commit [`2e43c14`](https://github.com/apache/spark/commit/2e43c147468904633b3c372637af93e9e2282799).
[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17723 **[Test build #77212 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77212/testReport)** for PR 17723 at commit [`bf758e6`](https://github.com/apache/spark/commit/bf758e64f699f3211e15767d0f1854cd47cecb29).
[GitHub] spark pull request #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated S...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14971
[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17723 **[Test build #77210 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77210/testReport)** for PR 17723 at commit [`cd58b6c`](https://github.com/apache/spark/commit/cd58b6c3e6d498a756a8e61d2c25d96b352537c1). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17723 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77210/ Test FAILed.
[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17723 Merged build finished. Test FAILed.
[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14971 Thanks! Merging to master.
[GitHub] spark issue #17941: [SPARK-20684][R] Expose createGlobalTempView and dropGlo...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/17941 Retest this please.
[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17723 **[Test build #77210 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77210/testReport)** for PR 17723 at commit [`cd58b6c`](https://github.com/apache/spark/commit/cd58b6c3e6d498a756a8e61d2c25d96b352537c1).
[GitHub] spark issue #18046: [SPARK-20749][SQL] Built-in SQL Function Support - all v...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18046 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77208/ Test PASSed.
[GitHub] spark issue #18046: [SPARK-20749][SQL] Built-in SQL Function Support - all v...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18046 Merged build finished. Test PASSed.
[GitHub] spark issue #18046: [SPARK-20749][SQL] Built-in SQL Function Support - all v...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18046 **[Test build #77208 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77208/testReport)** for PR 18046 at commit [`599fd31`](https://github.com/apache/spark/commit/599fd31ff2a25d4ad88e8533f9c1aa76b00bdd17). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #17967: [SPARK-14659][ML] RFormula consistent with R when...
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/17967#discussion_r117873500 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -37,6 +37,42 @@ import org.apache.spark.sql.types._ */ private[feature] trait RFormulaBase extends HasFeaturesCol with HasLabelCol { + /** + * Param for how to order categories of a string FEATURE column used by `StringIndexer`. + * The last category after ordering is dropped when encoding strings. + * Supported options: 'frequencyDesc', 'frequencyAsc', 'alphabetDesc', 'alphabetAsc'. + * The default value is 'frequencyDesc'. When the ordering is set to 'alphabetDesc', `RFormula` + * drops the same category as R when encoding strings. + * + * The options are explained using an example `'b', 'a', 'b', 'a', 'c', 'b'`: + * {{{ + * +-+---+--+ --- End diff -- @HyukjinKwon Would you please clarify what you mean by a `list`? Thanks. I would like to preserve the table structure because it helps show the difference.
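To make concrete the difference that the table in the quoted Scaladoc is meant to show, here is a minimal Python sketch (not Spark code). It applies the four ordering options to the example `'b', 'a', 'b', 'a', 'c', 'b'` and reports the last category after ordering, which the Scaladoc says is the one dropped. The function names `order_categories` and `dropped_category` are hypothetical, and the alphabetical tie-break for equal frequencies is an assumption, since the quoted text does not say how ties are handled.

```python
from collections import Counter

OPTIONS = ("frequencyDesc", "frequencyAsc", "alphabetDesc", "alphabetAsc")


def order_categories(values, order_type="frequencyDesc"):
    """Return the distinct categories of `values` in the requested order."""
    counts = Counter(values)
    if order_type == "frequencyDesc":
        # Most frequent first; ties broken alphabetically (assumption).
        return sorted(counts, key=lambda c: (-counts[c], c))
    if order_type == "frequencyAsc":
        # Least frequent first; ties broken alphabetically (assumption).
        return sorted(counts, key=lambda c: (counts[c], c))
    if order_type == "alphabetDesc":
        return sorted(counts, reverse=True)
    if order_type == "alphabetAsc":
        return sorted(counts)
    raise ValueError(f"unsupported order type: {order_type!r}")


def dropped_category(values, order_type="frequencyDesc"):
    """The last category after ordering is the one dropped when encoding."""
    return order_categories(values, order_type)[-1]


if __name__ == "__main__":
    example = ["b", "a", "b", "a", "c", "b"]
    for option in OPTIONS:
        print(option, order_categories(example, option),
              "drops", dropped_category(example, option))
```

For this example the dropped category is `'c'` under `frequencyDesc`, `'b'` under `frequencyAsc`, `'a'` under `alphabetDesc`, and `'c'` under `alphabetAsc`.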
[GitHub] spark pull request #17967: [SPARK-14659][ML] RFormula consistent with R when...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17967#discussion_r117872391 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -37,6 +37,42 @@ import org.apache.spark.sql.types._ */ private[feature] trait RFormulaBase extends HasFeaturesCol with HasLabelCol { + /** + * Param for how to order categories of a string FEATURE column used by `StringIndexer`. + * The last category after ordering is dropped when encoding strings. + * Supported options: 'frequencyDesc', 'frequencyAsc', 'alphabetDesc', 'alphabetAsc'. + * The default value is 'frequencyDesc'. When the ordering is set to 'alphabetDesc', `RFormula` + * drops the same category as R when encoding strings. + * + * The options are explained using an example `'b', 'a', 'b', 'a', 'c', 'b'`: + * {{{ + * +-+---+--+ --- End diff -- I would suggest just writing this out as prose with a simple list for now, if we are all fine with that; I guess we would generally agree with it.
[GitHub] spark issue #18040: [SPARK-20815] [SPARKR] NullPointerException in RPackageU...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18040 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77205/ Test PASSed.
[GitHub] spark issue #18040: [SPARK-20815] [SPARKR] NullPointerException in RPackageU...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18040 Merged build finished. Test PASSed.
[GitHub] spark issue #18040: [SPARK-20815] [SPARKR] NullPointerException in RPackageU...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18040 **[Test build #77205 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77205/testReport)** for PR 18040 at commit [`7ac474a`](https://github.com/apache/spark/commit/7ac474ae49d0e0dca9fc298805a1867eec222cb6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r117871475 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -523,6 +524,14 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext { sortTest() } + test("negative in LIMIT or TABLESAMPLE") { --- End diff -- Yeah, this is not related to this PR; I will make the change.