date:20150826

[GitHub] spark pull request: [SPARK-9170][SQL] Instead of StandardStructObj...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7520#issuecomment-135308659
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-9170][SQL] Instead of StandardStructObj...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7520#issuecomment-135308292
  
  [Test build #41671 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41671/console) for PR 7520 at commit [`055cd09`](https://github.com/apache/spark/commit/055cd09a09fff47cf43578a19ac78b77610231ce).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-9170][SQL] Instead of StandardStructObj...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7520#issuecomment-135308665
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41671/
Test PASSed.



[GitHub] spark pull request: [SPARK-9170][SQL] Instead of StandardStructObj...

2015-08-26 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/7520#discussion_r38065116
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcRelation.scala ---
@@ -253,7 +260,7 @@ private[orc] case class OrcTableScan(
     maybeStructOI.map { soi =>
       val (fieldRefs, fieldOrdinals) = nonPartitionKeyAttrs.map {
         case (attr, ordinal) =>
-          soi.getStructFieldRef(attr.name.toLowerCase) -> ordinal
--- End diff --

If we don't do the normalization, is this the only place we need to change? 
I ask because both `StructObjectInspector` and `OrcStructObjectInspector` serve 
the same purpose.



[GitHub] spark pull request: [SPARK-10251][CORE] some common types are not ...

2015-08-26 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/8465



[GitHub] spark pull request: [SPARK-10251][CORE] some common types are not ...

2015-08-26 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/8465#issuecomment-135304194
  
Thanks - I've merged this.




[GitHub] spark pull request: [SPARK-9170][SQL] Instead of StandardStructObj...

2015-08-26 Thread chenghao-intel

Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/7520#issuecomment-135304152
  
The normalization is done not by `StructObjectInspector` or 
`OrcStructObjectInspector` but by Hive's `SemanticAnalyzer`. I've checked with 
Hive: even when the ORC column names are in capitals, Hive works well. The only 
thing I am not sure about is column pruning and predicate pushdown; 
`explain extended select xx` in Hive doesn't seem to show that information. 
Maybe @zhzhan can comment on this.
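
For readers following the thread, here is a minimal sketch of what "normalizing" column names for field lookup can look like. It is written in Python for brevity and is not Hive's or Spark's code; the helper names `build_field_index` and `resolve_field` and the sample schema are hypothetical.

```python
from typing import Dict, List, Optional


def build_field_index(field_names: List[str]) -> Dict[str, int]:
    """Map each lower-cased field name to its ordinal in the schema."""
    index: Dict[str, int] = {}
    for ordinal, name in enumerate(field_names):
        key = name.lower()
        if key in index:
            # Two fields differ only by case, so a case-insensitive
            # lookup would be ambiguous.
            raise ValueError(f"ambiguous field name: {name!r}")
        index[key] = ordinal
    return index


def resolve_field(index: Dict[str, int], requested: str) -> Optional[int]:
    """Return the ordinal for `requested`, ignoring case, or None if absent."""
    return index.get(requested.lower())


# Hypothetical file schema whose column names are not lower case.
orc_fields = ["ID", "Name", "PRICE"]
field_index = build_field_index(orc_fields)
```

With this approach both sides of the comparison are lower-cased, so `resolve_field(field_index, "name")` finds the `Name` column regardless of how the file spelled it. Whether the lookup in the diff needs such a step depends on where the normalization already happens, which is the open question in this comment.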



[GitHub] spark pull request: [SPARK-10287] [SQL] Fixes JSONRelation refresh...

2015-08-26 Thread yhuai

Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/8469#issuecomment-135302032
  
I will test it with my partitioned JSON table.



[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-08-26 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/8476#discussion_r38063970
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/ScriptTransformation.scala ---
@@ -222,7 +221,7 @@ private class ScriptTransformationWriterThread(
 
   override def run(): Unit = Utils.logUncaughtExceptions {
     TaskContext.setTaskContext(taskContext)
-
+    val newLineCode = 10
--- End diff --

This seems to be never used.



[GitHub] spark pull request: [SPARK-9003] [MLlib] Add mapActive{Pairs,Value...

2015-08-26 Thread feynmanliang

Github user feynmanliang commented on the pull request:

https://github.com/apache/spark/pull/7357#issuecomment-135300832
  
@yanboliang I think there are some [discussions about whether we should enrich the Vectors API](https://www.mail-archive.com/user@spark.apache.org/msg35434.html).

CC @srowen 



[GitHub] spark pull request: [SPARK-9679][ML][PYSPARK] Add Python API for S...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8118#issuecomment-135300665
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41674/
Test FAILed.



[GitHub] spark pull request: [SPARK-9679][ML][PYSPARK] Add Python API for S...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8118#issuecomment-135300664
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-9003] [MLlib] Add mapActive{Pairs,Value...

2015-08-26 Thread Mageswaran1989

Github user Mageswaran1989 commented on a diff in the pull request:

https://github.com/apache/spark/pull/7357#discussion_r38063538
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala ---
@@ -692,6 +744,29 @@ class SparseVector @Since("1.0.0") (
 
   private[spark] override def toBreeze: BV[Double] = new BSV[Double](indices, values, size)
 
+  private[spark] override def mapActiveValues(f: Double => Double): Vector =
+    new SparseVector(size, indices.clone(), values.map(f))
+
+  private[spark] def mapActivePairs(f: (Int, Double) => Double): Vector = {
--- End diff --

I am learning Scala. Could you please explain how this maps only the active 
pairs? What is happening on line 756?
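
As a rough illustration of the idea behind the question, here is a toy sparse vector in Python (not Scala, and not the PR's implementation). The `map_active_values` method mirrors the one-line body shown in the diff: apply `f` to the stored values and keep the indices. The diff shows only the signature of `mapActivePairs`, so the `map_active_pairs` body below is an assumption about what such a method would do.

```python
from typing import Callable, List


class SparseVector:
    """Toy sparse vector: only explicitly stored entries are kept."""

    def __init__(self, size: int, indices: List[int], values: List[float]) -> None:
        if len(indices) != len(values):
            raise ValueError("indices and values must have the same length")
        self.size = size
        self.indices = list(indices)  # copy, so vectors never share storage
        self.values = list(values)

    def map_active_values(self, f: Callable[[float], float]) -> "SparseVector":
        # f sees only the stored values; implicit zeros are never visited.
        return SparseVector(self.size, self.indices, [f(v) for v in self.values])

    def map_active_pairs(self, f: Callable[[int, float], float]) -> "SparseVector":
        # Assumed behavior: f sees each stored (index, value) pair and the
        # set of stored indices is left unchanged.
        new_values = [f(i, v) for i, v in zip(self.indices, self.values)]
        return SparseVector(self.size, self.indices, new_values)

    def to_dense(self) -> List[float]:
        dense = [0.0] * self.size
        for i, v in zip(self.indices, self.values):
            dense[i] = v
        return dense


v = SparseVector(5, [1, 3], [2.0, 4.0])
doubled = v.map_active_values(lambda x: x * 2)
shifted = v.map_active_pairs(lambda i, x: x + i)
plus_one = v.map_active_values(lambda x: x + 1)
```

"Active" here means "explicitly stored". Because the function is applied only to `values`, positions that are implicitly zero stay zero even when `f(0)` is nonzero: `plus_one.to_dense()` has zeros at positions 0, 2, and 4 rather than ones.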



[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8476#issuecomment-135297803
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8476#issuecomment-135297810
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41667/
Test FAILed.



[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8476#issuecomment-135297607
  
  [Test build #41667 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41667/console) for PR 8476 at commit [`cafe301`](https://github.com/apache/spark/commit/cafe3013fc8ce8f94cac82eb6378bd8a5c609409).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARKR] [SPARK-10219] Fix varargsToEnv and ad...

2015-08-26 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/8475



[GitHub] spark pull request: [SPARK-8505][SparkR] Add settings to kick `lin...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7883#issuecomment-135296970
  
  [Test build #41673 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41673/consoleFull) for PR 7883 at commit [`54365fc`](https://github.com/apache/spark/commit/54365fca94fc9857f035a0773ffd6ace650105ec).



[GitHub] spark pull request: [SPARK-9679][ML][PYSPARK] Add Python API for S...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8118#issuecomment-135296539
  
Merged build started.



[GitHub] spark pull request: [SPARK-9679][ML][PYSPARK] Add Python API for S...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8118#issuecomment-135296481
  
 Merged build triggered.



[GitHub] spark pull request: [SPARKR] [SPARK-10219] Fix varargsToEnv and ad...

2015-08-26 Thread shivaram

Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8475#issuecomment-135295820
  
Thanks for taking a look. Merging this



[GitHub] spark pull request: [SPARK-8505][SparkR] Add settings to kick `lin...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7883#issuecomment-135295552
  
Merged build started.



[GitHub] spark pull request: [SPARK-8505][SparkR] Add settings to kick `lin...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7883#issuecomment-135295542
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-9964] [PySpark] [SQL] PySpark DataFrame...

2015-08-26 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/8444



[GitHub] spark pull request: [SPARK-8505][SparkR] Add settings to kick `lin...

2015-08-26 Thread yu-iskw

Github user yu-iskw commented on the pull request:

https://github.com/apache/spark/pull/7883#issuecomment-135295340
  
Jenkins, retest this please.



[GitHub] spark pull request: [SPARK-9148][SPARK-10252][SQL] Update SQL Prog...

2015-08-26 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/8441#issuecomment-135295332
  
LGTM



[GitHub] spark pull request: [SPARK-9964] [PySpark] [SQL] PySpark DataFrame...

2015-08-26 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/8444#issuecomment-135295239
  
Thanks - I've merged this in master.




[GitHub] spark pull request: [SPARK-9679][ML][PYSPARK] Add Python API for S...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8118#issuecomment-135294447
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41672/
Test FAILed.



[GitHub] spark pull request: [SPARK-9679][ML][PYSPARK] Add Python API for S...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8118#issuecomment-135294443
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-9679][ML][PYSPARK] Add Python API for S...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8118#issuecomment-135294397
  
  [Test build #41672 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41672/console) for PR 8118 at commit [`acfc9fe`](https://github.com/apache/spark/commit/acfc9fe2bd2dc9903bddfe932b12123861e0aef6).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol):`




[GitHub] spark pull request: [SPARK-9879][SQL][WIP] Fix OOM in Limit clause...

2015-08-26 Thread Mageswaran1989

Github user Mageswaran1989 commented on a diff in the pull request:

https://github.com/apache/spark/pull/8128#discussion_r38062415
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala ---
@@ -192,6 +192,12 @@ private[spark] object SQLConf {
   "column based on statistics of the data.",
 isPublic = false)
 
+  val LIMIT_ROWS = longConf("spark.sql.limit.rows",
+    defaultValue = Some(10L),
+    doc = "For the LIMIT clause, put all of the output rows in a single partition " +
+      "iif the required row number less than the threshold, otherwise fetch the rows in a " +
--- End diff --

I think `iif` is a typo.



[GitHub] spark pull request: [SPARK-8505][SparkR] Add settings to kick `lin...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7883#issuecomment-135293645
  
  [Test build #41664 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41664/console) for PR 7883 at commit [`54365fc`](https://github.com/apache/spark/commit/54365fca94fc9857f035a0773ffd6ace650105ec).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-8505][SparkR] Add settings to kick `lin...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7883#issuecomment-135293673
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41664/
Test FAILed.



[GitHub] spark pull request: [SPARK-8505][SparkR] Add settings to kick `lin...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7883#issuecomment-135293672
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-9679][ML][PYSPARK] Add Python API for S...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8118#issuecomment-135291919
  
  [Test build #41672 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41672/consoleFull) for PR 8118 at commit [`acfc9fe`](https://github.com/apache/spark/commit/acfc9fe2bd2dc9903bddfe932b12123861e0aef6).



[GitHub] spark pull request: [SPARK-9679][ML][PYSPARK] Add Python API for S...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8118#issuecomment-135289365
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-9679][ML][PYSPARK] Add Python API for S...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8118#issuecomment-135289408
  
Merged build started.



[GitHub] spark pull request: [SPARK-9679][ML][PYSPARK] Add Python API for S...

2015-08-26 Thread holdenk

Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/8118#discussion_r38061775
  
--- Diff: python/pyspark/ml/feature.py ---
@@ -818,6 +818,76 @@ class StringIndexerModel(JavaModel):
     Model fitted by StringIndexer.
     """
 
+class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol):
+    """
+    .. note:: Experimental
+
+    A feature transformer that filters out stop words from input.
+    Note: null values from input array are preserved unless adding null to stopWords explicitly.
+    """
+    # a placeholder to make the stopwords show up in generated doc
+    stopWords = Param(Params._dummy(), "stopWords", "The words to be filtered out")
+    caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do a case sensitive " +
+                          "comparison over the stop words")
+
+    @keyword_only
+    def __init__(self, inputCol=None, outputCol=None, stopWords=None,
+                 caseSensitive=False):
+        """
+        __init__(self, inputCol=None, outputCol=None, stopWords=None,
+                 caseSensitive=false)
+        """
+        super(StopWordsRemover, self).__init__()
+        self._java_obj = self._new_java_obj("org.apache.spark.ml.feature.StopWordsRemover",
+                                            self.uid)
+        self.stopWords = Param(self, "stopWords", "The words to be filtered out")
+        self.caseSensitive = Param(self._dummy(), "caseSensitive", "whether to do a case " +
+                                   "sensitive comparison over the stop words")
+        stopWordsObj = _jvm().org.apache.spark.ml.feature.StopWords
+        defaultStopWords = stopWordsObj.ENGLISH_STOP_WORDS()
+        print "Constructing java param pair for value "+str(defaultStopWords)
--- End diff --

oh no, I was checking the type when debugging something
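
For readers following the docstring in the diff above, the filtering behavior it describes (an optional case-sensitive comparison, with nulls kept unless null is itself listed as a stop word) can be sketched in plain Python. This is only an illustration under those stated assumptions, not the Spark implementation; `remove_stop_words` is a hypothetical helper and the stop-word list is made up.

```python
def remove_stop_words(tokens, stop_words, case_sensitive=False):
    """Hypothetical helper: drop tokens that appear in stop_words.

    None entries are kept unless None is itself listed in stop_words,
    mirroring the note in the docstring under review.
    """
    drop_none = None in stop_words
    words = [w for w in stop_words if w is not None]
    if case_sensitive:
        lookup = set(words)
    else:
        # Compare lowercased forms when case is to be ignored.
        lookup = set(w.lower() for w in words)

    kept = []
    for token in tokens:
        if token is None:
            if not drop_none:
                kept.append(token)
            continue
        key = token if case_sensitive else token.lower()
        if key not in lookup:
            kept.append(token)
    return kept


tokens = ["The", "quick", None, "fox", "the"]
print(remove_stop_words(tokens, ["the"]))
print(remove_stop_words(tokens, ["the"], case_sensitive=True))
print(remove_stop_words(tokens, ["the", None]))
```

The three calls show the default case-insensitive match, the case-sensitive match (which keeps "The"), and the effect of adding None to the stop-word list.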



[GitHub] spark pull request: [SPARK-10311]Reload appId and attemptId when a...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8477#issuecomment-135289064
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-10311]Reload appId and attemptId when a...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8477#issuecomment-135289065
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41670/
Test PASSed.



[GitHub] spark pull request: [SPARK-10311]Reload appId and attemptId when a...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8477#issuecomment-135289012
  
  [Test build #41670 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41670/console)
 for   PR 8477 at commit 
[`3211a68`](https://github.com/apache/spark/commit/3211a68039b9886e31e6aabf00d6de335f81b4f6).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-9170][SQL] Instead of StandardStructObj...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7520#issuecomment-135286827
  
  [Test build #41671 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41671/consoleFull)
 for   PR 7520 at commit 
[`055cd09`](https://github.com/apache/spark/commit/055cd09a09fff47cf43578a19ac78b77610231ce).



[GitHub] spark pull request: [SPARK-9170][SQL] Instead of StandardStructObj...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7520#issuecomment-135286166
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-9170][SQL] Instead of StandardStructObj...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7520#issuecomment-135286177
  
Merged build started.



[GitHub] spark pull request: [SPARK-9170][SQL] Instead of StandardStructObj...

2015-08-26 Thread viirya

Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/7520#issuecomment-135285964
  
* This patch doesn't claim a performance improvement. I know that
`OrcStructInspector` actually does much the same thing as `StructObjectInspector`.
This patch is intended only to deal with the lowercase problem.

* Although Hive is a case-insensitive system and we just use lowercase names in
queries, I wonder whether we need to normalise the column names to lowercase
when persisting them (for example, in ORC). Looking at `OrcStructInspector`,
I don't see it perform that normalisation automatically. We also keep the
case-insensitive behavior when querying (I will add that to the unit test).
In other words, this patch only prevents the schema from being modified
automatically when serialising to ORC files.
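
The distinction drawn above, between storing column names as written and resolving them without regard to case, can be sketched in plain Python. This is a toy model for illustration only and does not use the Hive or ORC APIs; `CasePreservingSchema` and the column names are hypothetical.

```python
class CasePreservingSchema(object):
    """Hypothetical schema: stores names as written, resolves them ignoring case."""

    def __init__(self, column_names):
        # Names are kept exactly as given, as they would be persisted.
        self.column_names = list(column_names)
        self._index = {}
        for position, name in enumerate(self.column_names):
            # First occurrence wins if two names differ only by case.
            self._index.setdefault(name.lower(), position)

    def resolve(self, name):
        """Return the position of a column, matching case-insensitively."""
        try:
            return self._index[name.lower()]
        except KeyError:
            raise KeyError("no such column: %s" % name)


schema = CasePreservingSchema(["userId", "EventTime"])
print(schema.column_names)
print(schema.resolve("userid"))
print(schema.resolve("EVENTTIME"))
```

Lookups succeed for any casing of a name, while `column_names` still holds the original spelling.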




[GitHub] spark pull request: [SPARK-10104][SQL] Consolidate different forms...

2015-08-26 Thread yhuai

Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/8453#issuecomment-135285913
  
@cloud-fan Thank you for working on it. Since we are pretty close to the 1.5
release, let's wait until the release and get this merged at the beginning of
the 1.6 release cycle.



[GitHub] spark pull request: [SPARK-7544] [SQL] [PySpark] pyspark.sql.types...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8333#issuecomment-135285441
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-7544] [SQL] [PySpark] pyspark.sql.types...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8333#issuecomment-135285443
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41669/
Test PASSed.



[GitHub] spark pull request: test

2015-08-26 Thread hustnn

Github user hustnn closed the pull request at:

https://github.com/apache/spark/pull/8478



[GitHub] spark pull request: [SPARK-7544] [SQL] [PySpark] pyspark.sql.types...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8333#issuecomment-135285397
  
  [Test build #41669 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41669/console)
 for   PR 8333 at commit 
[`209e94b`](https://github.com/apache/spark/commit/209e94b8bfaed7063d79266b442db893b2e5).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: test

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8478#issuecomment-135285391
  
Can one of the admins verify this patch?



[GitHub] spark pull request: test

2015-08-26 Thread hustnn

GitHub user hustnn opened a pull request:

https://github.com/apache/spark/pull/8478

test



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hustnn/spark-adaptive-scheduling master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8478.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8478


commit 6e294733bf871a14b515ff502271f2aa04db9647
Author: hustnn 
Date:   2015-08-27T03:56:04Z

record size of each reduce task

commit b86c42b80d23b195ad3bc9a186290ffa7f817421
Author: hustnn 
Date:   2015-08-27T03:59:37Z

Revert "record size of each reduce task"

This reverts commit 6e294733bf871a14b515ff502271f2aa04db9647.

commit be9db41f77228762ba01492b0f986435dd87d790
Author: hustnn 
Date:   2015-08-27T04:00:55Z

record size of each reduce partition





[GitHub] spark pull request: [SPARK-10311]Reload appId and attemptId when a...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8477#issuecomment-135284113
  
  [Test build #41670 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41670/consoleFull)
 for   PR 8477 at commit 
[`3211a68`](https://github.com/apache/spark/commit/3211a68039b9886e31e6aabf00d6de335f81b4f6).



[GitHub] spark pull request: [SPARK-9964] [PySpark] [SQL] PySpark DataFrame...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8444#issuecomment-135283573
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-9964] [PySpark] [SQL] PySpark DataFrame...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8444#issuecomment-135283336
  
  [Test build #41668 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41668/console)
 for   PR 8444 at commit 
[`b2d072d`](https://github.com/apache/spark/commit/b2d072d4b089646591deb27eca97f989c4c5be7b).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-9964] [PySpark] [SQL] PySpark DataFrame...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8444#issuecomment-135283576
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41668/
Test PASSed.



[GitHub] spark pull request: [SPARKR] [SPARK-10219] Fix varargsToEnv and ad...

2015-08-26 Thread davies

Github user davies commented on the pull request:

https://github.com/apache/spark/pull/8475#issuecomment-135283021
  
I forgot that; the current approach looks much better.



[GitHub] spark pull request: [SPARK-9793] [MLlib] [PySpark] PySpark DenseVe...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8166#issuecomment-135282553
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-9793] [MLlib] [PySpark] PySpark DenseVe...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8166#issuecomment-135282555
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41666/
Test PASSed.



[GitHub] spark pull request: [SPARK-9793] [MLlib] [PySpark] PySpark DenseVe...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8166#issuecomment-135282290
  
  [Test build #41666 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41666/console)
 for   PR 8166 at commit 
[`d63d54e`](https://github.com/apache/spark/commit/d63d54e835d82cac279b9a6b896b99f03c073ef8).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10311]Reload appId and attemptId when a...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8477#issuecomment-135282056
  
Merged build started.



[GitHub] spark pull request: [SPARK-10311]Reload appId and attemptId when a...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8477#issuecomment-135282013
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-7544] [SQL] [PySpark] pyspark.sql.types...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8333#issuecomment-135280626
  
  [Test build #41669 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41669/consoleFull)
 for   PR 8333 at commit 
[`209e94b`](https://github.com/apache/spark/commit/209e94b8bfaed7063d79266b442db893b2e5).



[GitHub] spark pull request: [SPARK-10311]Reload appId and attemptId when a...

2015-08-26 Thread XuTingjun

GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/8477

[SPARK-10311]Reload appId and attemptId when a new ApplicationMaster registers



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark streaming-attempt

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8477.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8477


commit 3211a68039b9886e31e6aabf00d6de335f81b4f6
Author: xutingjun 
Date:   2015-08-27T03:31:03Z

reload appId and attemptId when AM is new





[GitHub] spark pull request: [SPARK-10300] [build] [tests] Add support for ...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8437#issuecomment-135279946
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41658/
Test PASSed.



[GitHub] spark pull request: [SPARK-7544] [SQL] [PySpark] pyspark.sql.types...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8333#issuecomment-135279912
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10300] [build] [tests] Add support for ...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8437#issuecomment-135279945
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-7544] [SQL] [PySpark] pyspark.sql.types...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8333#issuecomment-135279919
  
Merged build started.



[GitHub] spark pull request: [SPARK-10300] [build] [tests] Add support for ...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8437#issuecomment-135279884
  
  [Test build #41658 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41658/console)
 for   PR 8437 at commit 
[`090e1d4`](https://github.com/apache/spark/commit/090e1d4f67c89b9f272ef80e0c1a76a836813bb1).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-7544] [SQL] [PySpark] pyspark.sql.types...

2015-08-26 Thread yanboliang

Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/8333#discussion_r38059443
  
--- Diff: python/pyspark/sql/types.py ---
@@ -1233,6 +1235,19 @@ def __call__(self, *args):
         """create new Row object"""
         return _create_row(self, args)
 
+    def __getitem__(self, item):
+        if isinstance(item, (int, slice)):
+            return super(Row, self).__getitem__(item)
+        try:
+            # it will be slow when it has many fields,
+            # but this will not be used in normal cases
+            idx = self.__fields__.index(item)
+            return super(Row, self).__getitem__(idx)
+        except IndexError:
+            raise AttributeError(item)
--- End diff --

I agree.
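
As a point of reference for the lookup-by-name pattern in the diff above, here is a minimal standalone sketch of a tuple subclass that accepts both positions and field names. It is not the PySpark `Row` class; `NamedRow` is a hypothetical stand-in. One detail worth noting is that in plain Python `list.index` signals a missing value with `ValueError`, so that is the exception the sketch catches, and the choice to translate it into `KeyError` is an assumption made for this example.

```python
class NamedRow(tuple):
    """Hypothetical stand-in for a row: a tuple that also knows its field names."""

    def __new__(cls, **kwargs):
        names = sorted(kwargs)
        row = tuple.__new__(cls, [kwargs[n] for n in names])
        row.__fields__ = names
        return row

    def __getitem__(self, item):
        # Positions and slices go straight to the tuple.
        if isinstance(item, (int, slice)):
            return super(NamedRow, self).__getitem__(item)
        try:
            # Linear scan: slow with many fields, fine for occasional use.
            idx = self.__fields__.index(item)
        except ValueError:
            # list.index raises ValueError for a missing value.
            raise KeyError(item)
        return super(NamedRow, self).__getitem__(idx)


row = NamedRow(name="Alice", age=11)
print(row[0])
print(row["name"])
```

Fields are stored in sorted name order here, so position 0 holds `age` and `row["name"]` resolves to position 1.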



[GitHub] spark pull request: [SPARK-10300] [build] [tests] Add support for ...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8437#issuecomment-135278321
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41657/
Test PASSed.



[GitHub] spark pull request: [SPARK-10300] [build] [tests] Add support for ...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8437#issuecomment-135278319
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-10300] [build] [tests] Add support for ...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8437#issuecomment-135277681
  
  [Test build #41657 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41657/console)
 for   PR 8437 at commit 
[`131e658`](https://github.com/apache/spark/commit/131e6586cefa80827221fdd579838ed32f6c412d).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-9964] [PySpark] [SQL] PySpark DataFrame...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8444#issuecomment-135277607
  
  [Test build #41668 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41668/consoleFull)
 for   PR 8444 at commit 
[`b2d072d`](https://github.com/apache/spark/commit/b2d072d4b089646591deb27eca97f989c4c5be7b).



[GitHub] spark pull request: [SPARK-9793] [MLlib] [PySpark] PySpark DenseVe...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8166#issuecomment-135277188
  
  [Test build #41666 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41666/consoleFull)
 for   PR 8166 at commit 
[`d63d54e`](https://github.com/apache/spark/commit/d63d54e835d82cac279b9a6b896b99f03c073ef8).



[GitHub] spark pull request: [SPARK-9964] [PySpark] [SQL] PySpark DataFrame...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8444#issuecomment-135276021
  
Merged build started.



[GitHub] spark pull request: [SPARK-9964] [PySpark] [SQL] PySpark DataFrame...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8444#issuecomment-135275967
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8476#issuecomment-135274867
  
  [Test build #41667 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41667/consoleFull)
 for   PR 8476 at commit 
[`cafe301`](https://github.com/apache/spark/commit/cafe3013fc8ce8f94cac82eb6378bd8a5c609409).



[GitHub] spark pull request: [SPARK-9793] [MLlib] [PySpark] PySpark DenseVe...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8166#issuecomment-135273565
  
Merged build started.



[GitHub] spark pull request: [SPARK-9793] [MLlib] [PySpark] PySpark DenseVe...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8166#issuecomment-135273520
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8476#issuecomment-135273566
  
Merged build started.



[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8476#issuecomment-135273518
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-08-26 Thread zhichao-li

GitHub user zhichao-li opened a pull request:

https://github.com/apache/spark/pull/8476

[SPARK-10310][SQL]Using \t as the field delimiter and \n as the line 
delimiter

Currently we use `LazySimpleSerDe` to serialize the script input by 
default, but it uses '\001' as the field delimiter, which is not the same as Hive.
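
The mismatch described above can be sketched in a few lines. This is only a hypothetical illustration, not Spark's or Hive's serialization code: `serialize_row` and `script_parse` are made-up helpers, and the assumption is that the user script splits each input line on tabs.

```python
# Hypothetical illustration; not Spark's or Hive's actual serialization code.
FIELD_DELIM_CTRL_A = "\x01"  # '\001', the delimiter the PR description says is used today
FIELD_DELIM_TAB = "\t"       # field delimiter proposed in the PR title
LINE_DELIM = "\n"            # line delimiter proposed in the PR title


def serialize_row(fields, field_delim, line_delim=LINE_DELIM):
    """Join one row's fields with field_delim and terminate the row with line_delim."""
    return field_delim.join(fields) + line_delim


def script_parse(line):
    """Assumed behavior of a user script: drop the newline, then split on tabs."""
    return line.rstrip("\n").split("\t")


row = ["alice", "30"]
ctrl_a_fields = script_parse(serialize_row(row, FIELD_DELIM_CTRL_A))
tab_fields = script_parse(serialize_row(row, FIELD_DELIM_TAB))

# With '\001' the script sees a single unsplit field; with '\t' it sees both.
print(len(ctrl_a_fields), len(tab_fields))
```

Under that assumption, a tab-splitting script only recovers the original columns when the input is serialized with '\t'.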

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhichao-li/spark delim

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8476.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8476


commit cafe3013fc8ce8f94cac82eb6378bd8a5c609409
Author: zhichao.li 
Date:   2015-08-27T03:12:29Z

tab as the field delimiter and 10 as the line delimiter





[GitHub] spark pull request: [SPARK-8673] [launcher] API and infrastructure...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7052#issuecomment-135272582
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41656/
Test PASSed.



[GitHub] spark pull request: [SPARK-8673] [launcher] API and infrastructure...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7052#issuecomment-135272348
  
  [Test build #41656 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41656/console)
 for   PR 7052 at commit 
[`8608f56`](https://github.com/apache/spark/commit/8608f561be5a2ea3bc8981e9756ae127f4bcdd98).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class ChildProcAppHandle implements SparkAppHandle `
  * `abstract class LauncherConnection implements Closeable, Runnable `
  * `final class LauncherProtocol `
  * `  static class Message implements Serializable `
  * `  static class Hello extends Message `
  * `  static class SetAppId extends Message `
  * `  static class SetState extends Message `
  * `  static class Stop extends Message `
  * `class LauncherServer implements Closeable `
  * `class NamedThreadFactory implements ThreadFactory `
  * `class OutputRedirector `




[GitHub] spark pull request: [SPARK-8673] [launcher] API and infrastructure...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7052#issuecomment-135272579
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-9910][ML]User guide for train validatio...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8377#issuecomment-135270185
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-7685][ML] Apply weights to different sa...

2015-08-26 Thread dbtsai

Github user dbtsai commented on a diff in the pull request:

https://github.com/apache/spark/pull/7884#discussion_r38058812
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
 ---
@@ -218,31 +217,59 @@ class LogisticRegression(override val uid: String)
 
   override def getThreshold: Double = super.getThreshold
 
+  /**
+   * Whether to over-/undersamples each of training sample according to 
the given
+   * weight in `weightCol`. If empty, all samples are supposed to have 
weights as 1.0.
+   * Default is empty, so all samples have weight one.
+   * @group setParam
+   */
+  def setWeightCol(value: String): this.type = set(weightCol, value)
+  setDefault(weightCol -> "")
+
   override def setThresholds(value: Array[Double]): this.type = 
super.setThresholds(value)
 
   override def getThresholds: Array[Double] = super.getThresholds
 
   override protected def train(dataset: DataFrame): 
LogisticRegressionModel = {
 // Extract columns from data.  If dataset is persisted, do not persist 
oldDataset.
-val instances = extractLabeledPoints(dataset).map {
-  case LabeledPoint(label: Double, features: Vector) => (label, 
features)
-}
+val instances: Either[RDD[(Double, Vector)], RDD[(Double, Double, 
Vector)]] =
+  if ($(weightCol).isEmpty) {
+Left(dataset.select($(labelCol), $(featuresCol)).map {
+  case Row(label: Double, features: Vector) => (label, features)
+})
+  } else {
+Right(dataset.select($(labelCol), $(weightCol), 
$(featuresCol)).map {
+  case Row(label: Double, weight: Double, features: Vector) =>
+(label, weight, features)
+})
+  }
+
 val handlePersistence = dataset.rdd.getStorageLevel == 
StorageLevel.NONE
-if (handlePersistence) instances.persist(StorageLevel.MEMORY_AND_DISK)
-
-val (summarizer, labelSummarizer) = instances.treeAggregate(
-  (new MultivariateOnlineSummarizer, new MultiClassSummarizer))(
-seqOp = (c, v) => (c, v) match {
-  case ((summarizer: MultivariateOnlineSummarizer, 
labelSummarizer: MultiClassSummarizer),
-  (label: Double, features: Vector)) =>
-(summarizer.add(features), labelSummarizer.add(label))
-},
-combOp = (c1, c2) => (c1, c2) match {
-  case ((summarizer1: MultivariateOnlineSummarizer,
-  classSummarizer1: MultiClassSummarizer), (summarizer2: 
MultivariateOnlineSummarizer,
-  classSummarizer2: MultiClassSummarizer)) =>
-(summarizer1.merge(summarizer2), 
classSummarizer1.merge(classSummarizer2))
-  })
+if (handlePersistence) instances.fold(identity, 
identity).persist(StorageLevel.MEMORY_AND_DISK)
+
+val (summarizer, labelSummarizer) = {
+  val combOp = (c1: (MultivariateOnlineSummarizer, 
MultiClassSummarizer),
+c2: (MultivariateOnlineSummarizer, MultiClassSummarizer)) =>
+  (c1._1.merge(c2._1), c1._2.merge(c2._2))
+
+  instances match {
--- End diff --

This is not working due to some type issue.
https://cloud.githubusercontent.com/assets/1134574/9511773/f3b99dc6-4c2e-11e5-8d6e-e421907ebf41.png




[GitHub] spark pull request: [SPARK-9910][ML]User guide for train validatio...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8377#issuecomment-135269856
  
  [Test build #41665 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41665/console)
 for   PR 8377 at commit 
[`1dd1cd1`](https://github.com/apache/spark/commit/1dd1cd11d7abe72973d6339e028c55ac695e8fb9).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public class JavaTrainValidationSplitExample `




[GitHub] spark pull request: [SPARK-9910][ML]User guide for train validatio...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8377#issuecomment-135270190
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41665/
Test PASSed.



[GitHub] spark pull request: [SPARK-8796][SQL] make sure SparkPlan is only ...

2015-08-26 Thread cloud-fan

Github user cloud-fan closed the pull request at:

https://github.com/apache/spark/pull/7192



[GitHub] spark pull request: [SPARK-8951][SparkR] support Unicode character...

2015-08-26 Thread sun-rui

Github user sun-rui commented on the pull request:

https://github.com/apache/spark/pull/7494#issuecomment-135265160
  
@CHOIJAEHONG1, basically LGTM. Some minor comments.



[GitHub] spark pull request: [SPARK-7685][ML] Apply weights to different sa...

2015-08-26 Thread dbtsai

Github user dbtsai commented on a diff in the pull request:

https://github.com/apache/spark/pull/7884#discussion_r38058578
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
 ---
@@ -218,31 +217,59 @@ class LogisticRegression(override val uid: String)
 
   override def getThreshold: Double = super.getThreshold
 
+  /**
+   * Whether to over-/undersamples each of training sample according to 
the given
+   * weight in `weightCol`. If empty, all samples are supposed to have 
weights as 1.0.
+   * Default is empty, so all samples have weight one.
+   * @group setParam
+   */
+  def setWeightCol(value: String): this.type = set(weightCol, value)
+  setDefault(weightCol -> "")
+
   override def setThresholds(value: Array[Double]): this.type = 
super.setThresholds(value)
 
   override def getThresholds: Array[Double] = super.getThresholds
 
   override protected def train(dataset: DataFrame): 
LogisticRegressionModel = {
 // Extract columns from data.  If dataset is persisted, do not persist 
oldDataset.
-val instances = extractLabeledPoints(dataset).map {
-  case LabeledPoint(label: Double, features: Vector) => (label, 
features)
-}
+val instances: Either[RDD[(Double, Vector)], RDD[(Double, Double, 
Vector)]] =
+  if ($(weightCol).isEmpty) {
+Left(dataset.select($(labelCol), $(featuresCol)).map {
+  case Row(label: Double, features: Vector) => (label, features)
+})
+  } else {
+Right(dataset.select($(labelCol), $(weightCol), 
$(featuresCol)).map {
+  case Row(label: Double, weight: Double, features: Vector) =>
+(label, weight, features)
+})
+  }
+
 val handlePersistence = dataset.rdd.getStorageLevel == 
StorageLevel.NONE
-if (handlePersistence) instances.persist(StorageLevel.MEMORY_AND_DISK)
-
-val (summarizer, labelSummarizer) = instances.treeAggregate(
-  (new MultivariateOnlineSummarizer, new MultiClassSummarizer))(
-seqOp = (c, v) => (c, v) match {
-  case ((summarizer: MultivariateOnlineSummarizer, 
labelSummarizer: MultiClassSummarizer),
-  (label: Double, features: Vector)) =>
-(summarizer.add(features), labelSummarizer.add(label))
-},
-combOp = (c1, c2) => (c1, c2) match {
-  case ((summarizer1: MultivariateOnlineSummarizer,
-  classSummarizer1: MultiClassSummarizer), (summarizer2: 
MultivariateOnlineSummarizer,
-  classSummarizer2: MultiClassSummarizer)) =>
-(summarizer1.merge(summarizer2), 
classSummarizer1.merge(classSummarizer2))
-  })
+if (handlePersistence) instances.fold(identity, 
identity).persist(StorageLevel.MEMORY_AND_DISK)
+
+val (summarizer, labelSummarizer) = {
+  val combOp = (c1: (MultivariateOnlineSummarizer, 
MultiClassSummarizer),
+c2: (MultivariateOnlineSummarizer, MultiClassSummarizer)) =>
+  (c1._1.merge(c2._1), c1._2.merge(c2._2))
+
+  instances match {
--- End diff --

Good point! Going to change to this style. Thanks.



[GitHub] spark pull request: [SPARK-8951][SparkR] support Unicode character...

2015-08-26 Thread sun-rui

Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/7494#discussion_r38058562
  
--- Diff: R/pkg/inst/tests/test_sparkSQL.R ---
@@ -417,6 +417,32 @@ test_that("collect() and take() on a DataFrame return 
the same number of rows an
   expect_equal(ncol(collect(df)), ncol(take(df, 10)))
 })
 
+test_that("collect() support Unicode characters", {
+  markUtf8 <- function(s) {
+Encoding(s) <- "UTF-8"
+s
+  }
+
+  lines <- c("{\"name\":\"안녕하세요\"}",
+ "{\"name\":\"您好\", \"age\":30}",
+ "{\"name\":\"こんにちは\", \"age\":19}",
+ "{\"name\":\"Xin chào\"}")
+
--- End diff --

I am still a little confused about how Unicode strings are treated 
in a non-UTF-8 locale. Why is there no need to call `markUtf8` on these Unicode strings?



[GitHub] spark pull request: [SPARK-7685][ML] Apply weights to different sa...

2015-08-26 Thread dbtsai

Github user dbtsai commented on a diff in the pull request:

https://github.com/apache/spark/pull/7884#discussion_r38058517
  
--- Diff: project/MimaExcludes.scala ---
@@ -60,6 +60,10 @@ object MimaExcludes {
   "org.apache.spark.ml.regression.LeastSquaresCostFun.this"),
 ProblemFilters.exclude[MissingMethodProblem](
   "org.apache.spark.ml.classification.LogisticCostFun.this"),
+ProblemFilters.exclude[MissingMethodProblem](
--- End diff --

That will work. But it's a private class, and it's only used here, so I don't 
think it's necessary to preserve the original signature.



[GitHub] spark pull request: [SPARK-9910][ML]User guide for train validatio...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8377#issuecomment-135263765
  
  [Test build #41665 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41665/consoleFull)
 for   PR 8377 at commit 
[`1dd1cd1`](https://github.com/apache/spark/commit/1dd1cd11d7abe72973d6339e028c55ac695e8fb9).



[GitHub] spark pull request: [SPARK-9910][ML]User guide for train validatio...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8377#issuecomment-135263522
  
Merged build started.



[GitHub] spark pull request: [SPARK-9910][ML]User guide for train validatio...

2015-08-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8377#issuecomment-135263511
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-6763][SQL] Add CountMinSketch to DataFr...

2015-08-26 Thread viirya

Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/6416#issuecomment-135263455
  
@rxin Is it good to have this as an aggregation function too? If so, I will 
update it too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-8505][SparkR] Add settings to kick `lin...

2015-08-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7883#issuecomment-135263294
  
  [Test build #41664 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41664/consoleFull)
 for   PR 7883 at commit 
[`54365fc`](https://github.com/apache/spark/commit/54365fca94fc9857f035a0773ffd6ace650105ec).


