date:20160518

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13045#issuecomment-220242854
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58850/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15114][SQL] Column name generated by ty...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13045#issuecomment-220242851
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-15114][SQL] Column name generated by ty...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13045#issuecomment-220242701
  
**[Test build #58850 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58850/consoleFull)** for PR 13045 at commit [`9eb6f40`](https://github.com/apache/spark/commit/9eb6f4063adaf7cda79cdf0bf2ac11414ca5c1d2).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

2016-05-18 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13135



[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

2016-05-18 Thread MLnick

Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/13135#issuecomment-220241901
  
LGTM too. Merged to master/branch-2.0. Thanks!



[GitHub] spark pull request: [SPARK-15078][SQL] Add all TPCDS 1.4 benchmark...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13188#issuecomment-220241617
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58844/
Test PASSed.



[GitHub] spark pull request: [SPARK-15078][SQL] Add all TPCDS 1.4 benchmark...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13188#issuecomment-220241614
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-15078][SQL] Add all TPCDS 1.4 benchmark...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13188#issuecomment-220241461
  
**[Test build #58844 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58844/consoleFull)** for PR 13188 at commit [`e584575`](https://github.com/apache/spark/commit/e584575bb786e77b7ea1d6de3f80ec556011d291).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  |   and i_class in('personal', 'portable', 'reference', 'self-help')`
  * `  |   and i_class in('accessories', 'classical', 'fragrances', 'pants')`
  * `  |and i_class in('personal', 'portable', 'refernece', 'self-help')`
  * `  |and i_class in('accessories', 'classical', 'fragrances', 'pants')`
  * `  |  and i_class in('wallpaper', 'parenting', 'musical'))`
  * `  |and i_class in('womens', 'birdal', 'pants'))`
  * `  i_class IN ('personal', 'portable', 'reference', 'self-help') AND`
  * `i_class IN ('accessories', 'classical', 'fragrances', 'pants') AND`
  * `  AND i_class IN ('personal', 'portable', 'refernece', 'self-help')`
  * `  AND i_class IN ('accessories', 'classical', 'fragrances', 'pants')`
  * `   i_class IN ('computers', 'stereo', 'football'))`
  * `   i_class IN ('shirts', 'birdal', 'dresses')))`



[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13189#issuecomment-220238181
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58848/
Test FAILed.



[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13189#issuecomment-220238179
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13189#issuecomment-220238104
  
**[Test build #58848 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58848/consoleFull)** for PR 13189 at commit [`8db358f`](https://github.com/apache/spark/commit/8db358f801f3dbd9f5eacf20dc10ef773c0d7ccb).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class SparkListenerDriverAccumUpdates(executionId: Long, accumUpdates: Seq[AccumulableInfo])`



[GitHub] spark pull request: [SPARK-15398][ML] Update the warning message t...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13190#issuecomment-220237794
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58851/
Test PASSed.



[GitHub] spark pull request: [SPARK-15398][ML] Update the warning message t...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13190#issuecomment-220237792
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-15398][ML] Update the warning message t...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13190#issuecomment-220237703
  
**[Test build #58851 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58851/consoleFull)** for PR 13190 at commit [`c6f3244`](https://github.com/apache/spark/commit/c6f324459204ab791ea1b7fa409080105a0301ee).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-11827] [SQL] Adding java.math.BigIntege...

2016-05-18 Thread kevinyu98

Github user kevinyu98 commented on the pull request:

https://github.com/apache/spark/pull/10125#issuecomment-220237290
  
@cloud-fan I tried, and it still fails. It didn't go through the 
createDataFrame you added in SparkSession. 
Instead it went through this overload: 
createDataFrame(data: java.util.List[_], beanClass: Class[_]): DataFrame 
-> val rows = SQLContext.beansToRows(data.asScala.iterator, beanInfo, attrSeq)

beansToRows comes from SQLContext, and it creates the internal rows. 

Should we add RowEncoder into the beansToRows call, or leave the code as it 
is? Thanks.

Here is the trace:

scala.MatchError: 1234567 (of class java.math.BigInteger)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$DecimalConverter.toCatalystImpl(CatalystTypeConverters.scala:326)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$DecimalConverter.toCatalystImpl(CatalystTypeConverters.scala:323)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:102)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$2.apply(CatalystTypeConverters.scala:401)
    at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:892)
    at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:892)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
    at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1.apply(SQLContext.scala:892)
    at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1.apply(SQLContext.scala:890)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
    at scala.collection.Iterator$class.toStream(Iterator.scala:1322)
    at scala.collection.AbstractIterator.toStream(Iterator.scala:1336)
    at scala.collection.TraversableOnce$class.toSeq(TraversableOnce.scala:298)
    at scala.collection.AbstractIterator.toSeq(Iterator.scala:1336)
    at org.apache.spark.sql.SparkSession.createDataFrame(SparkSession.scala:373)
    at test.org.apache.spark.sql.JavaDataFrameSuite.testCreateDataFrameFromLocalJavaBeans(JavaDataFrameSuite.java:200)
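
For readers unfamiliar with this kind of failure, here is a minimal sketch of how a `scala.MatchError` like the one above can arise. It is not the Spark source: `DecimalMatchSketch`, `narrow`, and `widened` are hypothetical names, and plain `scala.math.BigDecimal` stands in for Spark's internal decimal type. It only assumes that the converter pattern-matches on the runtime type of its input and has no case for `java.math.BigInteger`.

```scala
import java.math.{BigDecimal => JBigDecimal, BigInteger => JBigInteger}

// Hypothetical stand-in for a decimal converter; NOT the Spark implementation.
object DecimalMatchSketch {
  // Handles only decimal inputs, so a BigInteger falls through to MatchError.
  def narrow(value: Any): BigDecimal = value match {
    case d: BigDecimal  => d
    case d: JBigDecimal => BigDecimal(d)
  }

  // Same match with extra cases for big integers, widened to a scale-0 decimal.
  def widened(value: Any): BigDecimal = value match {
    case d: BigDecimal  => d
    case d: JBigDecimal => BigDecimal(d)
    case i: JBigInteger => BigDecimal(new JBigDecimal(i))
    case i: BigInt      => BigDecimal(i)
  }

  def main(args: Array[String]): Unit = {
    val input = new JBigInteger("1234567")
    // The non-exhaustive match throws at runtime; catch it to report the outcome.
    val failed = try { narrow(input); false } catch { case _: MatchError => true }
    println(s"narrow throws MatchError: $failed")
    println(s"widened result: ${widened(input)}")
  }
}
```

Whether the extra cases belong in the converter or the bean path should use an encoder instead is exactly the open question in the comment above; the sketch only illustrates the failure mode.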



[GitHub] spark pull request: [SPARK-8603][SPARKR] Incorrect file separator ...

2016-05-18 Thread sun-rui

Github user sun-rui commented on the pull request:

https://github.com/apache/spark/pull/13165#issuecomment-220237304
  
@felixcheung, this issue seems to relate to system2() only. However, let's 
wait for HyukjinKwon's test result.

@HyukjinKwon, great, go ahead please.



[GitHub] spark pull request: [SPARK-15398][ML] Update the warning message t...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13190#issuecomment-220236395
  
**[Test build #58851 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58851/consoleFull)** for PR 13190 at commit [`c6f3244`](https://github.com/apache/spark/commit/c6f324459204ab791ea1b7fa409080105a0301ee).



[GitHub] spark pull request: [SPARK-15398][ML] Update the warning message t...

2016-05-18 Thread zhengruifeng

GitHub user zhengruifeng opened a pull request:

https://github.com/apache/spark/pull/13190

[SPARK-15398][ML] Update the warning message to recommend ML usage

## What changes were proposed in this pull request?
MLlib is no longer recommended for use, and some of its methods are even deprecated.
Update the warning message to recommend ML usage. Change
```
  def showWarning() {
    System.err.println(
      """WARN: This is a naive implementation of Logistic Regression and is given as an example!
        |Please use either org.apache.spark.mllib.classification.LogisticRegressionWithSGD or
        |org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
        |for more conventional use.
      """.stripMargin)
  }
```
to
```
  def showWarning() {
    System.err.println(
      """WARN: This is a naive implementation of Logistic Regression and is given as an example!
        |Please use org.apache.spark.ml.classification.LogisticRegression
        |for more conventional use.
      """.stripMargin)
  }
```


## How was this patch tested?
local build




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhengruifeng/spark update_recd

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13190.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13190


commit c6f324459204ab791ea1b7fa409080105a0301ee
Author: Zheng RuiFeng 
Date:   2016-05-19T06:00:59Z

create pr





[GitHub] spark pull request: [SPARK-15078][SQL] Add all TPCDS 1.4 benchmark...

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13188#discussion_r63826284
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/tpcds/TPCDSQueryBenchmark.scala ---
@@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.parquet.tpcds
+
+import java.io.File
+
+import org.apache.spark.{SparkConf, SparkContext}
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation
+import org.apache.spark.sql.catalyst.util._
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.util.Benchmark
+
+/**
+ * Benchmark to measure TPCDS query performance.
+ * To run this:
+ *  spark-submit --class  --jars 
+ */
+object TPCDSQueryBenchmark {
+  val conf = new SparkConf()
+  conf.set("spark.sql.parquet.compression.codec", "snappy")
+  conf.set("spark.sql.shuffle.partitions", "4")
+  conf.set("spark.driver.memory", "3g")
+  conf.set("spark.executor.memory", "3g")
+  conf.set("spark.sql.autoBroadcastJoinThreshold", (20 * 1024 * 1024).toString)
+
+  val sc = new SparkContext("local[1]", "test-sql-context", conf)
+  val sqlContext = new SQLContext(sc)
--- End diff --

Hi, @sameeragarwal !
This PR looks great. By the way, could you update lines 36~44 to use the new 
`SparkSession` builder pattern?



[GitHub] spark pull request: [SPARK-14939][SQL] Add FoldablePropagation opt...

Github user dongjoon-hyun commented on the pull request:

https://github.com/apache/spark/pull/12719#issuecomment-220233909
  
@cloud-fan, it's ready again now.
Could you merge this PR?



[GitHub] spark pull request: [SPARK-15114][SQL] Column name generated by ty...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13045#issuecomment-220233152
  
**[Test build #58850 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58850/consoleFull)** for PR 13045 at commit [`9eb6f40`](https://github.com/apache/spark/commit/9eb6f4063adaf7cda79cdf0bf2ac11414ca5c1d2).



[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13189#issuecomment-220233150
  
**[Test build #58848 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58848/consoleFull)** for PR 13189 at commit [`8db358f`](https://github.com/apache/spark/commit/8db358f801f3dbd9f5eacf20dc10ef773c0d7ccb).



[GitHub] spark pull request: [SPARK-15397] [SQL] fix string udf locate as h...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13186#issuecomment-220233159
  
**[Test build #58849 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58849/consoleFull)** for PR 13186 at commit [`ac3aa33`](https://github.com/apache/spark/commit/ac3aa334b59d430ea7c239c706ed7e490af5f0b2).



[GitHub] spark pull request: [SPARK-14346][SQL] Lists unsupported Hive feat...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13173#issuecomment-220232881
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-14346][SQL] Lists unsupported Hive feat...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13173#issuecomment-220232882
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58845/
Test FAILed.



[GitHub] spark pull request: [SPARK-14346][SQL] Lists unsupported Hive feat...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13173#issuecomment-220232874
  
**[Test build #58845 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58845/consoleFull)** for PR 13173 at commit [`ca22d71`](https://github.com/apache/spark/commit/ca22d7102537bd7411f37aa957f877802ebd6d17).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

2016-05-18 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/13189#issuecomment-220232797
  
cc @andrewor14 @davies 



[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

2016-05-18 Thread cloud-fan

GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/13189

[SPARK-14670][SQL][WIP] allow updating driver side sql metrics

## What changes were proposed in this pull request?

On the SparkUI right now we have the SQLTab, which displays accumulator 
values per operator. However, it only displays metrics updated on the 
executors, not on the driver. It is useful to also include driver metrics, e.g. 
broadcast time.

This is a different version from 
https://github.com/apache/spark/pull/12427. This PR sends driver side 
accumulator updates right after the update happens, not at the end of 
execution. But it has some drawbacks:

1. If there is no update, we won't send zero-value updates, so in the web UI 
the operator will be empty, with no metrics info displayed.
2. We need to trigger the event explicitly, which is not as simple as just 
updating the accumulator.
3. It may be hard to use inside whole stage codegen.
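
To make drawbacks 1 and 2 concrete, here is a toy sketch of the "post an event right after each driver-side update" idea. It does not use Spark's API: `DriverAccumUpdates` and `MetricsListener` are hypothetical stand-ins (the event is only loosely modeled on the `SparkListenerDriverAccumUpdates` case class this patch adds), and accumulator updates are simplified to `(accumulatorId, value)` pairs.

```scala
import scala.collection.mutable

// Hypothetical event: accumulator updates are simplified to (accumulatorId, value) pairs.
final case class DriverAccumUpdates(executionId: Long, updates: Seq[(Long, Long)])

// Toy stand-in for the UI listener: keeps the latest value per (execution, accumulator).
final class MetricsListener {
  private val values = mutable.Map.empty[(Long, Long), Long]

  def onEvent(event: DriverAccumUpdates): Unit =
    event.updates.foreach { case (accumId, value) =>
      values((event.executionId, accumId)) = value
    }

  // None means the UI has nothing to display for this metric.
  def metric(executionId: Long, accumId: Long): Option[Long] =
    values.get((executionId, accumId))
}

object DriverMetricsSketch {
  def main(args: Array[String]): Unit = {
    val listener = new MetricsListener
    val broadcastTimeId = 1L
    val neverUpdatedId = 2L

    // Drawback 2: updating the value is not enough, the event must be posted explicitly.
    val broadcastTimeMs = 42L
    listener.onEvent(DriverAccumUpdates(0L, Seq(broadcastTimeId -> broadcastTimeMs)))

    println(listener.metric(0L, broadcastTimeId)) // visible right after the update
    // Drawback 1: a metric that never changes is never sent, so nothing is stored for it.
    println(listener.metric(0L, neverUpdatedId))
  }
}
```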

## How was this patch tested?

TODO





You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark metrics

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13189.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13189


commit 8db358f801f3dbd9f5eacf20dc10ef773c0d7ccb
Author: Wenchen Fan 
Date:   2016-05-19T05:36:34Z

allow updating driver side sql metrics





[GitHub] spark pull request: [SPARK-15397] [SQL] fix string udf locate as h...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13186#issuecomment-220232622
  
**[Test build #58847 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58847/consoleFull)** for PR 13186 at commit [`5bcef84`](https://github.com/apache/spark/commit/5bcef84700bd4ec51097e58bea099ded54334a59).



[GitHub] spark pull request: [SPARK-14346][SQL] Lists unsupported Hive feat...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13173#issuecomment-220232037
  
**[Test build #58845 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58845/consoleFull)** for PR 13173 at commit [`ca22d71`](https://github.com/apache/spark/commit/ca22d7102537bd7411f37aa957f877802ebd6d17).



[GitHub] spark pull request: [SPARK-15308][SQL] RowEncoder should preserve ...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13090#issuecomment-220232042
  
**[Test build #58846 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58846/consoleFull)**
 for PR 13090 at commit 
[`698c261`](https://github.com/apache/spark/commit/698c2619dc71650ef0faac278014b539387fb273).



[GitHub] spark pull request: [SPARK-14346][SQL] Lists unsupported Hive feat...

2016-05-18 Thread liancheng

Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/13173#issuecomment-220231882
  
Doesn't seem to be a valid MiMA check failure. Actually the tool crashed.



[GitHub] spark pull request: [SPARK-14346][SQL] Lists unsupported Hive feat...

2016-05-18 Thread liancheng

Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/13173#issuecomment-220231892
  
retest this please



[GitHub] spark pull request: [SPARK-15186][ML][DOCS] Add user guide for gen...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13139#issuecomment-220231390
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58843/
Test PASSed.



[GitHub] spark pull request: [SPARK-15186][ML][DOCS] Add user guide for gen...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13139#issuecomment-220231329
  
**[Test build #58843 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58843/consoleFull)**
 for PR 13139 at commit 
[`e0079d0`](https://github.com/apache/spark/commit/e0079d03f279dc68eb19faed6d5cb6823802051a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-15186][ML][DOCS] Add user guide for gen...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13139#issuecomment-220231389
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-15331] [SQL] Disallow All the Unsupport...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13122#issuecomment-220231132
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58841/
Test PASSed.



[GitHub] spark pull request: [SPARK-15331] [SQL] Disallow All the Unsupport...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13122#issuecomment-220231131
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-15331] [SQL] Disallow All the Unsupport...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13122#issuecomment-220230990
  
**[Test build #58841 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58841/consoleFull)**
 for PR 13122 at commit 
[`84aa14a`](https://github.com/apache/spark/commit/84aa14a5deda14083520e8e23f83cdb7f5bbb2bc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-15322][SQL][FOLLOW-UP] Update deprecate...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13187#issuecomment-220230863
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58840/
Test PASSed.



[GitHub] spark pull request: [SPARK-15322][SQL][FOLLOW-UP] Update deprecate...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13187#issuecomment-220230862
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-15078][SQL] Add all TPCDS 1.4 benchmark...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13188#issuecomment-220230908
  
**[Test build #58844 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58844/consoleFull)**
 for PR 13188 at commit 
[`e584575`](https://github.com/apache/spark/commit/e584575bb786e77b7ea1d6de3f80ec556011d291).



[GitHub] spark pull request: [SPARK-15078][SQL] Add all TPCDS 1.4 benchmark...

2016-05-18 Thread sameeragarwal

GitHub user sameeragarwal opened a pull request:

https://github.com/apache/spark/pull/13188

[SPARK-15078][SQL] Add all TPCDS 1.4 benchmark queries for SparkSQL

## What changes were proposed in this pull request?

Now that SparkSQL supports all TPC-DS queries, this patch adds all 99 
benchmark queries inside SparkSQL.

## How was this patch tested?

Benchmark only

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sameeragarwal/spark tpcds-all

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13188.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13188


commit e584575bb786e77b7ea1d6de3f80ec556011d291
Author: Sameer Agarwal 
Date:   2016-05-03T00:28:12Z

Add all TPCDS queries





[GitHub] spark pull request: [SPARK-15322][SQL][FOLLOW-UP] Update deprecate...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13187#issuecomment-220230733
  
**[Test build #58840 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58840/consoleFull)**
 for PR 13187 at commit 
[`9b07d09`](https://github.com/apache/spark/commit/9b07d09301e9c6695e3586e06852f679594d988d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [WIP][SPARK-15078] Add all TPCDS 1.4 benchmark...

2016-05-18 Thread sameeragarwal

Github user sameeragarwal closed the pull request at:

https://github.com/apache/spark/pull/12854



[GitHub] spark pull request: [SPARK-15331] [SQL] Disallow All the Unsupport...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13122#issuecomment-220230272
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58839/
Test PASSed.



[GitHub] spark pull request: [SPARK-15331] [SQL] Disallow All the Unsupport...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13122#issuecomment-220230271
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-15186][ML][DOCS] Add user guide for gen...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13139#issuecomment-220230302
  
**[Test build #58843 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58843/consoleFull)**
 for PR 13139 at commit 
[`e0079d0`](https://github.com/apache/spark/commit/e0079d03f279dc68eb19faed6d5cb6823802051a).



[GitHub] spark pull request: [SPARK-15331] [SQL] Disallow All the Unsupport...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13122#issuecomment-220230146
  
**[Test build #58839 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58839/consoleFull)**
 for PR 13122 at commit 
[`0702178`](https://github.com/apache/spark/commit/0702178a3c485aa316d5b03b3aefb2ea4a228cc2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-15186][ML][DOCS] Add user guide for gen...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13139#issuecomment-220229725
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-15186][ML][DOCS] Add user guide for gen...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13139#issuecomment-220229727
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58842/
Test PASSed.



[GitHub] spark pull request: [SPARK-15186][ML][DOCS] Add user guide for gen...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13139#issuecomment-220229641
  
**[Test build #58842 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58842/consoleFull)**
 for PR 13139 at commit 
[`ce7c55e`](https://github.com/apache/spark/commit/ce7c55e14a76dc85bca51a2563d770e3eac3a2a2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-18 Thread frreiss

Github user frreiss commented on a diff in the pull request:

https://github.com/apache/spark/pull/13155#discussion_r63823201
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -1648,16 +1648,56 @@ object RewriteCorrelatedScalarSubquery extends Rule[LogicalPlan] {
   }
 
   /**
+   * Statically evaluate an expression containing one or more aggregates on an empty input.
+   */
+  private def evalOnZeroTups(expr : Expression) : Option[Any] = {
+    // AggregateExpressions are Unevaluable, so we need to replace all aggregates
+    // in the expression with the value they would return for zero input tuples.
+    val rewrittenExpr = expr transform {
+      case a @ AggregateExpression(aggFunc, _, _, resultId) =>
+        val resultLit = aggFunc.defaultResult match {
+          case Some(lit) => lit
+          case None => Literal.default(NullType)
+        }
+        Alias(resultLit, "aggVal") (exprId = resultId)
+    }
+    Option(rewrittenExpr.eval())
+  }
+
+  /**
    * Construct a new child plan by left joining the given subqueries to a base plan.
    */
   private def constructLeftJoins(
       child: LogicalPlan,
       subqueries: ArrayBuffer[ScalarSubquery]): LogicalPlan = {
     subqueries.foldLeft(child) {
       case (currentChild, ScalarSubquery(query, conditions, _)) =>
+        val aggOutputExpr = query.asInstanceOf[Aggregate].aggregateExpressions.head
--- End diff --

Sorry, didn't see your reply before I posted mine. I must not have 
refreshed my browser. Thanks for the info on the possible cases. I'm testing 
the updated static evaluation code now. 
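
For readers following the thread, here is a minimal sketch of the idea behind the quoted `evalOnZeroTups` helper: replace each aggregate with the value it would return over zero input rows, then evaluate what is left. The types below (`Expr`, `Lit`, `Agg`, `Add`) are hypothetical stand-ins written for illustration; they are not Catalyst classes, and the real rule works on `Expression` trees.

```scala
// Toy model only: these types are hypothetical stand-ins, not Catalyst classes.
object ZeroTupleEval {
  sealed trait Expr
  // None models SQL NULL.
  final case class Lit(value: Option[Long]) extends Expr
  // defaultResult is what the aggregate returns over zero input rows.
  final case class Agg(name: String, defaultResult: Option[Long]) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr

  // Replace every aggregate with the literal it would produce on empty input.
  def replaceAggs(e: Expr): Expr = e match {
    case Agg(_, default) => Lit(default)
    case Add(l, r)       => Add(replaceAggs(l), replaceAggs(r))
    case lit: Lit        => lit
  }

  // Evaluate an aggregate-free expression; NULL propagates through Add.
  def eval(e: Expr): Option[Long] = e match {
    case Lit(v)       => v
    case Add(l, r)    => for (a <- eval(l); b <- eval(r)) yield a + b
    case Agg(name, _) => throw new IllegalStateException(s"unevaluable aggregate: $name")
  }

  def evalOnZeroTuples(e: Expr): Option[Long] = eval(replaceAggs(e))
}
```

In this toy model, a `count`-like aggregate with default `Some(0L)` makes `count + 1` evaluate to `Some(1L)`, while a `sum`-like aggregate with default `None` makes the whole expression `None`, mirroring how SQL `COUNT` returns 0 and `SUM` returns NULL on empty input.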



[GitHub] spark pull request: [SPARK-15186][ML][DOCS] Add user guide for gen...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13139#issuecomment-220228793
  
**[Test build #58842 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58842/consoleFull)**
 for PR 13139 at commit 
[`ce7c55e`](https://github.com/apache/spark/commit/ce7c55e14a76dc85bca51a2563d770e3eac3a2a2).



[GitHub] spark pull request: [SPARK-15186][ML][DOCS] Add user guide for gen...

2016-05-18 Thread sethah

Github user sethah commented on the pull request:

https://github.com/apache/spark/pull/13139#issuecomment-220228661
  
@yanboliang @MLnick Thanks for the feedback. For now, I've just addressed 
the comment about the optimization section. I'll address the other comments in 
my next commit (very soon!).



[GitHub] spark pull request: [SPARK-15186][ML][DOCS] Add user guide for gen...

2016-05-18 Thread sethah

Github user sethah commented on a diff in the pull request:

https://github.com/apache/spark/pull/13139#discussion_r63823104
  
--- Diff: docs/ml-classification-regression.md ---
@@ -374,6 +374,197 @@ regression model and extracting model summary statistics.
 
 
 
+## Generalized linear regression
+
+When working with data that has a relatively small number of features (< 4096), Spark's GeneralizedLinearRegression interface
+allows for flexible specification of [generalized linear models](https://en.wikipedia.org/wiki/Generalized_linear_model) (GLMs) which can be used for various types of
+prediction problems including linear regression, Poisson regression, logistic regression, and others.
+
+Contrasted with linear regression where the output is assumed to follow a Gaussian
+distribution, GLMs are specifications of linear models where the response variable $Y_i$ may take on _any_
+distribution from the [exponential family of distributions](https://en.wikipedia.org/wiki/Exponential_family).
+
+$$
+Y_i \sim f\left(\cdot|\theta_i, \phi, w_i\right)
+$$
+
+An exponential family distribution is any probability distribution of the form
+
+$$
+f\left(y|\theta, \phi, w\right) = \exp{\left(\frac{y\theta - b(\theta)}{\phi/w} - c(y, \phi)\right)}
+$$
+
+where the parameter of interest $\theta_i$ is related to the expected value of the response variable
+$\mu_i$ by
+
+$$
+\theta_i = h(\mu_i)
+$$
+
+Here, $h(\mu_i)$ is defined by the form of the exponential family distribution used. GLMs also allow specification
+of a link function, which defines the relationship between the expected value of the response variable $\mu_i$
+and the so-called _linear predictor_ $\eta_i$:
+
+$$
+g(\mu_i) = \eta_i = \vec{x_i}^T \cdot \vec{\beta}
+$$
+
+Often, the link function is chosen such that $h(\mu) = g(\mu)$, which yields a simplified relationship
+between the parameter of interest $\theta$ and the linear predictor $\eta$. In this case, the link
+function $g(\mu)$ is said to be the "canonical" link function.
+
+$$
+\theta_i = h(g^{-1}(\eta_i)) = \eta_i
+$$
+
+A GLM finds the regression coefficients $\vec{\beta}$ which maximize the likelihood function.
+
+$$
+\max_{\vec{\beta}} \mathcal{L}(\vec{\theta}|\vec{y},X) =
+\prod_{i=1}^{N} \exp{\left(\frac{y_i\theta_i - b(\theta_i)}{\phi/w_i} - c(y_i, \phi)\right)}
+$$
+
+where the parameter of interest $\theta_i$ is related to the regression coefficients $\vec{\beta}$
+by
+
+$$
+\theta_i = h(g^{-1}(\vec{x_i}^T \cdot \vec{\beta}))
+$$
+
+Spark's generalized linear regression interface also provides summary statistics for diagnosing the
+fit of GLM models, including residuals, p-values, deviances, the Akaike information criterion, and
+others.
+
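+### Available families
+
+<table>
+  <thead>
+    <tr>
+      <th>Family</th>
+      <th>PDF</th>
+      <th>Response Type</th>
+      <th>Supported Links</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>Gaussian</td>
+      <td>$\frac{1}{\sigma \sqrt{2\pi}} \exp \left( -\frac{(x - \mu)^2}{2\sigma^2}\right)$</td>
+      <td>Continuous</td>
+      <td>Identity*, Log, Inverse</td>
+    </tr>
+    <tr>
+      <td>Binomial</td>
+      <td>$\binom{n}{k}p^k (1-p)^{n-k}$</td>
+      <td>Binary</td>
+      <td>Logit*, Probit, CLogLog</td>
+    </tr>
+    <tr>
+      <td>Poisson</td>
+      <td>$\frac{\lambda^k e^{-\lambda}}{k!}$</td>
+      <td>Count</td>
+      <td>Log*, Identity, Sqrt</td>
+    </tr>
+    <tr>
+      <td>Gamma</td>
+      <td>$\frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}$</td>
+      <td>Continuous</td>
+      <td>Inverse*, Identity, Log</td>
+    </tr>
+  </tbody>
+  <tfoot>
+    <tr><td colspan="4">* Canonical Link</td></tr>
+  </tfoot>
+</table>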
+
+
+### Optimization
--- End diff --

So, I went ahead and added some more detail on the optimization routine. I 
made an effort to stress the limitations on numFeatures and to give some 
explanation as to why. Could you take a look at it? I didn't generate the docs 
to make sure it looks alright just yet, but I wanted to get that up so it could 
be reviewed.
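
As a rough illustration of the link-function relationship described in the quoted docs, the sketch below computes the linear predictor $\eta_i = \vec{x_i}^T \cdot \vec{\beta}$ and maps it back to the mean $\mu_i = g^{-1}(\eta_i)$ using the canonical link of each family in the table. It is plain Scala with no Spark dependency; `GlmLinks`, `Link`, and `predictMean` are hypothetical names chosen for this example and are not part of Spark's GeneralizedLinearRegression API.

```scala
// Illustrative only: hypothetical helper names, not Spark's API.
object GlmLinks {
  // A link function g(mu) together with its inverse g^{-1}(eta).
  final case class Link(name: String, link: Double => Double, inverse: Double => Double)

  val identityLink = Link("identity", mu => mu, eta => eta)
  val logitLink = Link("logit", mu => math.log(mu / (1.0 - mu)), eta => 1.0 / (1.0 + math.exp(-eta)))
  val logLink = Link("log", mu => math.log(mu), eta => math.exp(eta))
  val inverseLink = Link("inverse", mu => 1.0 / mu, eta => 1.0 / eta)

  // Canonical links as marked with * in the table of families.
  val canonical: Map[String, Link] = Map(
    "gaussian" -> identityLink,
    "binomial" -> logitLink,
    "poisson"  -> logLink,
    "gamma"    -> inverseLink)

  // eta_i = x_i^T beta
  def linearPredictor(x: Seq[Double], beta: Seq[Double]): Double = {
    require(x.length == beta.length, "feature and coefficient vectors must have the same length")
    x.zip(beta).map { case (xi, bi) => xi * bi }.sum
  }

  // mu_i = g^{-1}(eta_i), using the canonical link of the given family.
  def predictMean(family: String, x: Seq[Double], beta: Seq[Double]): Double =
    canonical(family).inverse(linearPredictor(x, beta))
}
```

With all coefficients at zero the linear predictor is 0, so the Poisson mean is `exp(0) = 1.0` and the binomial mean is `0.5`, which is a quick sanity check on the inverse links.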



[GitHub] spark pull request: [SPARK-15397] [SQL] fix string udf locate as h...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13186#issuecomment-220228251
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-15397] [SQL] fix string udf locate as h...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13186#issuecomment-220228252
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58835/
Test FAILed.



[GitHub] spark pull request: [SPARK-15397] [SQL] fix string udf locate as h...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13186#issuecomment-220228173
  
**[Test build #58835 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58835/consoleFull)**
 for PR 13186 at commit 
[`23b43d4`](https://github.com/apache/spark/commit/23b43d4c837d762461dd56a62b85cb998919e0ef).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-15390] fix broadcast with 100 millions ...

2016-05-18 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/13182#discussion_r63822575
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala ---
@@ -410,9 +410,10 @@ private[execution] final class LongToUnsafeRowMap(val mm: TaskMemoryManager, cap
 
   private def init(): Unit = {
     if (mm != null) {
+      require(capacity < (512 << 20), "Cannot broadcast more than 512 millions rows")
--- End diff --

Looks like it is.
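
For context on the bound in the quoted `require`, the expression `512 << 20` is 512 * 2^20 = 2^29 = 536,870,912, so "512 millions" in the message is an approximation. A small self-contained sketch of the arithmetic follows; `BroadcastCapacity` and `checkCapacity` are hypothetical names used only for this illustration and do not appear in `HashedRelation.scala`.

```scala
// Illustrative only: hypothetical names, not code from HashedRelation.scala.
object BroadcastCapacity {
  // 512 << 20 = 512 * 2^20 = 2^29, which still fits in a signed 32-bit Int.
  val MaxCapacity: Int = 512 << 20

  // Same shape as the check in the diff: fail fast when the capacity is too large.
  def checkCapacity(capacity: Int): Unit =
    require(capacity < MaxCapacity, s"capacity $capacity must be below $MaxCapacity")
}
```

Because the comparison is strict, a capacity of exactly 2^29 is rejected while 2^29 - 1 is accepted.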



[GitHub] spark pull request: [SPARK-15390] fix broadcast with 100 millions ...

2016-05-18 Thread davies

Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/13182#discussion_r63822450
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala ---
@@ -410,9 +410,10 @@ private[execution] final class LongToUnsafeRowMap(val mm: TaskMemoryManager, cap
 
   private def init(): Unit = {
     if (mm != null) {
+      require(capacity < (512 << 20), "Cannot broadcast more than 512 millions rows")
--- End diff --

yes



[GitHub] spark pull request: [SPARK-15390] fix broadcast with 100 millions ...

2016-05-18 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/13182#discussion_r63822349
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala
 ---
@@ -410,9 +410,10 @@ private[execution] final class LongToUnsafeRowMap(val 
mm: TaskMemoryManager, cap
 
   private def init(): Unit = {
 if (mm != null) {
+  require(capacity < (512 << 20), "Cannot broadcast more than 512 
millions rows")
--- End diff --

Is `capacity` the number of rows?
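
For reference, the bound in the `require` call quoted above works out to 512 * 2^20 = 536,870,912, so "512 millions" in the message is an approximation. Below is a minimal standalone sketch of the same check, not the Spark code itself: `CapacityCheckSketch` and `checkCapacity` are hypothetical names, and treating `capacity` as an `Int` row count is an assumption.

```scala
object CapacityCheckSketch {
  // Same bound as in the diff: 512 << 20 = 512 * 2^20 = 536870912 (fits in an Int).
  val MaxCapacity: Int = 512 << 20

  // Hypothetical helper mirroring the `require` call from the diff.
  // `require` throws IllegalArgumentException when the condition is false.
  def checkCapacity(capacity: Int): Unit =
    require(capacity < MaxCapacity, "Cannot broadcast more than 512 millions rows")

  def main(args: Array[String]): Unit = {
    println(MaxCapacity)             // 536870912
    checkCapacity(100 * 1000 * 1000) // passes: 100 million is below the bound
  }
}
```

Calling `checkCapacity(512 << 20)` would fail, because the comparison is strict.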



[GitHub] spark pull request: [SPARK-15381][SQL] physical object operator sh...

2016-05-18 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13167



[GitHub] spark pull request: [SPARK-15381][SQL] physical object operator sh...

2016-05-18 Thread davies

Github user davies commented on the pull request:

https://github.com/apache/spark/pull/13167#issuecomment-220226195
  
Merging this into master and 2.0, thanks!



[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...

2016-05-18 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/10061#discussion_r63822163
  
--- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
@@ -96,6 +100,7 @@ private[spark] object JsonProtocol {
 executorMetricsUpdateToJson(metricsUpdate)
   case blockUpdated: SparkListenerBlockUpdated =>
 throw new MatchError(blockUpdated)  // TODO(ekl) implement this
+  case _ => parse(mapper.writeValueAsString(event))
--- End diff --

> Events are a public API, and they should be carefully crafted, since 
changing them affects user applications (including event logs). If there is 
unnecessary information in the event, then it's a bug in the event definition, 
not here.

Yea, I totally agree. However, my concern is that having this line here will 
make it harder for developers to spot issues during development. Since the 
serialization works automatically, a self-review of what will be serialized and 
which methods will be called during serialization is no longer a mandatory 
step, which makes auditing much harder. Handling every event explicitly is more 
work for the developer, but when we review a pull request we can clearly see 
what will be serialized and how each event is serialized. What do you think?

btw, if I am missing any context, please let me know :)
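
To make the trade-off concrete, here is a minimal sketch of the explicit style argued for above. It is not the `JsonProtocol` code: `DemoEvent`, `DemoJobStart`, `DemoJobEnd` and `ExplicitJsonSketch` are hypothetical names, and plain string building stands in for the real JSON library.

```scala
// Hypothetical event types for illustration; these are not Spark's listener events.
sealed trait DemoEvent
case class DemoJobStart(jobId: Int) extends DemoEvent
case class DemoJobEnd(jobId: Int, succeeded: Boolean) extends DemoEvent

object ExplicitJsonSketch {
  // Every event is handled by name, so the serialized fields are visible in review.
  // Because DemoEvent is sealed, the compiler warns if a new subtype is left out.
  def toJson(event: DemoEvent): String = event match {
    case DemoJobStart(id) =>
      s"""{"Event":"DemoJobStart","Job ID":$id}"""
    case DemoJobEnd(id, ok) =>
      s"""{"Event":"DemoJobEnd","Job ID":$id,"Succeeded":$ok}"""
  }

  def main(args: Array[String]): Unit = {
    println(toJson(DemoJobStart(1)))
    println(toJson(DemoJobEnd(1, succeeded = true)))
  }
}
```

With a catch-all `case _ =>` branch, a newly added event type would compile and serialize without anyone having to look at which fields end up in the output.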



[GitHub] spark pull request: [SPARK-15381][SQL] physical object operator sh...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13167#issuecomment-220225651
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58836/
Test PASSed.



[GitHub] spark pull request: [SPARK-15381][SQL] physical object operator sh...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13167#issuecomment-220225648
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-14939][SQL] Add FoldablePropagation opt...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12719#issuecomment-220225586
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-14939][SQL] Add FoldablePropagation opt...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12719#issuecomment-220225588
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58837/
Test PASSed.



[GitHub] spark pull request: [SPARK-15381][SQL] physical object operator sh...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13167#issuecomment-220225530
  
**[Test build #58836 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58836/consoleFull)**
 for PR 13167 at commit 
[`a97e358`](https://github.com/apache/spark/commit/a97e3586b7b856d5a62981ff459f48da8d1128bb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-14939][SQL] Add FoldablePropagation opt...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12719#issuecomment-220225490
  
**[Test build #58837 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58837/consoleFull)**
 for PR 12719 at commit 
[`0cb1136`](https://github.com/apache/spark/commit/0cb11361ff70d88ae09a4fd31154999fc9c3efae).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-8603][SPARKR] Incorrect file separator ...

2016-05-18 Thread HyukjinKwon

Github user HyukjinKwon commented on the pull request:

https://github.com/apache/spark/pull/13165#issuecomment-220224055
  
@sun-rui @felixcheung Let me first try to build and run all the R tests on 
Windows, and then I will try to correct and add each test one by one. This will 
take a bit of time and I might have to ask a lot of questions, but I will try.



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

2016-05-18 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/13156#discussion_r63820600
  
--- Diff: 
sql/hivecontext-compatibility/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala
 ---
@@ -58,4 +58,16 @@ class HiveContext private[hive](
 sparkSession.sharedState.asInstanceOf[HiveSharedState]
   }
 
+  /**
+   * Invalidate and refresh all the cached the metadata of the given 
table. For performance reasons,
+   * Spark SQL or the external data source library it uses might cache 
certain metadata about a
+   * table, such as the location of blocks. When those change outside of 
Spark SQL, users should
+   * call this function to invalidate the cache.
+   *
+   * @since 1.3.0
+   */
+  def refreshTable(tableName: String): Unit = {
--- End diff --

+1



[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13135#issuecomment-220223044
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58838/
Test PASSed.



[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13135#issuecomment-220223043
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13135#issuecomment-220222980
  
**[Test build #58838 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58838/consoleFull)**
 for PR 13135 at commit 
[`9ec58e6`](https://github.com/apache/spark/commit/9ec58e6368d848b90b94145a1bb1354587898d82).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: Revert "[SPARK-10216][SQL] Avoid creating empt...

2016-05-18 Thread HyukjinKwon

Github user HyukjinKwon commented on the pull request:

https://github.com/apache/spark/pull/13181#issuecomment-220222603
  
Hi @marmbrus , it seems okay!



[GitHub] spark pull request: [SPARK-15322][SQL][FOLLOW-UP] Update deprecate...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13187#issuecomment-220222494
  
**[Test build #58840 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58840/consoleFull)**
 for PR 13187 at commit 
[`9b07d09`](https://github.com/apache/spark/commit/9b07d09301e9c6695e3586e06852f679594d988d).



[GitHub] spark pull request: [SPARK-15331] [SQL] Disallow All the Unsupport...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13122#issuecomment-220222493
  
**[Test build #58841 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58841/consoleFull)**
 for PR 13122 at commit 
[`84aa14a`](https://github.com/apache/spark/commit/84aa14a5deda14083520e8e23f83cdb7f5bbb2bc).



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

2016-05-18 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/13156#discussion_r63820108
  
--- Diff: 
sql/hivecontext-compatibility/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala
 ---
@@ -58,4 +58,16 @@ class HiveContext private[hive](
 sparkSession.sharedState.asInstanceOf[HiveSharedState]
   }
 
+  /**
+   * Invalidate and refresh all the cached the metadata of the given 
table. For performance reasons,
+   * Spark SQL or the external data source library it uses might cache 
certain metadata about a
+   * table, such as the location of blocks. When those change outside of 
Spark SQL, users should
+   * call this function to invalidate the cache.
+   *
+   * @since 1.3.0
+   */
+  def refreshTable(tableName: String): Unit = {
--- End diff --

This class exists for compatibility purposes. Let's leave it as is.



[GitHub] spark pull request: [SPARK-15322][SQL][FOLLOW-UP] Update deprecate...

2016-05-18 Thread HyukjinKwon

GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/13187

[SPARK-15322][SQL][FOLLOW-UP] Update deprecated accumulator usage into 
accumulatorV2

## What changes were proposed in this pull request?

This PR updates another use of the deprecated `accumulableCollection` to 
`listAccumulator`, which the previous PR seems to have missed.

Since `ArrayBuffer[InternalRow]` is `java.util.List[InternalRow]`, it seems 
reasonable to replace the usage.

## How was this patch tested?

Related existing tests `InMemoryColumnarQuerySuite` and `CachedTableSuite`.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-15322

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13187.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13187


commit 9b07d09301e9c6695e3586e06852f679594d988d
Author: hyukjinkwon 
Date:   2016-05-19T03:50:37Z

Use list accumulator





[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13135#issuecomment-220222031
  
**[Test build #58838 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58838/consoleFull)**
 for PR 13135 at commit 
[`9ec58e6`](https://github.com/apache/spark/commit/9ec58e6368d848b90b94145a1bb1354587898d82).



[GitHub] spark pull request: [SPARK-15331] [SQL] Disallow All the Unsupport...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13122#issuecomment-220222027
  
**[Test build #58839 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58839/consoleFull)**
 for PR 13122 at commit 
[`0702178`](https://github.com/apache/spark/commit/0702178a3c485aa316d5b03b3aefb2ea4a228cc2).



[GitHub] spark pull request: [SPARK-15331] [SQL] Disallow All the Unsupport...

2016-05-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/13122#discussion_r63819835
  
--- Diff: 
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
 ---
@@ -234,6 +234,13 @@ class CliSuite extends SparkFunSuite with 
BeforeAndAfterAll with Logging {
 )
   }
 
+  test("unsupported operations") {
--- End diff --

@hvanhovell The latest changes added the test cases for the unsupported 
operations. 



[GitHub] spark pull request: [SPARK-15130][PySpark][ML][DOCS] pyspark expos...

2016-05-18 Thread MLnick

Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/12914#issuecomment-220219840

@jkbradley @yanboliang @holdenk @sethah let's discuss the issue of defaults
in param doc (refer
https://github.com/apache/spark/pull/13148#discussion_r63600571) on this PR
since it is pertinent.

Here, Holden raises 2 issues:
1. The Scaladoc contains default values for many params (sometimes in
shared traits). In addition the Scala `Param` itself has the self-contained
`doc` field (typically not containing defaults, since the built-in doc shows
current and default in `explainParam`).
2. The PyDoc only contains the `Param` `doc` field.

(By the way, (1) implies that in cases where the default param value in the
trait is overridden, the Scaladoc is incorrect, but that is another issue).
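
As a rough illustration of point (1), the sketch below models how a doc string with no default in it can still be shown together with default and current values at explain time. `SimpleParam` and `SimpleParams` are hypothetical, heavily simplified stand-ins, not Spark ML's `Param` and `Params`; the output format is an assumption modelled loosely on `explainParam`.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for Spark ML's Param/Params; not the real classes.
final case class SimpleParam(name: String, doc: String)

class SimpleParams {
  private val defaults = mutable.Map.empty[String, Any]
  private val current = mutable.Map.empty[String, Any]

  def setDefault(p: SimpleParam, value: Any): this.type = { defaults(p.name) = value; this }
  def set(p: SimpleParam, value: Any): this.type = { current(p.name) = value; this }

  // The doc string itself carries no default; default and current are appended here.
  def explainParam(p: SimpleParam): String = {
    val parts = Seq(
      defaults.get(p.name).map(v => s"default: $v"),
      current.get(p.name).map(v => s"current: $v")
    ).flatten
    val suffix = if (parts.isEmpty) "(undefined)" else parts.mkString("(", ", ", ")")
    s"${p.name}: ${p.doc} $suffix"
  }
}

object ExplainParamSketch {
  def main(args: Array[String]): Unit = {
    val maxIter = SimpleParam("maxIter", "max number of iterations (>= 0)")
    val params = new SimpleParams().setDefault(maxIter, 100).set(maxIter, 10)
    println(params.explainParam(maxIter))
  }
}
```

A doc generator that reads only the `doc` field, as the PyDoc does per point (2), never sees the default that `explainParam` appends.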

The result of (2) is that the HTML API doc doesn't look great, e.g.

https://cloud.githubusercontent.com/assets/1036807/15381231/0a937dde-1d7e-11e6-885c-b120679f84ee.png

Also, nowhere in the PyDoc are the defaults listed, while in the Scaladoc
they are.

I agree that it would be nice to have the defaults listed in the PyDoc in
some way.
1. One solution is the original approach here, where defaults are put in
the Param doc in a standard way, but stripped out during `explainParams`. This
works but IMO is more prone to breaking in future if people forget to do things
in exactly the correct format. It also doesn't directly solve the problem of
the API doc looking ugly;
2. Another solution is the current approach here, where the attributes are
turned into properties with a docstring (possibly including the default) - this
does solve the problem of nice display in the API doc. The downside here is the
potentially fairly large change to make everything a property, and the code
duplication introduced (though kept to a minimum) and extra boilerplate when
adding new params that could be more error-prone;
3. A third solution is what I've done
[here](https://github.com/mlnick/spark/tree/sphinx-doc-params) as a PoC, which
basically adds the built-in doc as the instance docstring for each Python
`Param`. Then we override the `AttributeDocumenter` in Sphinx to handle it. The
result displays nicely in the API doc (the same as the property approach, but
no defaults are added). The other thing that changes is the `__init__`
docstring is brought back (for some reason the current docs are not showing
that), which means that the defaults are essentially documented there for each
class. In a way this seems more "Pythonic" to me (i.e. Python users are
accustomed to seeing the default arg values in the constructor doc, e.g.
scikit-learn).
4. Another option is to do nothing (for now at least), except bring back
the `__init__` docstring. This keeps the ugly-looking `Param` doc, but at least
shows the default args for each class, and is the current behavior. We can do
something like (1) or (3) later (but maybe not (2) during Spark 2.x as it may
be too large a change).
5. A final option is to perhaps document defaults elsewhere (such as the
setter for the param which is usually implemented in the class or a model trait
in Scala).

Let's decide on an approach and make it consistent across the board.


[GitHub] spark pull request: [SPARK-15395][Core]Use getHostString to create...

2016-05-18 Thread zzcclp

Github user zzcclp commented on the pull request:

https://github.com/apache/spark/pull/13185#issuecomment-220218127
  
@zsxwing will this PR be merged into branch 1.6?



[GitHub] spark pull request: [SPARK-15395][Core]Use getHostString to create...

2016-05-18 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13185



[GitHub] spark pull request: [SPARK-15395][Core]Use getHostString to create...

2016-05-18 Thread zsxwing

Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/13185#issuecomment-220217816
  
Didn't merge to 1.6 due to the conflicts.



[GitHub] spark pull request: [SPARK-15395][Core]Use getHostString to create...

2016-05-18 Thread zsxwing

Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/13185#issuecomment-220217602
  
Thanks. Merging to master, 2.0 and 1.6.



[GitHub] spark pull request: [SPARK-14939][SQL] Add FoldablePropagation opt...

Github user dongjoon-hyun commented on the pull request:

https://github.com/apache/spark/pull/12719#issuecomment-220217482
  
Oh, amazing. According to the last Jenkins results, the seven test failures 
in `catalyst` are the only failures.
```
[info] *** 7 TESTS FAILED ***
[error] Failed: Total 1656, Failed 7, Errors 0, Passed 1649, Ignored 1
[error] Failed tests:
[error] org.apache.spark.sql.catalyst.expressions.DateExpressionsSuite
[error] org.apache.spark.sql.catalyst.expressions.CastSuite
[error] (catalyst/test:test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 222 s, completed May 18, 2016 8:11:07 PM
```
Anyway, I will handle them in another PR.



[GitHub] spark pull request: [SPARK-14939][SQL] Add FoldablePropagation opt...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12719#issuecomment-220217398
  
**[Test build #58837 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58837/consoleFull)**
 for PR 12719 at commit 
[`0cb1136`](https://github.com/apache/spark/commit/0cb11361ff70d88ae09a4fd31154999fc9c3efae).



[GitHub] spark pull request: [SPARK-15397] [SQL] fix string udf locate as h...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13186#issuecomment-220217381
  
**[Test build #58835 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58835/consoleFull)**
 for PR 13186 at commit 
[`23b43d4`](https://github.com/apache/spark/commit/23b43d4c837d762461dd56a62b85cb998919e0ef).



[GitHub] spark pull request: [SPARK-15381][SQL] physical object operator sh...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13167#issuecomment-220217395
  
**[Test build #58836 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58836/consoleFull)** for PR 13167 at commit [`a97e358`](https://github.com/apache/spark/commit/a97e3586b7b856d5a62981ff459f48da8d1128bb).



[GitHub] spark pull request: [SPARK-15367] [SQL] Add refreshTable back

2016-05-18 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13156#discussion_r63817417
  
--- Diff: sql/hivecontext-compatibility/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
@@ -58,4 +58,16 @@ class HiveContext private[hive](
 sparkSession.sharedState.asInstanceOf[HiveSharedState]
   }
 
+  /**
+   * Invalidate and refresh all the cached metadata of the given table. For performance reasons,
+   * Spark SQL or the external data source library it uses might cache certain metadata about a
+   * table, such as the location of blocks. When those change outside of Spark SQL, users should
+   * call this function to invalidate the cache.
+   *
+   * @since 1.3.0
+   */
+  def refreshTable(tableName: String): Unit = {
--- End diff --

If `invalidateTable` has a different meaning than `refreshTable`, should we 
also add it to `HiveContext`? cc @yhuai 



[GitHub] spark pull request: [SPARK-14939][SQL] Add FoldablePropagation opt...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12719#issuecomment-220217295
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58834/
Test FAILed.



[GitHub] spark pull request: [SPARK-14939][SQL] Add FoldablePropagation opt...

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12719#issuecomment-220217294
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-14939][SQL] Add FoldablePropagation opt...

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12719#issuecomment-220217222
  
**[Test build #58834 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58834/consoleFull)** for PR 12719 at commit [`d8257ee`](https://github.com/apache/spark/commit/d8257eef75433fe25aa4fd9c8c387933f23cfd20).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-14939][SQL] Add FoldablePropagation opt...

Github user dongjoon-hyun commented on the pull request:

https://github.com/apache/spark/pull/12719#issuecomment-220217246
  
I removed the last test commit.



[GitHub] spark pull request: [SPARK-15397] [SQL] fix string udf locate as h...

2016-05-18 Thread adrian-wang

GitHub user adrian-wang opened a pull request:

https://github.com/apache/spark/pull/13186

[SPARK-15397] [SQL] fix string udf locate as hive

## What changes were proposed in this pull request?

In Hive, `locate("aa", "aaa", 0)` yields 0, `locate("aa", "aaa", 1)` yields 1, 
and `locate("aa", "aaa", 2)` yields 2, while in Spark the same calls yield 1, 2, 
and 0 respectively. The discrepancy comes from how the third parameter of the 
`locate` UDF is interpreted. In Hive it is the starting index, counted from 1, 
so passing 0 always returns 0.
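
The two interpretations can be sketched in plain Python. This is only an illustrative model, not the Hive or Spark implementation: the function names `locate_one_based` and `locate_zero_based_offset` are hypothetical, and the second function is an assumed model chosen because it reproduces the Spark results quoted above.

```python
def locate_one_based(substr: str, s: str, pos: int = 1) -> int:
    """Model of the Hive-style semantics described above.

    `pos` is a 1-based starting index. The result is the 1-based position
    of the first occurrence of `substr` at or after `pos`, or 0 if there
    is none. A `pos` below 1 always gives 0.
    """
    if pos < 1:
        return 0
    # str.find returns -1 when nothing is found, so adding 1 gives 0.
    return s.find(substr, pos - 1) + 1


def locate_zero_based_offset(substr: str, s: str, pos: int = 0) -> int:
    """Assumed model of the behaviour the PR reports for Spark.

    Here `pos` is treated as a 0-based offset into `s`, which reproduces
    the results 1, 2, 0 quoted in the description.
    """
    return s.find(substr, pos) + 1


if __name__ == "__main__":
    for pos in (0, 1, 2):
        print(pos,
              locate_one_based("aa", "aaa", pos),
              locate_zero_based_offset("aa", "aaa", pos))
```

Running the loop prints one row per starting position, with the Hive-style result first and the offset-style result second, which makes the one-position shift between the two readings easy to see.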


## How was this patch tested?

Tested with the modified `StringExpressionsSuite` and `StringFunctionsSuite`.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/adrian-wang/spark locate

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13186.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13186


commit 23b43d4c837d762461dd56a62b85cb998919e0ef
Author: Daoyuan Wang 
Date:   2016-05-18T11:30:07Z

fix string udf locate as hive





[GitHub] spark pull request: [SPARK-14939][SQL] Add FoldablePropagation opt...