[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18185
  
Merged build finished. Test FAILed.


[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18181
  
Merged build finished. Test FAILed.


[GitHub] spark issue #18184: [MINOR] [SQL] Update the description of spark.sql.files....

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18184
  
Merged build finished. Test FAILed.


[GitHub] spark issue #18184: [MINOR] [SQL] Update the description of spark.sql.files....

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18184
  
Merged build finished. Test FAILed.


[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18185
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77671/
Test FAILed.


[GitHub] spark issue #18184: [MINOR] [SQL] Update the description of spark.sql.files....

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18184
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77670/
Test FAILed.


[GitHub] spark issue #18184: [MINOR] [SQL] Update the description of spark.sql.files....

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18184
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77668/
Test FAILed.


[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18181
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77669/
Test FAILed.


[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...

2017-06-02 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/18185
  
Jenkins, retest this please.


[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18185
  
**[Test build #77672 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77672/testReport)** for PR 18185 at commit [`7d9d4ca`](https://github.com/apache/spark/commit/7d9d4ca38e81d5b0a6cfae8d02d7f440eada4380).


[GitHub] spark issue #18130: [Web UI] Remove no need loop in JobProgressListener

2017-06-02 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/18130
  
There's a [JIRA](https://issues.apache.org/jira/browse/SPARK-20650) planning to remove this `JobProgressListener`, so I'd suggest not changing this deprecated code unnecessarily.


[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119791032
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -1697,40 +1697,56 @@ def test_fillna(self):
         schema = StructType([
             StructField("name", StringType(), True),
             StructField("age", IntegerType(), True),
-            StructField("height", DoubleType(), True)])
+            StructField("height", DoubleType(), True),
+            StructField("spy", BooleanType(), True)])

         # fillna shouldn't change non-null values
-        row = self.spark.createDataFrame([(u'Alice', 10, 80.1)], schema).fillna(50).first()
+        row = self.spark.createDataFrame([(u'Alice', 10, 80.1, True)], schema).fillna(50).first()
         self.assertEqual(row.age, 10)

         # fillna with int
-        row = self.spark.createDataFrame([(u'Alice', None, None)], schema).fillna(50).first()
+        row = self.spark.createDataFrame([(u'Alice', None, None, None)], schema).fillna(50).first()
         self.assertEqual(row.age, 50)
         self.assertEqual(row.height, 50.0)

         # fillna with double
-        row = self.spark.createDataFrame([(u'Alice', None, None)], schema).fillna(50.1).first()
+        row = self.spark.createDataFrame([(u'Alice', None, None, None)], schema).fillna(50.1).first()
         self.assertEqual(row.age, 50)
         self.assertEqual(row.height, 50.1)

+        # fillna with bool
+        row = self.spark.createDataFrame([(u'Alice', None, None, None)], schema).fillna(True).first()
+        self.assertEqual(row.age, None)
+        self.assertEqual(row.spy, True)
+
         # fillna with string
-        row = self.spark.createDataFrame([(None, None, None)], schema).fillna("hello").first()
+        row = self.spark.createDataFrame([(None, None, None, None)], schema).fillna("hello").first()
         self.assertEqual(row.name, u"hello")
         self.assertEqual(row.age, None)

         # fillna with subset specified for numeric cols
         row = self.spark.createDataFrame(
-            [(None, None, None)], schema).fillna(50, subset=['name', 'age']).first()
+            [(None, None, None, None)], schema).fillna(50, subset=['name', 'age']).first()
         self.assertEqual(row.name, None)
         self.assertEqual(row.age, 50)
         self.assertEqual(row.height, None)
+        self.assertEqual(row.spy, None)

-        # fillna with subset specified for numeric cols
+        # fillna with subset specified for string cols
         row = self.spark.createDataFrame(
-            [(None, None, None)], schema).fillna("haha", subset=['name', 'age']).first()
+            [(None, None, None, None)], schema).fillna("haha", subset=['name', 'age']).first()
         self.assertEqual(row.name, "haha")
         self.assertEqual(row.age, None)
         self.assertEqual(row.height, None)
+        self.assertEqual(row.spy, None)
+
+        # fillna with subset specified for bool cols
+        row = self.spark.createDataFrame(
+            [(None, None, None, None)], schema).fillna(True, subset=['name', 'age']).first()
+        self.assertEqual(row.name, None)
+        self.assertEqual(row.age, None)
+        self.assertEqual(row.height, None)
+        self.assertEqual(row.spy, True)
--- End diff --

Should this be `None`, or should the `subset` argument of `fillna()` above be `['name', 'spy']`?
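
For reference, a minimal sketch of the behaviour in question (it assumes a local `SparkSession` named `spark` and the boolean `fillna` support this PR proposes; not code from the PR itself):

```python
from pyspark.sql.types import StructType, StructField, StringType, BooleanType

schema = StructType([
    StructField("name", StringType(), True),
    StructField("spy", BooleanType(), True)])
df = spark.createDataFrame([(None, None)], schema)

# `spy` is not in the subset, so it should be left as None ...
print(df.fillna(True, subset=['name']).first().spy)          # expected: None
# ... whereas including it in the subset should fill it with True.
print(df.fillna(True, subset=['name', 'spy']).first().spy)   # expected: True
```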


[GitHub] spark pull request #18183: [SPARK-20961][SQL] generalize the dictionary in C...

2017-06-02 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18183#discussion_r119796465
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ParquetDictionary.java ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.vectorized;
--- End diff --

Should this be moved to the `org.apache.spark.sql.execution.vectorized.parquet` package?


[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread liancheng
Github user liancheng commented on the issue:

https://github.com/apache/spark/pull/18181
  
Unfortunately, rolling back parquet-mr to 1.8.1 brings back [PARQUET-389][1], which breaks multiple test cases involving schema evolution (adding a new column to a Parquet table and then filtering on that column).

I'm trying to figure out a workaround for this but haven't had any luck yet.

[1]: https://issues.apache.org/jira/browse/PARQUET-389
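
For reference, a rough sketch of the schema-evolution pattern described above (illustrative paths and column names only, assuming a local `SparkSession` named `spark`):

```python
# Write an "old" part of the table without the column, then a "new" part with it.
spark.range(10).selectExpr("id").write.parquet("/tmp/evolve/part=1")
spark.range(10).selectExpr("id", "id * 2 AS extra").write.parquet("/tmp/evolve/part=2")

# Read with schema merging and filter on the newly added column; the filter
# references a column that is missing from the older files, which is the
# situation PARQUET-389 is about.
merged = spark.read.option("mergeSchema", "true").parquet("/tmp/evolve")
merged.filter("extra > 4").show()
```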


[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread liancheng
Github user liancheng commented on the issue:

https://github.com/apache/spark/pull/18181
  
retest this please


[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18181
  
**[Test build #77673 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77673/testReport)** for PR 18181 at commit [`c956201`](https://github.com/apache/spark/commit/c95620133c523f2641fce1718ea25b16b51c196d).


[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread rberenguel
Github user rberenguel commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119798778
  
--- Diff: python/pyspark/sql/tests.py --- (same `test_fillna` hunk as quoted earlier in this thread)
--- End diff --

Hi @ueshin, indeed! Thanks for catching this; I have modified the test. BUT, this test, as it stood when you commented, should have failed, shouldn't it? The subset should not have been applied to `spy` (so `spy` should have been `None` and the assertion should have failed, yet either the test passed or it didn't run), if I understood correctly how fillna's subsetting works. This is weird, since I didn't change any internals of how it works; I just added the methods to enable it.


[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119800139
  
--- Diff: python/pyspark/sql/tests.py --- (same `test_fillna` hunk as quoted earlier in this thread)
--- End diff --

Well, I think this fails:

```
==
ERROR [0.452s]: test_fillna (pyspark.sql.tests.SQLTests)
--
Traceback (most recent call last):
  File ".../spark/python/pyspark/sql/tests.py", line 1749, in test_fillna
self.assertEqual(row.spy, True)
AssertionError: None != True
```


[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119800727
  
--- Diff: python/pyspark/sql/tests.py --- (same `test_fillna` hunk as quoted earlier in this thread)
--- End diff --

@rberenguel I'm sorry, but I didn't understand what you were getting at.
I guess that if the subset is `['name', 'spy']` as you updated it, `row.spy` will become `True`, because `row.spy` is `BooleanType` and the fill value is a boolean.


[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18164
  
@ueshin, do you think it is okay to add this? If so, I'd like to help review here.


[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119804978
  
--- Diff: python/pyspark/sql/tests.py --- (same `test_fillna` hunk as quoted earlier in this thread)
--- End diff --

@HyukjinKwon The test passed in my local environment after I updated to the latest commit.


[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119805293
  
--- Diff: python/pyspark/sql/tests.py --- (same `test_fillna` hunk as quoted earlier in this thread)
--- End diff --

Yea, I meant that your initial comment was right...


[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119805698
  
--- Diff: python/pyspark/sql/tests.py --- (same `test_fillna` hunk as quoted earlier in this thread)
--- End diff --

Ah, I see. Thanks.


[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/18164
  
@HyukjinKwon Yes, I think it's okay to add this.


[GitHub] spark pull request #18183: [SPARK-20961][SQL] generalize the dictionary in C...

2017-06-02 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18183#discussion_r119807101
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/Dictionary.java ---
@@ -0,0 +1,31 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.vectorized;
+
+public interface Dictionary {
--- End diff --

Do we need some JavaDoc since it is public?


[GitHub] spark pull request #18183: [SPARK-20961][SQL] generalize the dictionary in C...

2017-06-02 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18183#discussion_r119807273
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ParquetDictionary.java --- (same hunk as quoted earlier in this thread)
--- End diff --

+1


[GitHub] spark pull request #18183: [SPARK-20961][SQL] generalize the dictionary in C...

2017-06-02 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18183#discussion_r119807397
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ParquetDictionary.java ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.vectorized;
+
+public final class ParquetDictionary implements Dictionary {
+  private org.apache.parquet.column.Dictionary dictionary;
--- End diff --

Would it be better to declare `import org.apache.parquet.column.Dictionary`?


[GitHub] spark pull request #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep U...

2017-06-02 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18014#discussion_r119808220
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java ---
@@ -386,6 +425,35 @@ public void putArray(int rowId, int offset, int length) {
   }

   @Override
+  public void putArray(int rowId, Object src, int srcOffset, int dstOffset, int numElements) {
+    DataType et = type;
+    reserve(dstOffset + numElements);
+    if (et == DataTypes.BooleanType || et == DataTypes.ByteType) {
+      Platform.copyMemory(
+        src, srcOffset, byteData, Platform.BYTE_ARRAY_OFFSET + dstOffset, numElements);
+    } else if (et == DataTypes.BooleanType || et == DataTypes.ByteType) {
+      Platform.copyMemory(
+        src, srcOffset, shortData, Platform.SHORT_ARRAY_OFFSET + dstOffset * 2, numElements * 2);
+    } else if (et == DataTypes.IntegerType || et == DataTypes.DateType ||
+        DecimalType.is32BitDecimalType(type)) {
+      Platform.copyMemory(
+        src, srcOffset, intData, Platform.INT_ARRAY_OFFSET + dstOffset * 4, numElements * 4);
+    } else if (type instanceof LongType || type instanceof TimestampType ||
+        DecimalType.is64BitDecimalType(type)) {
+      Platform.copyMemory(
+        src, srcOffset, longData, Platform.LONG_ARRAY_OFFSET + dstOffset * 8, numElements * 8);
+    } else if (et == DataTypes.FloatType) {
+      Platform.copyMemory(
+        src, srcOffset, floatData, Platform.FLOAT_ARRAY_OFFSET + dstOffset * 4, numElements * 4);
+    } else if (et == DataTypes.DoubleType) {
+      Platform.copyMemory(
+        src, srcOffset, doubleData, Platform.DOUBLE_ARRAY_OFFSET + dstOffset * 8, numElements * 8);
+    } else {
+      throw new RuntimeException("Unhandled " + type);
--- End diff --

Let me think about this over the weekend since I am busy preparing slides for Spark Summit.


[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/18164
  
ok to test


[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18164
  
**[Test build #77674 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77674/testReport)** for PR 18164 at commit [`1b3c712`](https://github.com/apache/spark/commit/1b3c7126f827b23db585bc4e5a4cecef854320c4).


[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18164
  
Merged build finished. Test FAILed.


[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18164
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77674/
Test FAILed.


[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18164
  
**[Test build #77674 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77674/testReport)** for PR 18164 at commit [`1b3c712`](https://github.com/apache/spark/commit/1b3c7126f827b23db585bc4e5a4cecef854320c4).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


[GitHub] spark pull request #18186: [SPARK-20966]Table data is not sorted by startTim...

2017-06-02 Thread guoxiaolongzte
GitHub user guoxiaolongzte opened a pull request:

https://github.com/apache/spark/pull/18186

[SPARK-20966] Table data is not sorted by startTime desc, time is not formatted, and there is redundant code in the JDBC/ODBC Server page.

## What changes were proposed in this pull request?

1. Issue 1: Table data is not sorted by startTime in descending order in the JDBC/ODBC Server page.

Before the fix:

![2](https://cloud.githubusercontent.com/assets/26266482/26718483/bf4a0fa8-47b3-11e7-9a27-dc6a67165b16.png)

After the fix:

![21](https://cloud.githubusercontent.com/assets/26266482/26718544/eb7376c8-47b3-11e7-9117-1bc68dfec92c.png)

2. Issue 2: Time is not formatted in the JDBC/ODBC Server page.

Before the fix:

![1](https://cloud.githubusercontent.com/assets/26266482/26718573/0497d86a-47b4-11e7-945b-582aaa103949.png)

After the fix:

![11](https://cloud.githubusercontent.com/assets/26266482/26718602/21371ad0-47b4-11e7-9587-c5114d10ab2c.png)

3. Issue 3: Redundant code in ThriftServerSessionPage.scala. The function 'generateSessionStatsTable' is never used.

## How was this patch tested?

Manual tests.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guoxiaolongzte/spark SPARK-20966

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18186.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18186


commit d383efba12c66addb17006dea107bb0421d50bc3
Author: 郭小龙 10207633 
Date:   2017-03-31T13:57:09Z

[SPARK-20177]Document about compression way has some little detail changes.

commit 3059013e9d2aec76def14eb314b6761bea0e7ca0
Author: 郭小龙 10207633 
Date:   2017-04-01T01:38:02Z

[SPARK-20177] event log add a space

commit 555cef88fe09134ac98fd0ad056121c7df2539aa
Author: guoxiaolongzte 
Date:   2017-04-02T00:16:08Z

'/applications/[app-id]/jobs' in rest api,status should be 
[running|succeeded|failed|unknown]

commit 46bb1ad3ddd9fb55b5607ac4f20213a90186cfe9
Author: 郭小龙 10207633 
Date:   2017-04-05T03:16:50Z

Merge branch 'master' of https://github.com/apache/spark into SPARK-20177

commit 0efb0dd9e404229cce638fe3fb0c966276784df7
Author: 郭小龙 10207633 
Date:   2017-04-05T03:47:53Z

[SPARK-20218]'/applications/[app-id]/stages' in REST API,add description.

commit 0e37fdeee28e31fc97436dabd001d3c85c5a7794
Author: 郭小龙 10207633 
Date:   2017-04-05T05:22:54Z

[SPARK-20218] '/applications/[app-id]/stages/[stage-id]' in REST API,remove 
redundant description.

commit 52641bb01e55b48bd9e8579fea217439d14c7dc7
Author: 郭小龙 10207633 
Date:   2017-04-07T06:24:58Z

Merge branch 'SPARK-20218'

commit d3977c9cab0722d279e3fae7aacbd4eb944c22f6
Author: 郭小龙 10207633 
Date:   2017-04-08T07:13:02Z

Merge branch 'master' of https://github.com/apache/spark

commit 137b90e5a85cde7e9b904b3e5ea0bb52518c4716
Author: 郭小龙 10207633 
Date:   2017-04-10T05:13:40Z

Merge branch 'master' of https://github.com/apache/spark

commit 0fe5865b8022aeacdb2d194699b990d8467f7a0a
Author: 郭小龙 10207633 
Date:   2017-04-10T10:25:22Z

Merge branch 'SPARK-20190' of https://github.com/guoxiaolongzte/spark

commit cf6f42ac84466960f2232c025b8faeb5d7378fe1
Author: 郭小龙 10207633 
Date:   2017-04-10T10:26:27Z

Merge branch 'master' of https://github.com/apache/spark

commit 685cd6b6e3799c7be65674b2670159ba725f0b8f
Author: 郭小龙 10207633 
Date:   2017-04-14T01:12:41Z

Merge branch 'master' of https://github.com/apache/spark

commit c716a9231e9ab117d2b03ba67a1c8903d8d9da93
Author: guoxiaolong 
Date:   2017-04-17T06:57:21Z

Merge branch 'master' of https://github.com/apache/spark

commit 679cec36a968fbf995b567ca5f6f8cbd8e32673f
Author: guoxiaolong 
Date:   2017-04-19T07:20:08Z

Merge branch 'master' of https://github.com/apache/spark

commit 3c9387af84a8f39cf8c1ce19e15de99dfcaf0ca5
Author: guoxiaolong 
Date:   2017-04-19T08:15:26Z

Merge branch 'master' of https://github.com/apache/spark

commit cb71f4462a0889cbb0843875b1e4cf14bcb0d020
Author: guoxiaolong 
Date:   2017-04-20T05:52:06Z

Merge branch 'master' of https://github.com/apache/spark

commit ce92a7415a2026f5bf909820110a13750a0949e1
Author: guoxiaolong 
Date:   2017-04-21T05:21:48Z

Merge branch 'master' of https://github.com/apache/spark

commit dd64342206041a8c3a282459e5f2b898dc558d89
Author: guoxiaolong 
Date:   2017-04-21T08:44:25Z

Merge branch 'master' of https://github.com/apache/spark

commit bffd2bd00c6b0e20313756e133adca4c97707c67
Author: guoxiaolong 
Date:   2017-04-28T01:36:29Z

Merge branch 'master' of https://github.com/apache/spark

commit 588d42a382345a07153

[GitHub] spark issue #18186: [SPARK-20966]Table data is not sorted by startTime time ...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18186
  
Can one of the admins verify this patch?


[GitHub] spark issue #18186: [SPARK-20966]Table data is not sorted by startTime time ...

2017-06-02 Thread guoxiaolongzte
Github user guoxiaolongzte commented on the issue:

https://github.com/apache/spark/pull/18186
  
@srowen @ajbozarth @jerryshao Help to review the code, thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119810057
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala ---
@@ -196,6 +196,24 @@ final class DataFrameNaFunctions private[sql](df: 
DataFrame) {
   def fill(value: String, cols: Seq[String]): DataFrame = fillValue(value, 
cols)
 
   /**
+   * Returns a new `DataFrame` that replaces null values in boolean 
columns with `value`.
+   */
+  def fill(value: Boolean): DataFrame = fill(value, df.columns)
--- End diff --

Looks we need `@since 2.3.0` for this and the same instances below.
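
For illustration, the requested change would look roughly like this (a sketch only; the method body is the one from the diff above, and `2.3.0` is the version mentioned in this comment):

```
  /**
   * Returns a new `DataFrame` that replaces null values in boolean columns with `value`.
   *
   * @since 2.3.0
   */
  def fill(value: Boolean): DataFrame = fill(value, df.columns)
```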


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119808932
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1303,9 +1312,11 @@ def fillna(self, value, subset=None):
 +---+--+---+
 """
 if not isinstance(value, (float, int, long, basestring, dict)):
-raise ValueError("value should be a float, int, long, string, 
or dict")
+raise ValueError("value should be a float, int, long, string, 
boolean or dict")
 
-if isinstance(value, (int, long)):
+if isinstance(value, bool):
+pass
+elif isinstance(value, (int, long)):
--- End diff --

Could we just make this `not isinstance(value, bool) and isinstance(value, 
(int, long))` (maybe with a small comment)?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119811145
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameNaFunctionsSuite.scala ---
@@ -124,6 +134,13 @@ class DataFrameNaFunctionsSuite extends QueryTest with 
SharedSQLContext {
 Row("Nina") :: Row("Amy") :: Row("unknown") :: Nil)
 assert(input.na.fill("unknown").columns.toSeq === input.columns.toSeq)
 
+// boolean
+checkAnswer(
+  boolInput.na.fill(true).select("spy"),
+  Row(false) :: Row(true) :: Row(true) ::
--- End diff --

I think we could make this inlined.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119808351
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1303,9 +1312,11 @@ def fillna(self, value, subset=None):
 +---+--+---+
 """
 if not isinstance(value, (float, int, long, basestring, dict)):
--- End diff --

I know a bool in Python inherits from int, but wouldn't it be clearer if we 
mentioned it explicitly here? I don't feel strongly about this.

BTW, this rings a bell - some Python APIs take a `bool` in this way and 
work unexpectedly in some cases IIRC ...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119810367
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala ---
@@ -196,6 +196,24 @@ final class DataFrameNaFunctions private[sql](df: 
DataFrame) {
   def fill(value: String, cols: Seq[String]): DataFrame = fillValue(value, 
cols)
 
   /**
+   * Returns a new `DataFrame` that replaces null values in boolean 
columns with `value`.
+   */
+  def fill(value: Boolean): DataFrame = fill(value, df.columns)
+
+  /**
+   * (Scala-specific) Returns a new `DataFrame` that replaces null or NaN 
values in specified
--- End diff --

I think a boolean column could not have "NaN values".
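
For context, a minimal sketch of the difference (hypothetical data, assuming a `SparkSession` named `spark` and the boolean overloads this PR adds):

```
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{BooleanType, DoubleType, StructField, StructType}

val schema = StructType(Seq(
  StructField("flag", BooleanType, nullable = true),
  StructField("score", DoubleType, nullable = true)))
val df = spark.createDataFrame(
  spark.sparkContext.parallelize(Seq(Row(null, Double.NaN), Row(true, 1.0))),
  schema)

// A BooleanType column can hold true, false or null -- there is no NaN for it,
// so the boolean fill only ever replaces nulls:
df.na.fill(false, Seq("flag")).show()

// Floating-point columns are different: fill replaces both null and NaN there.
df.na.fill(0.0, Seq("score")).show()
```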


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119810649
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameNaFunctionsSuite.scala ---
@@ -36,6 +36,15 @@ class DataFrameNaFunctionsSuite extends QueryTest with 
SharedSQLContext {
   ).toDF("name", "age", "height")
   }
 
+  def createBooleanDF(): DataFrame = {
--- End diff --

It looks like this function is only used once. I think we could just move its 
lines into the "fill" test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119808075
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1303,9 +1312,11 @@ def fillna(self, value, subset=None):
 +---+--+---+
 """
 if not isinstance(value, (float, int, long, basestring, dict)):
-raise ValueError("value should be a float, int, long, string, 
or dict")
+raise ValueError("value should be a float, int, long, string, 
boolean or dict")
--- End diff --

I think we should use the same term, `bool` or `boolean`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread rberenguel
Github user rberenguel commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119812409
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1303,9 +1312,11 @@ def fillna(self, value, subset=None):
 +---+--+---+
 """
 if not isinstance(value, (float, int, long, basestring, dict)):
-raise ValueError("value should be a float, int, long, string, 
or dict")
+raise ValueError("value should be a float, int, long, string, 
boolean or dict")
 
-if isinstance(value, (int, long)):
+if isinstance(value, bool):
+pass
+elif isinstance(value, (int, long)):
--- End diff --

Thanks, that indeed makes sense and is a bit nicer than having a pass. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18164
  
**[Test build #77675 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77675/testReport)**
 for PR 18164 at commit 
[`21b4f67`](https://github.com/apache/spark/commit/21b4f677fe3c521005bbc9b95877dc9d093fbe40).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread rberenguel
Github user rberenguel commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119812590
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1303,9 +1312,11 @@ def fillna(self, value, subset=None):
 +---+--+---+
 """
 if not isinstance(value, (float, int, long, basestring, dict)):
--- End diff --

I omitted it just because this `if` wasn't the one failing, but indeed, I lean 
towards putting it in, even if just for completeness. It makes reading the code 
much saner if we have the explicit check for bool.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread rberenguel
Github user rberenguel commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119812631
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1303,9 +1312,11 @@ def fillna(self, value, subset=None):
 +---+--+---+
 """
 if not isinstance(value, (float, int, long, basestring, dict)):
-raise ValueError("value should be a float, int, long, string, 
or dict")
+raise ValueError("value should be a float, int, long, string, 
boolean or dict")
--- End diff --

Thanks, will change


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread rberenguel
Github user rberenguel commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119812666
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala ---
@@ -196,6 +196,24 @@ final class DataFrameNaFunctions private[sql](df: 
DataFrame) {
   def fill(value: String, cols: Seq[String]): DataFrame = fillValue(value, 
cols)
 
   /**
+   * Returns a new `DataFrame` that replaces null values in boolean 
columns with `value`.
+   */
+  def fill(value: Boolean): DataFrame = fill(value, df.columns)
--- End diff --

I wasn't sure about this, wanted to ask actually. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread rberenguel
Github user rberenguel commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119812842
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala ---
@@ -196,6 +196,24 @@ final class DataFrameNaFunctions private[sql](df: 
DataFrame) {
   def fill(value: String, cols: Seq[String]): DataFrame = fillValue(value, 
cols)
 
   /**
+   * Returns a new `DataFrame` that replaces null values in boolean 
columns with `value`.
+   */
+  def fill(value: Boolean): DataFrame = fill(value, df.columns)
+
+  /**
+   * (Scala-specific) Returns a new `DataFrame` that replaces null or NaN 
values in specified
--- End diff --

Oh, right. I copied the defs and docs from the double version, as it shows. Will 
change; NaN booleans would indeed be weird.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread rberenguel
Github user rberenguel commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119813043
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameNaFunctionsSuite.scala ---
@@ -124,6 +134,13 @@ class DataFrameNaFunctionsSuite extends QueryTest with 
SharedSQLContext {
 Row("Nina") :: Row("Amy") :: Row("unknown") :: Nil)
 assert(input.na.fill("unknown").columns.toSeq === input.columns.toSeq)
 
+// boolean
+checkAnswer(
+  boolInput.na.fill(true).select("spy"),
+  Row(false) :: Row(true) :: Row(true) ::
--- End diff --

Sorry, what do you mean by inlined here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread rberenguel
Github user rberenguel commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119812943
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameNaFunctionsSuite.scala ---
@@ -36,6 +36,15 @@ class DataFrameNaFunctionsSuite extends QueryTest with 
SharedSQLContext {
   ).toDF("name", "age", "height")
   }
 
+  def createBooleanDF(): DataFrame = {
--- End diff --

Yup, right. I added it on top to keep both together, but it's only used for 
the boolean tests


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread rberenguel
Github user rberenguel commented on the issue:

https://github.com/apache/spark/pull/18164
  
@ueshin @HyukjinKwon thanks for giving it a very thorough look, and sorry 
for my previous comment, which was terribly unclear. I was confused because the 
AppVeyor tick mark was green for commit 076ebed and I had run the tests 
locally (forgot linting, though), so I was pretty sure the test was right and I 
couldn't see how the wrong subset still had a passing test. 

I probably skipped the wrong step when running the Python tests (I'm still 
figuring out which corners I can cut to avoid a full compile/build cycle of 
the whole project, which takes ages for me), so I didn't see my local test 
failing; the remote one was more puzzling, and I guess AppVeyor had a hiccup 
there. Sorry again for the confused and confusing statements above :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18164#discussion_r119813861
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameNaFunctionsSuite.scala ---
@@ -124,6 +134,13 @@ class DataFrameNaFunctionsSuite extends QueryTest with 
SharedSQLContext {
 Row("Nina") :: Row("Amy") :: Row("unknown") :: Nil)
 assert(input.na.fill("unknown").columns.toSeq === input.columns.toSeq)
 
+// boolean
+checkAnswer(
+  boolInput.na.fill(true).select("spy"),
+  Row(false) :: Row(true) :: Row(true) ::
--- End diff --

Ah, I meant ...

```
Row(false) :: Row(true) :: Row(true) :: Row(true) :: Nil
```

because it does not look like it exceeds the 100-character length limit - 
https://github.com/apache/spark/blob/master/scalastyle-config.xml#L78
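
Spelled out in context, the inlined form would read roughly like this (a sketch; `boolInput` and `checkAnswer` come from the test suite shown in the diff):

```
// boolean
checkAnswer(
  boolInput.na.fill(true).select("spy"),
  Row(false) :: Row(true) :: Row(true) :: Row(true) :: Nil)
```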


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18164
  
Ah... I see. Sorry, I misunderstood. BTW, AppVeyor only runs SparkR tests 
on Windows currently.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18187: [SPARK-20967][SQL] SharedState.externalCatalog is...

2017-06-02 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/18187

[SPARK-20967][SQL] SharedState.externalCatalog is not really lazy

## What changes were proposed in this pull request?

`SharedState.externalCatalog` is marked as a `lazy val`, but in practice it is 
not lazy: we access `externalCatalog` while initializing `SharedState`, which 
defeats the purpose of the `lazy val`. Creating the `ExternalCatalog` tries to 
connect to the metastore and may throw an error, so it makes sense to keep it 
truly lazy in `SharedState` (see the sketch below).
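
A minimal, REPL-style illustration of the Scala semantics being relied on (not Spark code; the names are made up):

```
class EagerByAccident {
  lazy val catalog: String = { println("connecting to metastore..."); "catalog" }
  // Reading the field in the constructor forces it right away, so the
  // laziness (and its ability to defer connection errors) is lost.
  val cached: Int = catalog.length
}

class TrulyLazy {
  lazy val catalog: String = { println("connecting to metastore..."); "catalog" }
  // Nothing touches `catalog` during construction, so any failure is
  // deferred until the first real access.
}

new EagerByAccident()  // prints (and could throw) immediately
new TrulyLazy()        // prints nothing until `catalog` is first used
```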

## How was this patch tested?

existing tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark minor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18187.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18187


commit 7d7331c987d9b15bd2b55efdc76da9d9e6ac078e
Author: Wenchen Fan 
Date:   2017-06-02T09:21:34Z

SharedState.externalCatalog is not really lazy




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18187: [SPARK-20967][SQL] SharedState.externalCatalog is not re...

2017-06-02 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/18187
  
cc @gatorsmile 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18187: [SPARK-20967][SQL] SharedState.externalCatalog is not re...

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18187
  
**[Test build #77676 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77676/testReport)**
 for PR 18187 at commit 
[`7d7331c`](https://github.com/apache/spark/commit/7d7331c987d9b15bd2b55efdc76da9d9e6ac078e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18185
  
**[Test build #77672 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77672/testReport)**
 for PR 18185 at commit 
[`7d9d4ca`](https://github.com/apache/spark/commit/7d9d4ca38e81d5b0a6cfae8d02d7f440eada4380).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class UnresolvedRelation(tableIdentifier: TableIdentifier) 
extends LeafNode `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18185
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77672/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18185
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18181
  
**[Test build #77673 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77673/testReport)**
 for PR 18181 at commit 
[`c956201`](https://github.com/apache/spark/commit/c95620133c523f2641fce1718ea25b16b51c196d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18181
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77673/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18181: [SPARK-20958][SQL] Roll back parquet-mr 1.8.2 to 1.8.1

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18181
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] fillna bools

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18164
  
BTW, mind fixing the title/contents in the PR to be a bit more descriptive, 
for example, saying "null" instead of "NA"? Not a big deal, but non-R folks 
might get confused ... 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] Add fill functions for nulls...

2017-06-02 Thread rberenguel
Github user rberenguel commented on the issue:

https://github.com/apache/spark/pull/18164
  
@HyukjinKwon I changed it, does it look any clearer? I have always thought 
of `na` in terms of Python and not R anyway :) 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] Add fill functions for nulls...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18164
  
Aaa... okay, that's fine with me. NA always reminds me of R first :) 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] Add fill functions for nulls...

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18164
  
**[Test build #77675 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77675/testReport)**
 for PR 18164 at commit 
[`21b4f67`](https://github.com/apache/spark/commit/21b4f677fe3c521005bbc9b95877dc9d093fbe40).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] Add fill functions for nulls...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18164
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18164: [SPARK-19732][SQL][PYSPARK] Add fill functions for nulls...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18164
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77675/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18187: [SPARK-20967][SQL] SharedState.externalCatalog is not re...

2017-06-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18187
  
**[Test build #77676 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77676/testReport)**
 for PR 18187 at commit 
[`7d7331c`](https://github.com/apache/spark/commit/7d7331c987d9b15bd2b55efdc76da9d9e6ac078e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18187: [SPARK-20967][SQL] SharedState.externalCatalog is not re...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18187
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77676/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18187: [SPARK-20967][SQL] SharedState.externalCatalog is not re...

2017-06-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18187
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17985: Add "full_outer" name to join types

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17985
  
Ping @BartekH, how is it going?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17937: Reload credentials file config when app starts with chec...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17937
  
ping @victor-wong, how is it going?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17882: [WIP][SPARK-20079][try 2][yarn] Re registration of AM ha...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17882
  
ping @witgo, how is it going?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17791: [SPARK-20515][SQL] Fix reading of HIVE ORC table with va...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17791
  
ping @umehrot2 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17759: [DOCS] Fix a typo in Encoder.clsTag

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17759
  
ping @mineo 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17721: [SPARK-20013][SQL]merge renameTable to alterTable in Ext...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17721
  
ping @windpiger 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17716: [SPARK-19952][SQL] Remove various analysis exceptions

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17716
  
@hvanhovell, how is it going?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17689: [SPARK-20378][CORE][SQL][SS] StreamSinkProvider should p...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17689
  
ping @ymahajan 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17681: [SPARK-20383][SQL] Supporting Create [temporary] Functio...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17681
  
Hi all, where are we on this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17645: [SPARK-20348] [ML] Support squared hinge loss (L2 loss) ...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17645
  
ping @hhbyyh, where are we on this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17638: [SPARK-20338][CORE]Spaces in spark.eventLog.dir are not ...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17638
  
Hi all, where are we on this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17533: [SPARK-20219] Schedule tasks based on size of input from...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17533
  
Hi @jinxing64, how is it going?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17503: [SPARK-3159][MLlib] Check for reducible DecisionTree

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17503
  
(gentle ping @jkbradley)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18022: [SPARK-20790] [MLlib] Correctly handle negative v...

2017-06-02 Thread davideis
Github user davideis commented on a diff in the pull request:

https://github.com/apache/spark/pull/18022#discussion_r119852009
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala ---
@@ -455,6 +487,22 @@ class ALSSuite
   targetRMSE = 0.3)
   }
 
+  test("implicit feedback regression") {
+val trainingWithNeg = sc.parallelize(Array(Rating(0, 0, 1), Rating(1, 
1, 1), Rating(0, 1, -3)))
+val trainingWithZero = sc.parallelize(Array(Rating(0, 0, 1), Rating(1, 
1, 1), Rating(0, 1, 0)))
+val modelWithNeg =
+  trainALS(trainingWithNeg, rank = 1, maxIter = 5, regParam = 0.01, 
implicitPrefs = true)
+val modelWithZero =
+  trainALS(trainingWithZero, rank = 1, maxIter = 5, regParam = 0.01, 
implicitPrefs = true)
+val userFactorsNeg = modelWithNeg.userFactors
+val itemFactorsNeg = modelWithNeg.itemFactors
+val userFactorsZero = modelWithZero.userFactors
+val itemFactorsZero = modelWithZero.itemFactors
+userFactorsNeg.collect().foreach(arr => logInfo(s"implicit test " + 
arr.mkString(" ")))
--- End diff --

Good point, I meant to remove it. Shall I open another PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18022: [SPARK-20790] [MLlib] Correctly handle negative values f...

2017-06-02 Thread MLnick
Github user MLnick commented on the issue:

https://github.com/apache/spark/pull/18022
  
Yeah you can just open a small follow up PR



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18022: [SPARK-20790] [MLlib] Correctly handle negative values f...

2017-06-02 Thread davideis
Github user davideis commented on the issue:

https://github.com/apache/spark/pull/18022
  
Does it need another JIRA ticket?




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18022: [SPARK-20790] [MLlib] Correctly handle negative values f...

2017-06-02 Thread MLnick
Github user MLnick commented on the issue:

https://github.com/apache/spark/pull/18022
  
You can link the same JIRA since it's a small follow up



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18161: [MINOR][PYTHON] Ignore pep8 on test scripts generated in...

2017-06-02 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/18161
  
Merged to master


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18152: [SPARK-20930][ML] Destroy broadcasted centers aft...

2017-06-02 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/18152#discussion_r119855432
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala ---
@@ -320,6 +320,7 @@ class LocalLDAModel private[spark] (
 
 docBound
   }.sum()
+ElogbetaBc.destroy(blocking = false)
--- End diff --

@zhengruifeng but the broadcast isn't actually used as a broadcast: its 
`.value` is called locally, not from a distributed method.
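
A rough sketch of the distinction (hypothetical values, assuming a `SparkContext` named `sc`): a broadcast only pays off when `.value` is read inside a closure that runs on executors.

```
val centers = Array(1.0, 2.0, 3.0)
val centersBc = sc.broadcast(centers)

// Driver-local access: broadcasting buys nothing here; a plain local
// variable would serve the same purpose.
val localSum = centersBc.value.sum

// Distributed access: each task reads its executor's copy of the broadcast.
val distSum = sc.parallelize(1 to 9).map(i => centersBc.value(i % 3)).sum()

centersBc.destroy()  // safe once no further distributed use is planned
```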


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18161: [MINOR][PYTHON] Ignore pep8 on test scripts gener...

2017-06-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18161


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #10162: [SPARK-11250] [SQL] Generate different alias for columns...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/10162
  
Hi @NarineK, is there any opinion on ^?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18170: [SPARK-20942][WEB-UI]The title style about field is erro...

2017-06-02 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/18170
  
Merged to master/2.2


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18170: [SPARK-20942][WEB-UI]The title style about field ...

2017-06-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18170


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18158: [SPARK-20936][CORE]Lack of an important case abou...

2017-06-02 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/18158#discussion_r119858243
  
--- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
@@ -461,19 +461,17 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties with Logging {
 def assertResolves(before: String, after: String): Unit = {
   // This should test only single paths
   assume(before.split(",").length === 1)
-  // Repeated invocations of resolveURI should yield the same result
   def resolve(uri: String): String = Utils.resolveURI(uri).toString
+  assert(resolve(before) === after)
--- End diff --

I think the change is OK @HyukjinKwon -- we do want to check that `before` 
resolves to `after`, and that case seems to be missing from this fixture. Or 
am I missing something? The rest of the change is just for consistency.

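For readers without the full file, a plausible reading of the fixture after 
this change, inside UtilsSuite where `Utils` and ScalaTest's `===` are in 
scope (the two example expectations at the end are assumptions about 
`Utils.resolveURI`, not copied from the suite):

    def assertResolves(before: String, after: String): Unit = {
      // This should test only single paths
      assume(before.split(",").length === 1)
      def resolve(uri: String): String = Utils.resolveURI(uri).toString
      assert(resolve(before) === after)            // the direct case added here
      assert(resolve(after) === after)             // an already-resolved URI is unchanged
      assert(resolve(resolve(before)) === after)   // repeated resolution stays stable
    }

    // Illustrative expectations (assumed, not copied from the suite):
    //   assertResolves("/tmp/spark.jar", "file:/tmp/spark.jar")
    //   assertResolves("hdfs:/root/spark.jar", "hdfs:/root/spark.jar")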

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #11887: [SPARK-13041][Mesos]add driver sandbox uri to the dispat...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/11887
  
gentle ping


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #10162: [SPARK-11250] [SQL] Generate different alias for columns...

2017-06-02 Thread NarineK
Github user NarineK commented on the issue:

https://github.com/apache/spark/pull/10162
  
@HyukjinKwon do you mean closing or fixing the PR? As I understand from 
@gatorsmile, he wants to close it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18158: [SPARK-20936][CORE]Lack of an important case abou...

2017-06-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18158#discussion_r119859774
  
--- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
@@ -461,19 +461,17 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties with Logging {
 def assertResolves(before: String, after: String): Unit = {
   // This should test only single paths
   assume(before.split(",").length === 1)
-  // Repeated invocations of resolveURI should yield the same result
   def resolve(uri: String): String = Utils.resolveURI(uri).toString
+  assert(resolve(before) === after)
--- End diff --

Thanks for asking me. The change as initially proposed looked to me like it 
added several duplicated test cases and also changed existing tests. I am 
fine with the current status.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17882: [WIP][SPARK-20079][try 2][yarn] Re registration of AM ha...

2017-06-02 Thread witgo
Github user witgo commented on the issue:

https://github.com/apache/spark/pull/17882
  
@jerryshao @vanzin 
Would you take some time to review this PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


