date:20140927

[GitHub] spark pull request: [SPARK-3705][SQL]add case for VoidObjectInspec...

2014-09-27 Thread scwf

Github user scwf commented on the pull request:

https://github.com/apache/spark/pull/2552#issuecomment-57077090
  
I tested the timeout issue. https://github.com/apache/spark/pull/1689 
led to it, but I have not found the root cause yet.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [Spark-3525] Adding gradient boosting

2014-09-27 Thread epahomov

Github user epahomov commented on the pull request:

https://github.com/apache/spark/pull/2394#issuecomment-57077001
  
Sorry for such a messy pull request; I didn't review my student's code closely 
enough. I will try to do better next time. We'll fix everything by the middle of the 
week.



[GitHub] spark pull request: [SPARK-3032][Shuffle] Fix key comparison integ...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2514#issuecomment-57076884
  
[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20928/consoleFull) for PR 2514 at commit [`6f3c302`](https://github.com/apache/spark/commit/6f3c30263560853c4cfb5b65b74bce3e39801e05).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class IndexedRecordToJavaConverter extends Converter[IndexedRecord, JMap[String, Any]]`
   * `class AvroWrapperToJavaConverter extends Converter[Any, Any]`




[GitHub] spark pull request: [SPARK-3032][Shuffle] Fix key comparison integ...

2014-09-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2514#issuecomment-57076888
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20928/



[GitHub] spark pull request: [MLLIB] [spark-2352] Implementation of an 1-hi...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1290#issuecomment-57076438
  
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20930/consoleFull) for PR 1290 at commit [`804c07a`](https://github.com/apache/spark/commit/804c07a3abd6a0e81d0f04b4a08f88df29cad357).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57076212
  
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20929/consoleFull) for PR 2542 at commit [`e9cd8be`](https://github.com/apache/spark/commit/e9cd8be5b69af54c1de3219cc8f2c0ad1718615a).
 * This patch merges cleanly.



[GitHub] spark pull request: SPARK-3711: Optimize where in clause filter qu...

2014-09-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2561#issuecomment-57076163
  
Can one of the admins verify this patch?



[GitHub] spark pull request: SPARK-3711: Optimize where in clause filter qu...

2014-09-27 Thread saucam

GitHub user saucam opened a pull request:

https://github.com/apache/spark/pull/2561

SPARK-3711: Optimize where in clause filter queries

The In case class is replaced by an InSet class when all the filter values are 
literals. InSet uses a hash set instead of a sequence, which gives a significant 
performance improvement: previously the sequence was searched with a worst-case 
linear match (the exists method), because the filter list was assumed to contain 
arbitrary expressions. The improvement should be largest when only a small 
percentage of a large data set matches the filter list.
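
The difference between the two lookups can be sketched roughly as follows. This is a minimal illustration, not the code in the patch: `InSetSketch`, `inSeq` and `inSet` are hypothetical names, and plain Scala collections stand in for Catalyst expressions.

```scala
// Hypothetical, simplified stand-ins; not Catalyst's actual In / InSet classes.
object InSetSketch {
  // Linear scan: each row value is compared against the list entries one by one,
  // so the worst case costs one comparison per list entry.
  def inSeq[T](value: T, list: Seq[T]): Boolean =
    list.exists(_ == value)

  // Hash lookup: possible only when every list entry is a constant known up
  // front, so the set can be built once and reused for every row.
  def inSet[T](value: T, set: Set[T]): Boolean =
    set.contains(value)
}
```

Both functions select the same rows; the hash-based version only changes the cost of each membership check, which matters most when the list is long and most rows do not match.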

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/saucam/spark branch-1.1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2561.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2561


commit bee98aadcea7cb8fa6402d72af45aef2a4de8c19
Author: Yash Datta 
Date:   2014-09-28T05:54:49Z

SPARK-3711: Optimize where in clause filter queries





[GitHub] spark pull request: [SPARK-3032][Shuffle] Fix key comparison integ...

2014-09-27 Thread jerryshao

Github user jerryshao commented on the pull request:

https://github.com/apache/spark/pull/2514#issuecomment-57076000
  
Hi Matei, thanks a lot for your suggestions. I've updated the code to use a 
fixed seed. Would you mind taking a look? Thanks a lot.



[GitHub] spark pull request: [SPARK-3543] remaining cleanup work.

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2560#issuecomment-57075980
  
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20927/consoleFull) for PR 2560 at commit [`9eff95a`](https://github.com/apache/spark/commit/9eff95afe6051b264854b415b5d305dc9e4bf3ef).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-3032][Shuffle] Fix key comparison integ...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2514#issuecomment-57075989
  
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20928/consoleFull) for PR 2514 at commit [`6f3c302`](https://github.com/apache/spark/commit/6f3c30263560853c4cfb5b65b74bce3e39801e05).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-3543] remaining cleanup work.

2014-09-27 Thread rxin

GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/2560

[SPARK-3543] remaining cleanup work.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark TaskContext

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2560.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2560


commit 9eff95afe6051b264854b415b5d305dc9e4bf3ef
Author: Reynold Xin 
Date:   2014-09-28T05:43:57Z

[SPARK-3543] remaining cleanup work.





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2533



[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18127631
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala ---
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import akka.actor.{Address, ActorRef}
+
+/**
+ * Grouping of data that is accessed by a CourseGrainedScheduler. This 
class
--- End diff --

Course -> Coarse. I will fix it during merge.



[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/2533#issuecomment-57075299
  
Thanks. Merging in master.



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57075143
  
**[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/172/consoleFull)** after a configured wait of `120m`.



[GitHub] spark pull request: [SPARK-3389] Add Converter for ease of Parquet...

2014-09-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2256



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57074968
  
**[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20926/consoleFull)** after a configured wait of `120m`.



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57074970
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20926/



[GitHub] spark pull request: [SPARK-3389] Add Converter for ease of Parquet...

2014-09-27 Thread mateiz

Github user mateiz commented on the pull request:

https://github.com/apache/spark/pull/2256#issuecomment-57074954
  
Alright, merged it. Thanks!



[GitHub] spark pull request: [SPARK-3325] Add a parameter to the method pri...

2014-09-27 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/2216#issuecomment-57074623
  
I really wish that we could convert JavaDStreamLike / JavaRDDLike into 
abstract base classes instead of traits, since there's no particular reason why 
they should be implemented as traits (it's an unfortunate carry-over from an 
earlier Java API design prototype that we didn't wind up using and which nobody 
caught and removed before 1.0).



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57074476
  
@tianyi 

CREATE TABLE t1(x INT);
CREATE TABLE t2(a STRUCT, k INT);
SELECT a.x FROM t1 a JOIN t2 b ON a.x = b.k;
But Hive can resolve this, as @liancheng said. What's the magic here in the 
`ON` clause?



[GitHub] spark pull request: [SPARK-732][SPARK-3628][CORE][RESUBMIT] make i...

2014-09-27 Thread mateiz

Github user mateiz commented on the pull request:

https://github.com/apache/spark/pull/2524#issuecomment-57074480
  
It's probably easiest to move the accumulator update to TaskSetManager or 
to the part of DAGScheduler that reports the result to the user. It's right 
below the current update in the code:
```
if (!job.finished(rt.outputId)) {
  job.finished(rt.outputId) = true
  ...
```
That happens only once per task, so it's a good place to do the update for 
ResultTask. For ShuffleMapTask you can do it in the corresponding match 
statement as well.
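
A rough sketch of that "only once per task" guard, outside of Spark: `OncePerTaskSketch`, `Job` and `onTaskSuccess` are hypothetical names, and a plain `Long` total stands in for the accumulator. It only illustrates the idea of tying the update to the first completion of a partition; it is not DAGScheduler code.

```scala
object OncePerTaskSketch {
  // Hypothetical stand-in for a job that tracks which output partitions are done.
  final class Job(numPartitions: Int) {
    val finished: Array[Boolean] = Array.fill(numPartitions)(false)
  }

  // Applies `update` to the running total only the first time a partition
  // reports completion; later (duplicate) completions leave the total unchanged.
  def onTaskSuccess(job: Job, outputId: Int, update: Long, total: Long): Long =
    if (!job.finished(outputId)) {
      job.finished(outputId) = true
      total + update
    } else {
      total
    }
}
```

Because the guard reuses the existing `finished` flags, no extra data structure is needed to remember which updates were already applied.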



[GitHub] spark pull request: [SPARK-732][SPARK-3628][CORE][RESUBMIT] make i...

2014-09-27 Thread CodingCat

Github user CodingCat commented on the pull request:

https://github.com/apache/spark/pull/2524#issuecomment-57074398
  
I can simply monitor the accumulator update in TaskSetManager; I'm just not 
sure whether that fully resolves the problem.



[GitHub] spark pull request: [SPARK-732][SPARK-3628][CORE][RESUBMIT] make i...

2014-09-27 Thread CodingCat

Github user CodingCat commented on the pull request:

https://github.com/apache/spark/pull/2524#issuecomment-57074388
  
The drawback of not de-duplicating in the shuffle stage is that it makes 
accumulator usage very tricky...

It sounds like you are discouraged from using an accumulator in a 
transformation, especially when the stage involved is shared by multiple jobs 
or your cluster is not that stable.

As for the flag, it just gives the user the flexibility to choose whether 
to accept duplicate updates.



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread tianyi

Github user tianyi commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57074276
  
@cloud-fan, I think it is reasonable to return "ambiguous references" in 
the case you mentioned, because we can't tell whether 'a' is a table alias 
or a column name. With my latest commits, Spark will return "ambiguous references" 
for your case.
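
The ambiguity can be sketched with a toy resolver. Everything here is hypothetical (`AmbiguitySketch`, `Column`, `Relation`, `resolve`), and the example schema assumes the struct column `a` has a field named `x`, which the stripped `STRUCT` declaration in the thread no longer shows. It is not how Catalyst's analyzer is implemented.

```scala
object AmbiguitySketch {
  // Hypothetical, simplified schema model: a column is a name plus the names
  // of its struct fields (empty for non-struct columns).
  final case class Column(name: String, structFields: Seq[String] = Nil)
  final case class Relation(alias: String, columns: Seq[Column])

  // Every way the two-part name `first.second` could be read.
  def candidates(first: String, second: String, rels: Seq[Relation]): Seq[String] = {
    // Reading 1: `first` is a table alias and `second` is one of its columns.
    val asTableColumn = for {
      r <- rels if r.alias == first
      c <- r.columns if c.name == second
    } yield s"table ${r.alias}, column ${c.name}"
    // Reading 2: `first` is a struct column and `second` is one of its fields.
    val asStructField = for {
      r <- rels
      c <- r.columns if c.name == first && c.structFields.contains(second)
    } yield s"table ${r.alias}, column ${c.name}.$second"
    asTableColumn ++ asStructField
  }

  // Exactly one reading resolves; zero or several readings are reported as errors.
  def resolve(first: String, second: String, rels: Seq[Relation]): Either[String, String] =
    candidates(first, second, rels) match {
      case Seq(only) => Right(only)
      case Seq()     => Left(s"unresolved reference $first.$second")
      case many      => Left(s"ambiguous references: ${many.mkString("; ")}")
    }
}
```

With `t1` aliased as `a` (column `x`) and `t2` aliased as `b` (struct column `a` with field `x`, plus column `k`), `resolve("a", "x", ...)` finds both readings and returns the ambiguity error, while `resolve("b", "k", ...)` has a single reading.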



[GitHub] spark pull request: [SPARK-732][SPARK-3628][CORE][RESUBMIT] make i...

2014-09-27 Thread mateiz

Github user mateiz commented on the pull request:

https://github.com/apache/spark/pull/2524#issuecomment-57074233
  
Basically it would be great to get a really simple patch that *only* fixes 
SPARK-3628 and adds no new data structures in DAGScheduler.



[GitHub] spark pull request: [SPARK-732][SPARK-3628][CORE][RESUBMIT] make i...

2014-09-27 Thread mateiz

Github user mateiz commented on the pull request:

https://github.com/apache/spark/pull/2524#issuecomment-57074221
  
Let's not de-duplicate in shuffle stages please. That complicates the patch 
a lot and I'm not sure why people would necessarily use it.

Also, why did you add a duplicate flag to Accumulator? IMO we shouldn't 
expose this as an option. Again it adds complexity in what should just be a bug 
fix.



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57073173
  
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/172/consoleFull) for PR 2542 at commit [`a018641`](https://github.com/apache/spark/commit/a018641924fd60ebf54c05990a10001cf9a65a0c).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57073139
  
@liancheng Hmm... I don't have a Hive environment to test with...

CREATE TABLE t1(x INT);
CREATE TABLE t2(a STRUCT, k INT);
SELECT a.x FROM t1 a JOIN t2 b;
Without this PR, Spark SQL will report ambiguousReferences, but how does Hive 
resolve `a.x`? As "table a, column x" or as "table b, column a.x"?
Or do we have to add the `ON` clause?



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57073055
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20925/



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57072994
  
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20926/consoleFull) for PR 2542 at commit [`a018641`](https://github.com/apache/spark/commit/a018641924fd60ebf54c05990a10001cf9a65a0c).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-3698][SQL] Correctly check case sensiti...

2014-09-27 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/2543#discussion_r18127069
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala
 ---
@@ -73,31 +75,35 @@ case class GetItem(child: Expression, ordinal: 
Expression) extends Expression {
 /**
  * Returns the value of fields in the Struct `child`.
  */
-case class GetField(child: Expression, fieldName: String) extends 
UnaryExpression {
+case class GetField(child: Expression, field: StructField, ordinal: Int) 
extends UnaryExpression {
   type EvaluatedType = Any
 
   def dataType = field.dataType
   override def nullable = child.nullable || field.nullable
   override def foldable = child.foldable
 
-  protected def structType = child.dataType match {
-case s: StructType => s
-case otherType => sys.error(s"GetField is not valid on fields of type 
$otherType")
-  }
-
-  lazy val field =
-structType.fields
-.find(_.name == fieldName)
-.getOrElse(sys.error(s"No such field $fieldName in 
${child.dataType}"))
-
-  lazy val ordinal = structType.fields.indexOf(field)
-
-  override lazy val resolved = childrenResolved && 
child.dataType.isInstanceOf[StructType]
--- End diff --

Currently all `GetField`s are resolved, because I resolve them in the 
`ResolveGetField` rule. If one can't be resolved (the field does not exist in the 
struct, etc.), the rule throws an exception. That's why I removed the `resolved` field.
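
As a rough illustration of resolving a field name to a field and ordinal eagerly, and failing when the field is missing: `ResolveGetFieldSketch` and `resolveField` are hypothetical names, and the two case classes are simplified stand-ins for Catalyst's `StructField` and `StructType`, not the real definitions or the actual `ResolveGetField` rule.

```scala
object ResolveGetFieldSketch {
  // Hypothetical, simplified stand-ins for Catalyst's types.
  final case class StructField(name: String, dataType: String, nullable: Boolean)
  final case class StructType(fields: Seq[StructField])

  // Looks the field up once and returns it together with its position,
  // throwing a RuntimeException when the struct has no field with that name.
  def resolveField(struct: StructType, fieldName: String): (StructField, Int) = {
    val ordinal = struct.fields.indexWhere(_.name == fieldName)
    if (ordinal < 0) sys.error(s"No such field $fieldName in $struct")
    (struct.fields(ordinal), ordinal)
  }
}
```

Once the lookup happens up front like this, an expression carrying the resulting field and ordinal no longer needs a lazily computed `field`, `ordinal`, or `resolved` member.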



[GitHub] spark pull request: [SPARK-3698][SQL] Correctly check case sensiti...

2014-09-27 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/2543#discussion_r18127040
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -366,7 +366,7 @@ class SqlParser extends StandardTokenParsers with 
PackratParsers {
   case base ~ _ ~ ordinal => GetItem(base, ordinal)
 } |
 (expression <~ ".") ~ ident ^^ {
-  case base ~ fieldName => GetField(base, fieldName)
+  case base ~ fieldName => UnresolvedGetField(base, fieldName)
--- End diff --

Actually, I need this rule to always take action, because I want to support 
other kinds of `GetField`, such as `GetFieldFromStruct`, `GetFieldFromArray`, 
etc. Anyway, I think `UnresolvedGetField` is not necessary for this PR; I will 
use the `resolved` field instead and move this into a related PR.



[GitHub] spark pull request: [SPARK-3698][SQL] Correctly check case sensiti...

2014-09-27 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/2543#discussion_r18126983
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala
 ---
@@ -73,31 +75,35 @@ case class GetItem(child: Expression, ordinal: 
Expression) extends Expression {
 /**
  * Returns the value of fields in the Struct `child`.
  */
-case class GetField(child: Expression, fieldName: String) extends 
UnaryExpression {
+case class GetField(child: Expression, field: StructField, ordinal: Int) 
extends UnaryExpression {
   type EvaluatedType = Any
 
   def dataType = field.dataType
   override def nullable = child.nullable || field.nullable
   override def foldable = child.foldable
 
-  protected def structType = child.dataType match {
-case s: StructType => s
-case otherType => sys.error(s"GetField is not valid on fields of type 
$otherType")
-  }
-
-  lazy val field =
-structType.fields
-.find(_.name == fieldName)
-.getOrElse(sys.error(s"No such field $fieldName in 
${child.dataType}"))
-
-  lazy val ordinal = structType.fields.indexOf(field)
-
-  override lazy val resolved = childrenResolved && 
child.dataType.isInstanceOf[StructType]
-
   override def eval(input: Row): Any = {
 val baseValue = child.eval(input).asInstanceOf[Row]
 if (baseValue == null) null else baseValue(ordinal)
   }
 
-  override def toString = s"$child.$fieldName"
+  override def toString = s"$child.${field.name}"
+}
+
+object GetField {
--- End diff --

I was going to put this logic into an Analyzer rule, but found that some tests 
depend on `GetField(child, fieldName)`, so I had to create this constructor for 
`GetField`. The two are so similar that I combined them. Maybe I 
should fix those tests instead?



[GitHub] spark pull request: SPARK-2761 refactor #maybeSpill into Spillable

2014-09-27 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/2416#issuecomment-57072352
  
BTW the reason abstract classes are favored over traits is that traits with 
default impls (as with many advanced Scala features) complicate a lot of 
things, especially when it comes to Java and binary compatibility.

I guess in this case it might be ok because Spillable is not a public thing.



[GitHub] spark pull request: SPARK-2761 refactor #maybeSpill into Spillable

2014-09-27 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/2416#issuecomment-57072266
  
Are we going to use this in multiple unrelated classes? As far as I can 
tell, this is only used for collections ... 



[GitHub] spark pull request: SPARK-2761 refactor #maybeSpill into Spillable

2014-09-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2416#issuecomment-57072024
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20924/



[GitHub] spark pull request: SPARK-2761 refactor #maybeSpill into Spillable

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2416#issuecomment-57072022
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20924/consoleFull)
 for   PR 2416 at commit 
[`cf8be9a`](https://github.com/apache/spark/commit/cf8be9a59f1dbca3d0dcfbd973c3858b6fa50d50).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-27 Thread adrian-wang

Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18126862
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -220,20 +220,52 @@ trait HiveTypeCoercion {
   case a: BinaryArithmetic if a.right.dataType == StringType =>
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+  // we should cast all timestamp/date/string compare into string 
compare,
+  // even if both sides are of same type, as Hive use xxxwritable to 
compare.
--- End diff --

The native `compareTo` of `java.sql.Date` is inherited from `java.util.Date`, 
which compares the millis since epoch, but `DateWritable` compares days since 
epoch, so that is the gap.
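
A small sketch of that gap. The day count below uses UTC days as an assumption standing in for what `DateWritable` does; `DateCompareSketch` and its methods are hypothetical names, not Hive or Spark APIs:

```scala
import java.sql.Date

object DateCompareSketch {
  private val MillisPerDay = 24L * 60 * 60 * 1000

  // Days since the epoch in UTC (assumption: a stand-in for DateWritable's day count).
  def daysSinceEpoch(d: Date): Long = Math.floorDiv(d.getTime, MillisPerDay)

  // Day-granularity comparison: negative, zero, or positive.
  def compareByDay(a: Date, b: Date): Int =
    java.lang.Long.compare(daysSinceEpoch(a), daysSinceEpoch(b))
}
```

For example, `new Date(0L)` and `new Date(1000L)` fall on the same UTC day, so `compareByDay` treats them as equal, while the inherited `compareTo` orders the first before the second because it compares milliseconds.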



[GitHub] spark pull request: [SPARK-3584] sbin/slaves doesn't work when we ...

2014-09-27 Thread jameszhouyi

Github user jameszhouyi commented on the pull request:

https://github.com/apache/spark/pull/2444#issuecomment-57071944
  
Hi @pwendell ,
After this commit, spark-perf complains 'not found slaves' when I run 
./bin/run... so do we have to create slaves from slaves.template manually?



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread tianyi

Github user tianyi commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57071901
  
Hi, @marmbrus. The current code still has some bugs to fix. I talked to 
@liancheng yesterday and will push an update later.



[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

2014-09-27 Thread ezhulenev

Github user ezhulenev commented on the pull request:

https://github.com/apache/spark/pull/1717#issuecomment-57071591
  
@sjbrunst aargh, TwitterStreamSuite.scala:53 needs the count parameter added



[GitHub] spark pull request: [SPARK-3658][SQL]Take thrift server as a daemo...

2014-09-27 Thread liancheng

Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/2509#discussion_r18126747
  
--- Diff: sbin/spark-daemon.sh ---
@@ -142,8 +142,12 @@ case $startStop in
 
 spark_rotate_log "$log"
 echo starting $command, logging to $log
-cd "$SPARK_PREFIX"
-nohup nice -n $SPARK_NICENESS "$SPARK_PREFIX"/bin/spark-class $command 
"$@" >> "$log" 2>&1 < /dev/null &
+if [ $option == spark-submit ]; then
+  nohup nice -n $SPARK_NICENESS "$SPARK_PREFIX"/bin/spark-submit 
--class $command \
--- End diff --

Hm, that's fair. I'd use `export` in `sbin/start-thriftserver.sh` to fix 
this issue (exported environment variables are accessible in bash subprocesses):

```bash
export SUBMIT_USAGE_FUNCTION=usage
exec "$FWDIR"/sbin/spark-daemon.sh spark-submit $CLASS 1
```



[GitHub] spark pull request: [SPARK-3658][SQL]Take thrift server as a daemo...

2014-09-27 Thread liancheng

Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/2509#discussion_r18126738
  
--- Diff: sbin/spark-daemon.sh ---
@@ -142,8 +142,12 @@ case $startStop in
 
 spark_rotate_log "$log"
 echo starting $command, logging to $log
-cd "$SPARK_PREFIX"
-nohup nice -n $SPARK_NICENESS "$SPARK_PREFIX"/bin/spark-class $command 
"$@" >> "$log" 2>&1 < /dev/null &
+if [ $option == spark-submit ]; then
+  nohup nice -n $SPARK_NICENESS "$SPARK_PREFIX"/bin/spark-submit 
--class $command \
--- End diff --

Hm, that's fair. I'd use `export` in `sbin/start-thriftserver.sh` before 
`exec` to fix this issue (exported environment variables can be accessed by 
bash subprocesses):

```bash
export SUBMIT_USAGE_FUNCTION=usage
exec "$FWDIR"/bin/spark-submit --class $CLASS "${SUBMISSION_OPTS[@]}" 
spark-internal "${APPLICATION_OPTS[@]}"
```



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-27 Thread adrian-wang

Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18126726
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -220,20 +220,52 @@ trait HiveTypeCoercion {
   case a: BinaryArithmetic if a.right.dataType == StringType =>
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+  // we should cast all timestamp/date/string compare into string 
compare,
+  // even if both sides are of same type, as Hive use xxxwritable to 
compare.
--- End diff --

I considered this question again and now think that, when comparing values of 
the same type, it is better not to convert to string but to write `compareTo` 
methods for them, since the native `compareTo` method of `java.sql.Date` does 
not seem consistent with `DateWritable`. I'll do a quick follow-up.



[GitHub] spark pull request: SPARK-2761 refactor #maybeSpill into Spillable

2014-09-27 Thread jimjh

Github user jimjh commented on the pull request:

https://github.com/apache/spark/pull/2416#issuecomment-57070773
  
@rxin Thanks for your feedback. I agree with almost all of your comments 
and made the appropriate changes. However, I don't think it should be an 
abstract class. According to Programming in Scala: _To trait or not to trait_,

> If it might be reused in multiple, unrelated classes, make it a trait. 
Only traits can be mixed into different parts of the class hierarchy.

`Spillable` seems to fit that criterion.
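
To illustrate the quoted guideline, here is a minimal sketch in which one trait is mixed into two classes with unrelated parents. All names (`SpillCheckSketch`, `BaseMap`, `BaseSorter`, and the thresholds) are hypothetical and do not reflect the actual `Spillable` code in this PR:

```scala
// Hypothetical trait with a default implementation built on one abstract member.
trait SpillCheckSketch {
  protected def threshold: Long
  def shouldSpill(currentMemory: Long): Boolean = currentMemory >= threshold
}

// Two unrelated parent classes.
class BaseMap
class BaseSorter

// Both subclasses reuse the same trait; an abstract class could serve only one parent slot.
class SpillingMap extends BaseMap with SpillCheckSketch {
  protected val threshold = 100L
}
class SpillingSorter extends BaseSorter with SpillCheckSketch {
  protected val threshold = 200L
}
```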



[GitHub] spark pull request: SPARK-2761 refactor #maybeSpill into Spillable

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2416#issuecomment-57070772
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20924/consoleFull)
 for   PR 2416 at commit 
[`cf8be9a`](https://github.com/apache/spark/commit/cf8be9a59f1dbca3d0dcfbd973c3858b6fa50d50).
 * This patch merges cleanly.



[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll

Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126678
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -149,13 +144,15 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 // Make fake resource offers on all executors
 def makeOffers() {
   launchTasks(scheduler.resourceOffers(
-executorHost.toArray.map {case (id, host) => new WorkerOffer(id, 
host, freeCores(id))}))
+executorDataMap.map{ case(id, executorData) =>
+  new WorkerOffer( id, executorData.executorHost, 
executorData.freeCores)}.toSeq))
--- End diff --

done



[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll

Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126677
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -149,13 +144,15 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 // Make fake resource offers on all executors
 def makeOffers() {
   launchTasks(scheduler.resourceOffers(
-executorHost.toArray.map {case (id, host) => new WorkerOffer(id, 
host, freeCores(id))}))
+executorDataMap.map{ case(id, executorData) =>
--- End diff --

done



[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll

Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126669
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -179,25 +176,22 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   }
 }
 else {
-  freeCores(task.executorId) -= scheduler.CPUS_PER_TASK
-  executorActor(task.executorId) ! LaunchTask(new 
SerializableBuffer(serializedTask))
+  val executorInfo = executorDataMap(task.executorId)
+  executorInfo.freeCores -= scheduler.CPUS_PER_TASK
+  executorInfo.executorActor ! LaunchTask(new 
SerializableBuffer(serializedTask))
 }
   }
 }
 
 // Remove a disconnected slave from the cluster
 def removeExecutor(executorId: String, reason: String) {
-  if (executorActor.contains(executorId)) {
-logInfo("Executor " + executorId + " disconnected, so removing it")
-val numCores = totalCores(executorId)
-executorActor -= executorId
-executorHost -= executorId
-addressToExecutorId -= executorAddress(executorId)
-executorAddress -= executorId
-totalCores -= executorId
-freeCores -= executorId
-totalCoreCount.addAndGet(-numCores)
-scheduler.executorLost(executorId, SlaveLost(reason))
+  executorDataMap.get(executorId) match {
+case Some(executorInfo) =>
+  val numCores = executorInfo.totalCores
--- End diff --

expression inlined



[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll

Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126667
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -297,6 +291,7 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   }
 }
 
+
--- End diff --

removed



[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll

Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126665
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -85,16 +79,14 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 def receiveWithLogging = {
   case RegisterExecutor(executorId, hostPort, cores) =>
 Utils.checkHostPort(hostPort, "Host port expected " + hostPort)
-if (executorActor.contains(executorId)) {
+if (executorDataMap.contains(executorId)) {
   sender ! RegisterExecutorFailed("Duplicate executor ID: " + 
executorId)
 } else {
   logInfo("Registered executor: " + sender + " with ID " + 
executorId)
   sender ! RegisteredExecutor
-  executorActor(executorId) = sender
-  executorHost(executorId) = Utils.parseHostPort(hostPort)._1
-  totalCores(executorId) = cores
-  freeCores(executorId) = cores
-  executorAddress(executorId) = sender.path.address
+  executorDataMap.put(executorId,  new ExecutorData(sender, 
sender.path.address,
--- End diff --

Done



[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll

Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126663
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -126,8 +120,8 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 
   case StopExecutors =>
 logInfo("Asking each executor to shut down")
-for (executor <- executorActor.values) {
-  executor ! StopExecutor
+for ((_,executorData) <- executorDataMap) {
--- End diff --

Done



[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll

Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126658
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import akka.actor.{Address, ActorRef}
+
+private[cluster] class ExecutorData(
+   var executorActor: ActorRef,
--- End diff --

Good point - All but freeCores changed to vals
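
A rough sketch of what that change looks like, using a simplified record with hypothetical names; the actor reference and address fields of the real class are omitted here so the example stays self-contained:

```scala
// Hypothetical, simplified record: only freeCores stays mutable.
class ExecutorDataSketch(
    val executorHost: String,
    var freeCores: Int,
    val totalCores: Int)

object ExecutorDataSketch {
  // Reserve cores for a task by mutating the single var field.
  def launch(data: ExecutorDataSketch, cpusPerTask: Int): Unit =
    data.freeCores -= cpusPerTask
}
```

Keeping everything except `freeCores` immutable means the compiler rejects accidental reassignment of the host or total core count.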



[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll

Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126545
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -104,13 +96,15 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   case StatusUpdate(executorId, taskId, state, data) =>
 scheduler.statusUpdate(taskId, state, data.value)
 if (TaskState.isFinished(state)) {
-  if (executorActor.contains(executorId)) {
-freeCores(executorId) += scheduler.CPUS_PER_TASK
-makeOffers(executorId)
-  } else {
-// Ignoring the update since we don't know about the executor.
-val msg = "Ignored task status update (%d state %s) from 
unknown executor %s with ID %s"
-logWarning(msg.format(taskId, state, sender, executorId))
+  executorDataMap.get(executorId) match {
+case Some(executorInfo) =>
+  executorInfo.freeCores += scheduler.CPUS_PER_TASK
+  makeOffers(executorId)
+case None =>
+  // Ignoring the update since we don't know about the 
executor.
+  val msg = "Ignored task status update (%d state %s) " +
--- End diff --

Done, and replaced `format` with an `s`-interpolated string.
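
For comparison, a small sketch of the two styles producing the same warning text. The object and method names are hypothetical, and the parameters are plain values rather than the scheduler's real types:

```scala
object WarningMessageSketch {
  // Original style: positional placeholders filled in by String.format.
  def viaFormat(taskId: Long, state: String, sender: String, executorId: String): String =
    "Ignored task status update (%d state %s) from unknown executor %s with ID %s"
      .format(taskId, state, sender, executorId)

  // Interpolated style: values are named where they appear, split across two lines.
  def viaInterpolation(taskId: Long, state: String, sender: String, executorId: String): String =
    s"Ignored task status update ($taskId state $state) " +
      s"from unknown executor $sender with ID $executorId"
}
```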


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll

Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126510
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import akka.actor.{Address, ActorRef}
+
+private[cluster] class ExecutorData(
+   var executorActor: ActorRef,
+   var executorAddress: Address,
+   var executorHost: String ,
+   var freeCores: Int,
+   var totalCores: Int
+) {}
--- End diff --

done



[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll

Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126499
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import akka.actor.{Address, ActorRef}
+
+private[cluster] class ExecutorData(
--- End diff --

Done



[GitHub] spark pull request: [SPARK-3705][SQL]add case for VoidObjectInspec...

2014-09-27 Thread scwf

Github user scwf commented on the pull request:

https://github.com/apache/spark/pull/2552#issuecomment-57069344
  
@marmbrus, it seems the tests for all SQL PRs timed out.



[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-27 Thread pwendell

Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/1486#issuecomment-57068065
  
Hm, this exclusion might not work in the case where a class is changed to an 
interface. Maybe also add the specific recommended exclusion here:

```
ProblemFilters.exclude[IncompatibleTemplateDefProblem]("org.apache.spark.scheduler.TaskLocation")
```

Once this passes tests, LGTM.



[GitHub] spark pull request: [SPARK-3543] Clean up Java TaskContext impleme...

2014-09-27 Thread nchammas

Github user nchammas commented on the pull request:

https://github.com/apache/spark/pull/2557#issuecomment-57067407
  
@rxin The [block that sets this 
message](https://github.com/apache/spark/blob/5b922bb458e863f5be0ae68167de882743f70b86/dev/run-tests-jenkins#L89)
 is driven by an environment variable set outside the scope of the Jenkins 
script in this repo, so that must be mis-set somehow.

Perhaps it's related to some work @JoshRosen said he was going to be doing 
with the AMPLab team this weekend to fix the double-posting of test result 
messages to GitHub?



[GitHub] spark pull request: [SPARK-3543] Clean up Java TaskContext impleme...

2014-09-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2557



[GitHub] spark pull request: [SPARK-3543] Clean up Java TaskContext impleme...

2014-09-27 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/2557#issuecomment-57066811
  
Ok merging this now.



[GitHub] spark pull request: [SPARK-3543] Clean up Java TaskContext impleme...

2014-09-27 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/2557#issuecomment-57066786
  
@nchammas any idea why it says the patch does not merge cleanly even though it does?



[GitHub] spark pull request: [SPARK-3704][SQL]ColumnValue types not match i...

2014-09-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2551#issuecomment-57066017
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20923/



[GitHub] spark pull request: [SPARK-3705][SQL]add case for VoidObjectInspec...

2014-09-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2552#issuecomment-57066008
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20922/



[GitHub] spark pull request: [SPARK-3704][SQL]ColumnValue types not match i...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2551#issuecomment-57066016
  
**[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20923/consoleFull)** after a configured wait of `120m`.



[GitHub] spark pull request: [SPARK-3705][SQL]add case for VoidObjectInspec...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2552#issuecomment-57066007
  
**[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20922/consoleFull)** after a configured wait of `120m`.



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57065881
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20921/



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57065879
  
**[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20921/consoleFull)** after a configured wait of `120m`.



[GitHub] spark pull request: [SPARK-3707] [SQL] Fix bug of type coercion in...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2559#issuecomment-57065586
  
**[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/170/consoleFull)** after a configured wait of `120m`.



[GitHub] spark pull request: SPARK-3699: SQL and Hive console tasks now cle...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2547#issuecomment-57064275
  
  [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/171/consoleFull) for PR 2547 at commit [`d5e431f`](https://github.com/apache/spark/commit/d5e431f0a1b9047a5afc27cb371dbfb7014fb6e0).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-3681] [SQL] [PySpark] fix serialization...

2014-09-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2526



[GitHub] spark pull request: [SPARK-3681] [SQL] [PySpark] fix serialization...

2014-09-27 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2526#issuecomment-57062871
  
Thanks! Merged to master.



[GitHub] spark pull request: [SPARK-3698][SQL] Correctly check case sensiti...

2014-09-27 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2543#issuecomment-57062687
  
Thanks for working on this!  A few minor comments.



[GitHub] spark pull request: [SPARK-3698][SQL] Correctly check case sensiti...

2014-09-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2543#discussion_r18125297
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala ---
@@ -73,31 +75,35 @@ case class GetItem(child: Expression, ordinal: Expression) extends Expression {
 /**
  * Returns the value of fields in the Struct `child`.
  */
-case class GetField(child: Expression, fieldName: String) extends UnaryExpression {
+case class GetField(child: Expression, field: StructField, ordinal: Int) extends UnaryExpression {
   type EvaluatedType = Any
 
   def dataType = field.dataType
   override def nullable = child.nullable || field.nullable
   override def foldable = child.foldable
 
-  protected def structType = child.dataType match {
-case s: StructType => s
-case otherType => sys.error(s"GetField is not valid on fields of type $otherType")
-  }
-
-  lazy val field =
-structType.fields
-.find(_.name == fieldName)
-.getOrElse(sys.error(s"No such field $fieldName in ${child.dataType}"))
-
-  lazy val ordinal = structType.fields.indexOf(field)
-
-  override lazy val resolved = childrenResolved && child.dataType.isInstanceOf[StructType]
--- End diff --

Here we can check whether the field actually exists in the struct; if it 
does not, `resolved` should be `false`.
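
To make the suggestion concrete, here is a minimal sketch of a `resolved` check that also verifies the field exists. It uses toy stand-ins rather than the real Catalyst classes: `Attr`, the simplified `DataType` hierarchy, and the `Option`-returning `field` lookup are all assumptions made for illustration, not code from this PR.

```scala
object ResolvedSketch {
  // Toy stand-ins for Catalyst types; these are NOT the real Spark classes.
  sealed trait DataType
  case object IntegerType extends DataType
  case object StringType extends DataType
  case class StructField(name: String, dataType: DataType, nullable: Boolean = true)
  case class StructType(fields: Seq[StructField]) extends DataType

  // Hypothetical child expression: just a name, a type, and a resolved flag.
  case class Attr(name: String, dataType: DataType, resolved: Boolean = true)

  case class GetField(child: Attr, fieldName: String) {
    // Look the field up without throwing, so an unknown field is not an error yet.
    def field: Option[StructField] = child.dataType match {
      case StructType(fields) => fields.find(_.name == fieldName)
      case _                  => None
    }

    // Resolved only if the child is resolved AND the field exists in the struct.
    def resolved: Boolean = child.resolved && field.isDefined
  }

  val person = Attr(
    "person",
    StructType(Seq(StructField("name", StringType), StructField("age", IntegerType))))

  val ok = GetField(person, "age").resolved                          // field exists
  val missing = GetField(person, "height").resolved                  // no such field
  val notStruct = GetField(Attr("id", IntegerType), "age").resolved  // child is not a struct
}
```

With this shape, a missing field leaves the node unresolved instead of raising from a lazy `field` lookup, so the analyzer can report the failure in one place.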



[GitHub] spark pull request: [SPARK-3704][SQL]ColumnValue types not match i...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2551#issuecomment-57062670
  
  [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20923/consoleFull) for PR 2551 at commit [`08bcc59`](https://github.com/apache/spark/commit/08bcc5965fc17ac0e797fb501e815b71e5b2b64e).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-3698][SQL] Correctly check case sensiti...

2014-09-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2543#discussion_r18125293
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -118,6 +119,19 @@ class Analyzer(catalog: Catalog, registry: FunctionRegistry, caseSensitive: Bool
   }
 
   /**
+   * Replaces [[UnresolvedGetField]]s with concrete [[GetField]]
+   */
+  object ResolveGetField extends Rule[LogicalPlan] {
--- End diff --

Rules aren't that much overhead.  I think this is good :)



[GitHub] spark pull request: [SPARK-3698][SQL] Correctly check case sensiti...

2014-09-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2543#discussion_r18125291
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -366,7 +366,7 @@ class SqlParser extends StandardTokenParsers with PackratParsers {
   case base ~ _ ~ ordinal => GetItem(base, ordinal)
 } |
 (expression <~ ".") ~ ident ^^ {
-  case base ~ fieldName => GetField(base, fieldName)
+  case base ~ fieldName => UnresolvedGetField(base, fieldName)
--- End diff --

Instead of creating a new type of `GetField`, can we just use the 
`resolved` field in the existing one to determine when the rule needs to take 
action?
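
As a rough illustration of that alternative, the sketch below keeps a single `GetField` node and lets a rule act only while the node is unresolved. Everything here is hypothetical: the `Option[Int]` ordinal, the `Resolver` type alias, and `resolveGetField` are simplified stand-ins, not the Catalyst API. The two resolvers mirror the `caseSensitive` flag that the `Analyzer` takes.

```scala
object ResolveRuleSketch {
  case class StructField(name: String)

  // Hypothetical single node type: `ordinal` stays None until the analyzer fills it in.
  case class GetField(
      structFields: Seq[StructField],
      fieldName: String,
      ordinal: Option[Int] = None) {
    def resolved: Boolean = ordinal.isDefined
  }

  // A resolver decides whether two names refer to the same field.
  type Resolver = (String, String) => Boolean
  val caseSensitive: Resolver = (a, b) => a == b
  val caseInsensitive: Resolver = (a, b) => a.equalsIgnoreCase(b)

  // The "rule": do nothing for resolved nodes, otherwise try to find the field.
  def resolveGetField(e: GetField, resolver: Resolver): GetField =
    if (e.resolved) e
    else {
      val idx = e.structFields.indexWhere(f => resolver(f.name, e.fieldName))
      if (idx >= 0) e.copy(ordinal = Some(idx)) else e // stays unresolved if not found
    }

  val fields = Seq(StructField("Name"), StructField("age"))
  val strictMiss = resolveGetField(GetField(fields, "name"), caseSensitive)
  val relaxedHit = resolveGetField(GetField(fields, "name"), caseInsensitive)
  val strictHit = resolveGetField(GetField(fields, "age"), caseSensitive)
}
```

The trade-off is that the node carries optional state instead of the parser emitting a separate unresolved type; which of the two reads better in Catalyst is a judgment call for the reviewers.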



[GitHub] spark pull request: [SPARK-3705][SQL]add case for VoidObjectInspec...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2552#issuecomment-57062663
  
  [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20922/consoleFull) for PR 2552 at commit [`453d892`](https://github.com/apache/spark/commit/453d892242cdacebf383bc3a2d61c351ad0b8c37).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-3698][SQL] Correctly check case sensiti...

2014-09-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2543#discussion_r18125290
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala ---
@@ -73,31 +75,35 @@ case class GetItem(child: Expression, ordinal: Expression) extends Expression {
 /**
  * Returns the value of fields in the Struct `child`.
  */
-case class GetField(child: Expression, fieldName: String) extends UnaryExpression {
+case class GetField(child: Expression, field: StructField, ordinal: Int) extends UnaryExpression {
   type EvaluatedType = Any
 
   def dataType = field.dataType
   override def nullable = child.nullable || field.nullable
   override def foldable = child.foldable
 
-  protected def structType = child.dataType match {
-case s: StructType => s
-case otherType => sys.error(s"GetField is not valid on fields of type $otherType")
-  }
-
-  lazy val field =
-structType.fields
-.find(_.name == fieldName)
-.getOrElse(sys.error(s"No such field $fieldName in ${child.dataType}"))
-
-  lazy val ordinal = structType.fields.indexOf(field)
-
-  override lazy val resolved = childrenResolved && child.dataType.isInstanceOf[StructType]
-
   override def eval(input: Row): Any = {
 val baseValue = child.eval(input).asInstanceOf[Row]
 if (baseValue == null) null else baseValue(ordinal)
   }
 
-  override def toString = s"$child.$fieldName"
+  override def toString = s"$child.${field.name}"
+}
+
+object GetField {
--- End diff --

If possible, I think it might be clearer to keep the resolver logic in the 
Analyzer rule.



[GitHub] spark pull request: [SPARK-3680][SQL] Fix bug caused by eager typi...

2014-09-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2525



[GitHub] spark pull request: [SPARK-3704][SQL]ColumnValue types not match i...

2014-09-27 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2551#issuecomment-57062542
  
ok to test



[GitHub] spark pull request: [SPARK-3676][SQL]spark sql hive test suite fai...

2014-09-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2517



[GitHub] spark pull request: [SPARK-3705][SQL]add case for VoidObjectInspec...

2014-09-27 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2552#issuecomment-57062534
  
ok to test



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57062530
  
  [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20921/consoleFull) for PR 2542 at commit [`252dbbb`](https://github.com/apache/spark/commit/252dbbbdaab653a94aa784873ac362b4422494e1).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-3676][SQL]spark sql hive test suite fai...

2014-09-27 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2517#issuecomment-57062523
  
Thanks! I've merged this to master.



[GitHub] spark pull request: SPARK-3699: SQL and Hive console tasks now cle...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2547#issuecomment-57062442
  
  [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/171/consoleFull) for PR 2547 at commit [`d5e431f`](https://github.com/apache/spark/commit/d5e431f0a1b9047a5afc27cb371dbfb7014fb6e0).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-3688][SQL]LogicalPlan can't resolve col...

2014-09-27 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2542#issuecomment-57062370
  
ok to test



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-27 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-57062345
  
Sorry for the delay, this week has been very busy!  I'd like to merge this 
soon; I have only one small question.



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18125250
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
@@ -220,20 +220,52 @@ trait HiveTypeCoercion {
   case a: BinaryArithmetic if a.right.dataType == StringType =>
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+  // we should cast all timestamp/date/string compare into string compare,
+  // even if both sides are of same type, as Hive use xxxwritable to compare.
--- End diff --

Can you explain this more?  It seems more expensive to convert to strings 
and then compare strings instead of just comparing the underlying types.  What 
do writables have to do with this?
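
For reference, the two strategies being contrasted can be sketched as follows. This is only an illustration using `java.sql.Date`; the helper names are made up, and it does not model how Hive's writables actually compare values. It shows why the string route looks more expensive: each comparison formats both sides before comparing, while the direct route compares the underlying millisecond values.

```scala
object DateCompareSketch {
  import java.sql.Date

  // Compare the underlying values directly (milliseconds since the epoch).
  def lessThanNative(a: Date, b: Date): Boolean = a.getTime < b.getTime

  // Convert both sides to "yyyy-mm-dd" strings first, then compare lexicographically.
  // This allocates and formats two strings on every comparison.
  def lessThanViaString(a: Date, b: Date): Boolean =
    a.toString.compareTo(b.toString) < 0

  val d1 = Date.valueOf("2014-09-26")
  val d2 = Date.valueOf("2014-09-27")

  val nativeResult = lessThanNative(d1, d2)
  val stringResult = lessThanViaString(d1, d2)
}
```

For same-typed operands like these, both routes agree on the ordering; the string cast seems to matter only when the two sides have different types.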



[GitHub] spark pull request: [SPARK-3707] [SQL] Fix bug of type coercion in...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2559#issuecomment-57062261
  
  [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/170/consoleFull) for PR 2559 at commit [`199a85d`](https://github.com/apache/spark/commit/199a85d2e7ef482f3c0ac9cacc4dbeb2a21d5901).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

2014-09-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1717#issuecomment-57060931
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20920/



[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1717#issuecomment-57060929
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20920/consoleFull)
 for   PR 1717 at commit 
[`4b5b09d`](https://github.com/apache/spark/commit/4b5b09d5a70a120ebd8f9f13ea3ba77611d06b10).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`




[GitHub] spark pull request: [SPARK-3325] Add a parameter to the method pri...

2014-09-27 Thread pwendell

Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/2216#issuecomment-57060863
  
Yeah, so I looked into it a bit more, and since `JavaDStream` extends 
`JavaDStreamLike` this will break user code with custom DStreams. The issue is 
that under the hood those user classes have been compiled to implement an 
interface called `JavaDStreamLike`, and older ones won't have the forwarder 
method in the interface.

In this case I think there is a straightforward workaround of just adding 
`print(num)` to the concrete classes `JavaDStream` and `JavaPairDStream`. It 
will duplicate some code, both with `print` and between the two classes, but 
it will work.
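
A minimal sketch of that workaround follows. Every name ending in `Sketch`, and the `elements` member, is a hypothetical stand-in invented for illustration; none of it is Spark's actual code. The idea is that the shared trait keeps its original member set, while the new overload is repeated on each concrete class.

```scala
// Hypothetical stand-in for the shared trait. Its set of methods is left
// exactly as it was, so classes compiled against it earlier still link.
trait JavaDStreamLikeSketch[T] {
  def elements: Seq[T]

  // Existing method, unchanged.
  def print(): Unit = elements.take(10).foreach(println)
}

// Hypothetical stand-in for the first concrete class: the new overload
// is added here rather than on the trait.
class JavaDStreamSketch[T](val elements: Seq[T])
    extends JavaDStreamLikeSketch[T] {
  def print(num: Int): Unit = elements.take(num).foreach(println)
}

// Hypothetical stand-in for the second concrete class: the same body is
// repeated, which is the duplication the comment accepts as a trade-off.
class JavaPairDStreamSketch[K, V](val elements: Seq[(K, V)])
    extends JavaDStreamLikeSketch[(K, V)] {
  def print(num: Int): Unit = elements.take(num).foreach(println)
}
```

With this layout, a user-defined class that extends only the trait never sees `print(num)`, so nothing it was compiled against goes missing.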






[GitHub] spark pull request: [SPARK-3325] Add a parameter to the method pri...

2014-09-27 Thread pwendell

Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/2216#issuecomment-57060715
  
Ah I see - I thought this was an abstract class instead of a trait being 
modified in this patch.

This is not an error with the compatibility checker - it's a legitimate 
break. Because of the way traits work in Scala, you cannot add a new method 
to a trait without breaking binary compatibility, even if the method has a 
default implementation. A trait is more like an interface in that regard. For 
this reason we usually try to avoid traits for public-facing things that could 
be implemented as abstract classes.

However, it will only break if someone outside of Spark has written a class 
that extends this trait directly or indirectly. @JoshRosen, by design, should 
this trait ever be extended outside of Spark?

See slide 10 here for why:
http://www.slideshare.net/mircodotta/managing-binary-compatibility-in-scala
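
The abstract-class alternative mentioned above can be sketched roughly as follows. All names here are hypothetical and unrelated to Spark's classes, and the sketch assumes the trait encoding of the Scala 2.10 era. A binary break only shows up when a subclass is compiled before the parent changes, so a single file can show the shape of the alternative but cannot reproduce the failure itself.

```scala
// Hypothetical public base type written as an abstract class, not a trait.
// A concrete method added to it later is inherited through ordinary JVM
// class inheritance, so subclasses compiled earlier keep working.
abstract class StreamBaseSketch {
  // Imagine this was the only method in the first release.
  def describe(): String = describe(10)

  // Imagine this overload was added in a later release.
  def describe(num: Int): String = s"first $num elements"
}

// A user-defined subclass that would live outside the library. It does
// not mention describe(num) at all, yet it inherits it from the parent.
class UserStreamSketch extends StreamBaseSketch
```

Had `StreamBaseSketch` been a trait under that encoding, an already compiled `UserStreamSketch` would lack the forwarder for the newly added method, which is the break described in the comment.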



[GitHub] spark pull request: [SPARK-3478] [PySpark] Profile the Python task...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2556#issuecomment-57060226
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/169/consoleFull)
 for   PR 2556 at commit 
[`e68df5a`](https://github.com/apache/spark/commit/e68df5a2ada0044f76d748f4e5dd250a1928812b).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

2014-09-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1717#issuecomment-57059419
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20920/consoleFull)
 for   PR 1717 at commit 
[`4b5b09d`](https://github.com/apache/spark/commit/4b5b09d5a70a120ebd8f9f13ea3ba77611d06b10).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

2014-09-27 Thread sjbrunst

Github user sjbrunst commented on the pull request:

https://github.com/apache/spark/pull/1717#issuecomment-57059353
  
@ezhulenev I've rolled back those changes now. Thanks!



[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

2014-09-27 Thread ezhulenev

Github user ezhulenev commented on the pull request:

https://github.com/apache/spark/pull/1717#issuecomment-57059126
  
@sjbrunst you need to roll back your changes in TwitterAlgebirdCMD & 
TwitterAlgebirdHLL (remove Nil for locations); after that the project will 
compile and it should pass all tests. I tried it locally but didn't commit to 
my repo.


