[GitHub] spark pull request: [MLLIB] SPARK-4231, SPARK-3066: Add RankingMet...

2015-04-04 Thread debasish83
Github user debasish83 commented on the pull request:

https://github.com/apache/spark/pull/3098#issuecomment-89729377
  
I meant MAP... what MAP have you seen on the Netflix dataset before, and 
with what lambda? I am running MAP experiments with various factorization 
formulations, including log-likelihood loss with normalization constraints... Also, 
how do you define MAP for implicit feedback (a binary dataset where a click is 1 
and no click is 0)? In the label set every rating is 1.0, so there is no ranking 
defined as such...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6712][YARN] Allow lower the log level i...

2015-04-04 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/5362#issuecomment-89729144
  
Writing to stdout/stderr defeats the point of a logging framework, no? I 
think you could argue that some of these other messages ("setting up", 
"preparing", etc.) aren't vital at this log level and could be turned down.

Otherwise, I'm afraid logging configuration can't be controlled at the level 
of individual statements by end users, and others would say these messages are 
useful enough, and not enough of a problem, to warrant disabling.




[GitHub] spark pull request: [SPARK-2808][Streaming][Kafka] update kafka to...

2015-04-04 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4537#issuecomment-89728917
  
Jenkins, retest this please




[GitHub] spark pull request: [MLLIB] SPARK-4231, SPARK-3066: Add RankingMet...

2015-04-04 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/3098#issuecomment-89728639
  
@debasish83 do you mean RMSE? It is well-defined but not very useful; MAP 
is the useful metric. I think that only a rank-dependent metric makes sense.
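To make the point concrete: MAP stays well-defined even with binary (implicit) labels, because the ranking comes from the model's predicted scores, not from the labels themselves. A minimal sketch (plain Python, not MLlib's `RankingMetrics` implementation; names are illustrative):

```python
# Average precision at k for binary relevance: the candidate list is ranked
# by model score, and the label set only says which items were clicked.
def average_precision(ranked_items, relevant, k=10):
    hits, score = 0, 0.0
    for i, item in enumerate(ranked_items[:k]):
        if item in relevant:
            hits += 1
            score += hits / (i + 1)  # precision at this hit's rank
    return score / min(len(relevant), k) if relevant else 0.0

# 'a' at rank 1 and 'c' at rank 3 are relevant: AP = (1/1 + 2/3) / 2 = 5/6
ap = average_precision(["a", "b", "c", "d"], {"a", "c"})
```

MAP is then just this quantity averaged over users, so the "every label is 1.0" concern dissolves once the ranking is taken from the scores.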




[GitHub] spark pull request: [SPARK-6712][YARN] Allow lower the log level i...

2015-04-04 Thread piaozhexiu
Github user piaozhexiu commented on the pull request:

https://github.com/apache/spark/pull/5362#issuecomment-89728738
  
@srowen I'd like to turn down pretty much every INFO message from the YARN 
client except the AM URL. (See below.) As can be seen, none of these is useful 
to end users except the AM URL. Unfortunately, I can't selectively turn 
down the other messages since they're all in the same package.

How about if I print the tracking URL to stdout and leave the INFO logging as 
is? Then I could turn off INFO in the YARN client.

```
15/04/05 06:36:29 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
15/04/05 06:36:29 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/04/05 06:36:29 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/04/05 06:36:29 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
15/04/05 06:36:29 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
15/04/05 06:36:29 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/04/05 06:36:29 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/04/05 06:36:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/04/05 06:36:29 INFO spark.SparkContext: Running Spark version 1.3.0
15/04/05 06:36:30 INFO spark.SecurityManager: Changing view acls to: cheolsoop
15/04/05 06:36:30 INFO spark.SecurityManager: Changing modify acls to: cheolsoop
15/04/05 06:36:30 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(cheolsoop); users with modify permissions: Set(cheolsoop)
15/04/05 06:36:30 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/04/05 06:36:30 INFO Remoting: Starting remoting
15/04/05 06:36:30 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@ip-10-99-146-254.ec2.internal:64877]
15/04/05 06:36:30 INFO util.Utils: Successfully started service 'sparkDriver' on port 64877.
15/04/05 06:36:30 INFO spark.SparkEnv: Registering MapOutputTracker
15/04/05 06:36:30 INFO spark.SparkEnv: Registering BlockManagerMaster
15/04/05 06:36:30 INFO storage.DiskBlockManager: Created local directory at /mnt/spark_tmp/spark-e95cf4af-ec65-469a-ad1a-827d1149eeab/blockmgr-ec922fd7-58c5-497a-a952-247fcb3ab779
15/04/05 06:36:30 INFO storage.MemoryStore: MemoryStore started with capacity 265.4 MB
15/04/05 06:36:31 INFO spark.HttpFileServer: HTTP File server directory is /mnt/spark_tmp/spark-84f23ed2-ebf3-4022-93e1-fbb31325ab3f/httpd-128e4efa-c666-4311-b1ee-c0868eaca4bc
15/04/05 06:36:31 INFO spark.HttpServer: Starting HTTP Server
15/04/05 06:36:31 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/04/05 06:36:31 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:36543
15/04/05 06:36:31 INFO util.Utils: Successfully started service 'HTTP file server' on port 36543.
15/04/05 06:36:31 INFO spark.SparkEnv: Registering OutputCommitCoordinator
15/04/05 06:36:31 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/04/05 06:36:31 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:47936
15/04/05 06:36:31 INFO util.Utils: Successfully started service 'SparkUI' on port 47936.
15/04/05 06:36:31 INFO ui.SparkUI: Started SparkUI at http://ip-10-99-146-254.ec2.internal:47936
15/04/05 06:36:31 INFO client.RMProxy: Connecting to ResourceManager at /10.171.119.231:9022
15/04/05 06:36:31 INFO yarn.Client: Requesting a new application from cluster with 1300 NodeManagers
15/04/05 06:36:31 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (10240 MB per container)
15/04/05 06:36:31 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/04/05 06:36:31 INFO yarn.Client: Setting up container launch context for our AM
15/04/05 06:36:31 INFO yarn.Client: Preparing resources for our AM container
15/04/05 06:36:32 INFO yarn.Client: Uploading resource file:/mnt/tmp/bdp-clients/cheolsoop/20150405_063623.027801.prodsparkshell13/jars/spark-1.3.0/lib/spark-assembly-1.3.1-SNAPSHOT-hadoop2.4.0.jar -> hdfs://10.171.119.231:9000/user/cheolsoop/.sparkStaging/application_1426271585556_249126/spark-assembly-1.3.1-SNAPSHOT-hadoop2.4.0.jar
15/04/05 06:36:35 INFO y
```

[GitHub] spark pull request: [SPARK-6712][YARN] Allow lower the log level i...

2015-04-04 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/5362#issuecomment-89727695
  
No, println isn't appropriate here; that removes control over the logging 
entirely. Instead, which log messages do you find noisy? Maybe they can be 
turned *down*, since this message is appropriate at the INFO level. Or can you 
not just selectively disable messages from packages in your log4j config? Or is 
the noise from the same package?
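For context, per-package control of the kind suggested above normally lives in log4j.properties; a sketch (the logger names here are illustrative and depend on the actual deployment):

```properties
# Keep the root at INFO, quiet specific noisy loggers, keep yarn.Client visible.
log4j.rootCategory=INFO, console
log4j.logger.org.apache.hadoop.conf.Configuration.deprecation=WARN
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.apache.spark.deploy.yarn.Client=INFO
```

This works per logger name (usually per package or class), which is exactly why messages sharing one package can't be split by configuration alone.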




[GitHub] spark pull request: [SPARK-6712][YARN] Allow lower the log level i...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5362#issuecomment-89724217
  
Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-6712][YARN] Allow lower the log level i...

2015-04-04 Thread piaozhexiu
GitHub user piaozhexiu opened a pull request:

https://github.com/apache/spark/pull/5362

[SPARK-6712][YARN] Allow lowering the log level in YARN client while keeping 
the AM tracking URL printed

In YARN mode, log messages are quite verbose in interactive shells 
(spark-shell, spark-sql, pyspark), and they sometimes mingle with shell 
prompts. It's easy to tone them down via log4j.properties, but the 
problem is that the AM tracking URL is then not printed either.

It would be nice to keep the AM tracking URL while disabling the 
other INFO messages that don't matter to most end users.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/piaozhexiu/spark SPARK-6712

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5362.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5362


commit 9b0a047e1e34e351f22329156efb50f4a452e091
Author: Cheolsoo Park 
Date:   2015-04-05T05:57:05Z

Use println instead of logInfo to print AM tracking url in YARN client






[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-04 Thread scwf
Github user scwf commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-89716003
  
@maropu, yeah, I think it is a common case for YARN mode. We often specify 
more executors than NodeManagers, which means there is more than one executor on 
a single machine.




[GitHub] spark pull request: SPARK-6698: where RandomForest input specifies...

2015-04-04 Thread bien
Github user bien commented on the pull request:

https://github.com/apache/spark/pull/5351#issuecomment-89713636
  
The behavior I was seeing was that RandomTree training tasks were spending 
~90% of their time doing GC, and when I turned on verbose GC I would see that 
most of the time was spent (fruitlessly) on older generation objects.  I 
assumed the baggedInput RDD was the culprit because there were no other RDDs in 
my code (other than the original input), and this patch did help things 
somewhat.  Under these circumstances I don't have a problem spending time 
deserializing objects or creating objects in the younger generation.  

>> An explicit parameter with a reasonable default might be better than 
making users persist RDDs as a way of specifying the parameter

This sounds fine to me but I don't know the Spark codebase well enough to 
contribute this.
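For anyone trying to reproduce the diagnosis above, GC time on executors can be surfaced with the standard JVM flags via Spark's executor options; a sketch of a spark-defaults.conf entry (the flag set is illustrative, Java 7/8 era):

```properties
# Surface GC activity in executor logs to confirm old-generation pressure.
spark.executor.extraJavaOptions  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
```

The Spark UI's per-task "GC Time" column is another way to confirm the ~90% figure without JVM flags.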






[GitHub] spark pull request: [MLLIB] SPARK-4231, SPARK-3066: Add RankingMet...

2015-04-04 Thread debasish83
Github user debasish83 commented on the pull request:

https://github.com/apache/spark/pull/3098#issuecomment-89706247
  
@coderxiang @mengxr If I have a dataset with implicit feedback (click = 1, 
otherwise 0), then MAP is not that well defined, right? Since everything in the 
label set is 1.0, there is no ordering defined... Should we add a 
rank-independent metric for implicit datasets?





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-04 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r27773756
  
--- Diff: python/pyspark/cloudpickle.py ---
@@ -40,164 +40,126 @@
 NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 """
-
+from __future__ import print_function
 
 import operator
 import os
+import io
 import pickle
 import struct
 import sys
 import types
 from functools import partial
 import itertools
-from copy_reg import _extension_registry, _inverted_registry, 
_extension_cache
-import new
 import dis
 import traceback
-import platform
-
-PyImp = platform.python_implementation()
-
 
-import logging
-cloudLog = logging.getLogger("Cloud.Transport")
--- End diff --

I have an open issue to [replace cloudpickle with 
Dill](https://issues.apache.org/jira/browse/SPARK-4898), but I think it's still 
blocked by some [open issues](https://github.com/uqfoundation/dill/issues/50) 
against the Dill project.





[GitHub] spark pull request: [MLLIB] SPARK-4231, SPARK-3066: Add RankingMet...

2015-04-04 Thread debasish83
Github user debasish83 commented on the pull request:

https://github.com/apache/spark/pull/3098#issuecomment-89697236
  
@srowen For the Netflix dataset, what MAP have you seen before? I started 
experiments on the Netflix dataset... lambda is 0.065 for Netflix as well, right? 
For MovieLens, 0.065 works well...





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-04 Thread nchammas
Github user nchammas commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-89697205
  
> TODO: ec2/spark-ec2.py is not fully tested with python3.

I can help with this. Do we want to hold off other spark-ec2 PRs until this 
one goes in? Do we have a rough goal for when we want to merge this in?





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-04 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r27773735
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -116,7 +114,7 @@ def __init__(self, func, returnType):
 
 def _create_judf(self):
 f = self.func  # put it in closure `func`
-func = lambda _, it: imap(lambda x: f(*x), it)
+func = lambda _, it: map(lambda x: f(*x), it)
--- End diff --

A common approach I've seen in projects wanting to support both Python 2 
and 3 is to use the [`six`](https://pythonhosted.org/six/) compatibility 
module, which has [support for renamed 
methods](https://pythonhosted.org/six/#module-six.moves).

```
from six.moves import map
```

We probably don't want to add another external dependency, but just thought 
I'd throw that out there.
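Without taking on the `six` dependency, the same `imap`/`map` rename in the diff above can be bridged with a small version guard; a sketch (name `lazy_map` is illustrative):

```python
import sys

# Python 3's built-in map() is lazy (an iterator), matching Python 2's
# itertools.imap; binding one name up front keeps call sites identical.
if sys.version_info[0] >= 3:
    lazy_map = map
else:
    from itertools import imap as lazy_map  # Python 2 fallback

doubled = list(lazy_map(lambda x: x * 2, [1, 2, 3]))  # → [2, 4, 6]
```

This is essentially what `six.moves` does internally, just inlined.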





[GitHub] spark pull request: [SPARK-2808][Streaming][Kafka] update kafka to...

2015-04-04 Thread zzcclp
Github user zzcclp commented on the pull request:

https://github.com/apache/spark/pull/4537#issuecomment-89694634
  
@koeninger, I can't visit [this 
url](https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28872/); 
it's a 404. ??





[GitHub] spark pull request: [SPARK-6661] Python type errors should print t...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5361#issuecomment-89686461
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29717/
Test PASSed.





[GitHub] spark pull request: [SPARK-6661] Python type errors should print t...

2015-04-04 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5361#issuecomment-89671362
  
Jenkins, test this please.





[GitHub] spark pull request: [SPARK-6661] Python type errors should print t...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5361#issuecomment-89661772
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-6661] Python type errors should print t...

2015-04-04 Thread 31z4
GitHub user 31z4 opened a pull request:

https://github.com/apache/spark/pull/5361

[SPARK-6661] Python type errors should print type, not object



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/31z4/spark spark-6661

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5361.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5361


commit f8a3ef83bdbf4dc0cf93a2002a720a74ab2eb47d
Author: Elisey Zanko 
Date:   2015-04-04T20:39:25Z

[SPARK-6661] Python type errors should print type, not object







[GitHub] spark pull request: [WIP][SQL][SPARK-6632]: Read schema from each ...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5298#issuecomment-89639987
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29716/
Test FAILed.





[GitHub] spark pull request: [SPARK-5990] [MLLIB] Model import/export for I...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5270#issuecomment-89639900
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29715/
Test FAILed.





[GitHub] spark pull request: [SPARK-6602][Core] Replace direct use of Akka ...

2015-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/5268





[GitHub] spark pull request: [SPARK-6602][Core] Replace direct use of Akka ...

2015-04-04 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5268#issuecomment-89639203
  
Merging this in master. Thanks.





[GitHub] spark pull request: [SPARK-6264] [MLLIB] Support FPGrowth algorith...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5213#issuecomment-89633708
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29713/
Test PASSed.





[GitHub] spark pull request: [SPARK-5684][SQL]: Pass in partition name alon...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4469#issuecomment-89633239
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29712/
Test PASSed.





[GitHub] spark pull request: [WIP][SQL][SPARK-6632]: Read schema from each ...

2015-04-04 Thread saucam
Github user saucam commented on the pull request:

https://github.com/apache/spark/pull/5298#issuecomment-89632303
  
Hmm, I see. I will definitely go through these PRs. Anyway, I've fixed the 
whitespace problem here.





[GitHub] spark pull request: [SPARK-975][CORE] Visual debugger of stages an...

2015-04-04 Thread wbraik
Github user wbraik commented on the pull request:

https://github.com/apache/spark/pull/2077#issuecomment-89632207
  
Does anyone have a good example of an application that produces multiple 
(different) jobs that we could use to test this on?





[GitHub] spark pull request: [WIP][SQL][SPARK-6632]: Read schema from each ...

2015-04-04 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/5298#issuecomment-89631527
  
Ah, I'm also considering similar optimizations for Spark 1.4 :)

The tricky part here is that, when scanning the Parquet table, Spark needs 
to call `ParquetInputFormat.getSplits` to compute (Spark) partition 
information. This `getSplits` call can be super expensive as it needs to read 
footers of all Parquet part-files to compute the Parquet splits. And that's why 
`ParquetRelation2` caches those footers at the very beginning and inject them 
into an extended Parquet input format. With all these footers cached, 
`ParquetRelation2.readSchma()` is actually quite lightweight. So the real 
bottleneck is reading all those footers.

Fortunately, Parquet is also trying to avoid reading footers entirely at 
the driver side (see https://github.com/apache/incubator-parquet-mr/pull/91 and 
https://github.com/apache/incubator-parquet-mr/pull/45). After upgrading to 
Parquet 1.6, which is expected to be released next week, we can do this 
properly for better performance.

So ideally, we don't read footers on the driver side, and when we have a 
central authoritative schema at hand, either from the metastore or data source DDL, 
we don't do schema merging on the driver side either. I haven't had time to walk 
through all related Parquet code paths and PRs yet, so the above statements may 
be inaccurate. Please correct me if you find any mistakes.
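The footer-caching idea described above can be sketched, very roughly, as a path-keyed memo table (a hypothetical `FooterCache`, not the actual `ParquetRelation2` logic; `read_footer` stands in for the expensive Parquet footer read):

```python
class FooterCache:
    """Read each part-file's footer at most once and reuse it for both
    split computation and schema merging (hypothetical sketch)."""

    def __init__(self, read_footer):
        self._read = read_footer   # the expensive per-file footer read
        self._cache = {}           # path -> footer
        self.reads = 0             # counts actual footer reads

    def get(self, path):
        # Only hit the (slow) reader on a cache miss.
        if path not in self._cache:
            self.reads += 1
            self._cache[path] = self._read(path)
        return self._cache[path]
```

With the cache in place, repeated lookups for the same part-file (first for splits, then for schema) cost a single footer read.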





[GitHub] spark pull request: [SPARK-6602][Core] Replace direct use of Akka ...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5268#issuecomment-89629603
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29711/
Test PASSed.





[GitHub] spark pull request: [SPARK-2883][SQL] Spark Support for ORCFile fo...

2015-04-04 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/5275#issuecomment-89623421
  
@zhzhan I'm right now designing partitioning support for the data sources 
API, and will hopefully make the design doc next week. Will come back to this 
PR after that. With that part at hand, I believe we can further simplify the 
ORC data source.





[GitHub] spark pull request: [WIP][SQL][SPARK-6632]: Read schema from each ...

2015-04-04 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/5298#issuecomment-89624702
  
ok to test





[GitHub] spark pull request: [WIP][SQL][SPARK-6632]: Read schema from each ...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5298#issuecomment-89624832
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29714/
Test FAILed.





[GitHub] spark pull request: [SPARK-5325] [SQL] Shrink the Hive shim layer

2015-04-04 Thread liancheng
Github user liancheng closed the pull request at:

https://github.com/apache/spark/pull/4107





[GitHub] spark pull request: [SPARK-5325] [SQL] Shrink the Hive shim layer

2015-04-04 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/4107#issuecomment-89621672
  
Yeah, agreed. Closing this. Though the `callWithAlternatives` utility 
function can be quite handy for simple lightweight reflection tricks.





[GitHub] spark pull request: [SPARK-6201] [SQL] promote string and do widen...

2015-04-04 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/4945#issuecomment-89621194
  
The thing that makes me hesitant here is whether we should stick to Hive, 
because Hive's behavior is actually error prone and unintuitive. In Hive, `IN` 
is implemented as a UDF, and function argument type coercion rules apply here.

Take `"1.00" IN (1.0, 2.0)` as an example: `"1.00"`, `1.0`, and `2.0` are 
all arguments of `GenericUDFIn`. When doing type coercion, `1.0` and `2.0` are 
first converted to the strings `"1.0"` and `"2.0"`, and then compared with `"1.00"`, 
thus returning false.

Personally, I think maybe we should just throw an exception if the left side 
of `IN` has a different data type from the right side.
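A hypothetical sketch of the coercion behavior described above (illustrative only; Hive's `GenericUDFIn` is Java and its actual coercion rules are more general):

```python
def hive_style_in(left, right):
    """Mimic Hive's IN when the left side is a string and the list is
    numeric: the numbers are coerced to strings before comparison."""
    return left in [str(x) for x in right]

# "1.00" IN (1.0, 2.0): 1.0 stringifies to "1.0", which != "1.00",
# so the whole expression is False even though the values are equal numerically.
```

This is exactly the error-prone case: a numerically-equal value fails the membership test because of string coercion.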





[GitHub] spark pull request: [SQL] [WIP] Blacklists several Hive 0.13.1 spe...

2015-04-04 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/4851#issuecomment-89615036
  
No. With the metastore adapter layer, we can always keep our tests 
consistent with the most recent Hive version.





[GitHub] spark pull request: [SPARK-5684][SQL]: Pass in partition name alon...

2015-04-04 Thread saucam
Github user saucam commented on the pull request:

https://github.com/apache/spark/pull/4469#issuecomment-89613886
  
Hi @marmbrus, this is a pretty common scenario in production, where the 
data is generated in some directory and partitions are later added to 
tables using alter table  add partition (=value) location 

In the old Parquet path in v1.2.1, this is not possible.
This is doable in the new Parquet path in Spark 1.3, though.





[GitHub] spark pull request: [SPARK-6694][SQL]SparkSQL CLI must be able to ...

2015-04-04 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/5345#issuecomment-89611702
  
@adachij2002 Would you mind adding a test case for this in `CliSuite`? We 
can pass `--database ` via `extraArgs` in `runCliWithin` there.





[GitHub] spark pull request: [Doc] [SQL] Addes Hive metastore Parquet table...

2015-04-04 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/5348#discussion_r27770121
  
--- Diff: docs/sql-programming-guide.md ---
@@ -1034,6 +1034,79 @@ df3.printSchema()
 
 
 
+### Hive metastore Parquet table conversion
+
+When reading from and writing to Hive metastore Parquet tables, Spark SQL 
will try to use its own
+Parquet support instead of Hive SerDe for better performance. This 
behavior is controlled by the
+`spark.sql.hive.convertMetastoreParquet` configuration, and is turned on 
by default.
+
+ Hive/Parquet Schema Reconciliation
+
+There are two key differences between Hive and Parquet from the 
perspective of table schema
+processing.
+
+1. Hive is case insensitive, while Parquet is not
+1. Hive considers all columns nullable, while nullability in Parquet is 
significant
+
+Due to this reason, we must reconcile Hive metastore schema with Parquet 
schema when converting a
+Hive metastore Parquet table to a Spark SQL Parquet table.  The 
reconciliation rules are:
+
+1. Fields that have the same name in both schema must have the same data 
type regardless of
+   nullability.  The reconciled field should have the data type of the 
Parquet side, so that
+   nullability is respected.
+
+1. The reconciled schema contains exactly those fields defined in Hive 
metastore schema.
+
+   - Any fields that only appear in the Parquet schema are dropped in the 
reconciled schema.
+   - Any fields that only appear in the Hive metastore schema are added as 
nullable fields in the
+ reconciled schema.
+
+ Metadata Refreshing
+
+Spark SQL caches Parquet metadata for better performance.  When Hive 
metastore Parquet table
--- End diff --

Agree, missing such a section is part of the reason why I put the metadata 
refreshing section here...
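For reference, the reconciliation rules quoted in the diff above can be sketched as follows (a hypothetical `reconcile` helper, with schemas modeled as `name -> (dtype, nullable)` maps; the real Spark SQL implementation works on `StructType`s):

```python
def reconcile(hive_schema, parquet_schema):
    """Reconcile a Hive metastore schema with a Parquet schema:
    - keep exactly the fields defined in the Hive schema;
    - take the Parquet side's type/nullability when the field exists there
      (matched case-insensitively, since Hive is case insensitive);
    - mark Hive-only fields nullable; drop Parquet-only fields."""
    parquet = {name.lower(): field for name, field in parquet_schema.items()}
    reconciled = {}
    for name, (dtype, nullable) in hive_schema.items():
        if name.lower() in parquet:
            reconciled[name] = parquet[name.lower()]   # Parquet side wins
        else:
            reconciled[name] = (dtype, True)           # Hive-only => nullable
    return reconciled
```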





[GitHub] spark pull request: [SPARK-6696] [SQL] Adds HiveContext.refreshTab...

2015-04-04 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/5349#issuecomment-89610598
  
We need a properly configured Hive environment to run the test. I can add a 
simple `TestHive`-like class to do metastore / warehouse configurations though.





[GitHub] spark pull request: [SPARK-6607][SQL] Check invalid characters for...

2015-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/5263





[GitHub] spark pull request: [SPARK-6607][SQL] Check invalid characters for...

2015-04-04 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/5263#issuecomment-89608688
  
Thanks for working on this! Merging to master.





[GitHub] spark pull request: [SPARK-6006][SQL]: Optimize count distinct for...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4764#issuecomment-89604947
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29709/
Test FAILed.





[GitHub] spark pull request: [SPARK-6602][Core] Replace direct use of Akka ...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5268#issuecomment-89604923
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29710/
Test FAILed.





[GitHub] spark pull request: [SPARK-6006][SQL]: Optimize count distinct for...

2015-04-04 Thread saucam
Github user saucam commented on the pull request:

https://github.com/apache/spark/pull/4764#issuecomment-89604768
  
Fixed the test case for zero count when there is no data. Rebased with 
latest master. Please retest.





[GitHub] spark pull request: [MLLIB] SPARK-4231, SPARK-3066: Add RankingMet...

2015-04-04 Thread debasish83
Github user debasish83 commented on a diff in the pull request:

https://github.com/apache/spark/pull/3098#discussion_r27769592
  
--- Diff: 
examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala ---
@@ -167,23 +169,66 @@ object MovieLensALS {
   .setProductBlocks(params.numProductBlocks)
   .run(training)
 
-val rmse = computeRmse(model, test, params.implicitPrefs)
-
-println(s"Test RMSE = $rmse.")
+params.metrics match {
+  case "rmse" =>
+val rmse = computeRmse(model, test, params.implicitPrefs)
+println(s"Test RMSE = $rmse")
+  case "map" =>
+val (map, users) = computeRankingMetrics(model, training, test, 
numMovies.toInt)
+println(s"Test users $users MAP $map")
+  case _ => println(s"Metrics not defined, options are rmse/map")
+}
 
 sc.stop()
   }
 
   /** Compute RMSE (Root Mean Squared Error). */
-  def computeRmse(model: MatrixFactorizationModel, data: RDD[Rating], 
implicitPrefs: Boolean)
-: Double = {
-
-def mapPredictedRating(r: Double) = if (implicitPrefs) 
math.max(math.min(r, 1.0), 0.0) else r
-
+  def computeRmse(
+model: MatrixFactorizationModel,
+data: RDD[Rating],
+implicitPrefs: Boolean) : Double = {
 val predictions: RDD[Rating] = model.predict(data.map(x => (x.user, 
x.product)))
-val predictionsAndRatings = predictions.map{ x =>
-  ((x.user, x.product), mapPredictedRating(x.rating))
+val predictionsAndRatings = predictions.map { x =>
+  ((x.user, x.product), mapPredictedRating(x.rating, implicitPrefs))
 }.join(data.map(x => ((x.user, x.product), x.rating))).values
 math.sqrt(predictionsAndRatings.map(x => (x._1 - x._2) * (x._1 - 
x._2)).mean())
   }
+
+  def mapPredictedRating(r: Double, implicitPrefs: Boolean) = {
+if (implicitPrefs) math.max(math.min(r, 1.0), 0.0) else r
+  }
+  
+  /** Compute MAP (Mean Average Precision) statistics for top N product 
Recommendation */
+  def computeRankingMetrics(
+model: MatrixFactorizationModel,
+train: RDD[Rating],
+test: RDD[Rating],
+n: Int) : (Double, Long) = {
+val ord = Ordering.by[(Int, Double), Double](x => x._2)
+
+val testUserLabels = test.map {
--- End diff --

I will update with topByKey. Is there a better place to move this 
function? Maybe inside the ALS object, for example? That way I can add a 
test case to guard it.
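For readers following along, MAP (Mean Average Precision) over top-N recommendations, as discussed in this thread, can be sketched as below (hypothetical helpers, not the MLlib `RankingMetrics` implementation):

```python
def average_precision(recommended, relevant):
    """Average precision for one user: precision at each rank where a
    relevant item appears, averaged over the size of the relevant set."""
    hits, score = 0, 0.0
    for i, item in enumerate(recommended):
        if item in relevant:
            hits += 1
            score += hits / (i + 1)   # precision at rank i+1
    return score / max(len(relevant), 1)

def mean_average_precision(per_user):
    """MAP: mean of the per-user average precisions.
    per_user is a list of (recommended_items, relevant_item_set) pairs."""
    return sum(average_precision(r, t) for r, t in per_user) / len(per_user)
```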





[GitHub] spark pull request: [SQL] Use path.makeQualified in newParquet.

2015-04-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/5353





[GitHub] spark pull request: [ML] SPARK-2426: Integrate Breeze NNLS with ML...

2015-04-04 Thread debasish83
Github user debasish83 commented on the pull request:

https://github.com/apache/spark/pull/5005#issuecomment-89594722
  
@mengxr any insight on it? The runtime issue is only in the first iteration, 
and I think you can point out if there is any obvious issue in the way I call 
the solver... looks like something to do with initialization...





[GitHub] spark pull request: [SQL] Use path.makeQualified in newParquet.

2015-04-04 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/5353#issuecomment-89594648
  
LGTM, merging to master and branch-1.3.





[GitHub] spark pull request: [SPARK-6262][MLLIB]Implement missing methods f...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5359#issuecomment-89591756
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29708/
Test PASSed.





[GitHub] spark pull request: [SPARKR-92] Phase 2: implement sum(rdd)

2015-04-04 Thread hqzizania
Github user hqzizania closed the pull request at:

https://github.com/apache/spark/pull/5360





[GitHub] spark pull request: [SPARKR-92] Phase 2: implement sum(rdd)

2015-04-04 Thread hqzizania
GitHub user hqzizania opened a pull request:

https://github.com/apache/spark/pull/5360

[SPARKR-92] Phase 2: implement sum(rdd)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hqzizania/spark R3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5360.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5360


commit 7afa4c9d31fc3a7e9676a75ac51e0983708ccb1a
Author: Shivaram Venkataraman 
Date:   2015-03-01T22:44:59Z

Merge pull request #186 from hlin09/funcDep3

[SPARKR-142][SPARKR-196] (Step 2) Replaces getDependencies() with 
cleanClosure to capture UDF closures and serialize them to worker.

commit 6e51c7ff25388bcf05776fa1ee353401b31b9443
Author: Shivaram Venkataraman 
Date:   2015-03-01T23:00:24Z

Fix stderr redirection on executors

commit 8c4deaedc570c2753a2103d59aba20178d9ef777
Author: Shivaram Venkataraman 
Date:   2015-03-01T23:06:29Z

Remove unused function

commit f7caeb84321f04291214f17a7a6606cb3a0ddee8
Author: Davies Liu 
Date:   2015-03-01T23:11:37Z

Update SparkRBackend.scala

commit b457833ea90575fb11840a18ff616f2d94be2aeb
Author: Shivaram Venkataraman 
Date:   2015-03-01T23:15:05Z

Merge pull request #189 from shivaram/stdErrFix

Fix stderr redirection on executors

commit 862f07c337705337ca8719485e6fe301a711bac7
Author: Shivaram Venkataraman 
Date:   2015-03-01T23:20:35Z

Merge pull request #190 from shivaram/SPARKR-79

[SPARKR-79] Remove unused function

commit 773baf064c923d3f44ea8fdbb5d2f36194245040
Author: Zongheng Yang 
Date:   2015-03-02T00:35:23Z

Merge pull request #178 from davies/random

[SPARKR-204] use random port in backend

commit 5c0bb24bd77a6e1ed4474144f14b6458cdd2c157
Author: Felix Cheung 
Date:   2015-03-02T06:20:41Z

Doc updates: build and running on YARN

commit 8caf5bb81b027aa9e0dc4c3e9d95028d7865e0b9
Author: Davies Liu 
Date:   2015-03-02T19:34:10Z

use S4 methods

commit 7dfe27d06baf5bb00e679ea6a1bb7472295307d4
Author: Davies Liu 
Date:   2015-03-02T20:24:19Z

fix cyclic namespace dependency

commit d7b17a428c27aac28d89e1c85f1ba7d9d4b021d2
Author: Davies Liu 
Date:   2015-03-02T21:07:44Z

fix approxCountDistinct

commit acae5272f0d3c6e853d767ec489e64999306db0f
Author: Davies Liu 
Date:   2015-03-02T21:18:46Z

refactor

commit 8ec21af07caea512cc90c66010d3b7b2dc0fc6e3
Author: Davies Liu 
Date:   2015-03-02T21:40:34Z

fix signature

commit 71d66a1f75f846c77a6e0ece4c40c6d5d5019c06
Author: Davies Liu 
Date:   2015-03-02T21:47:44Z

fix first(0

commit e9983566f93304f2f5624613aedadd1e9d9a5069
Author: cafreeman 
Date:   2015-03-02T22:00:29Z

define generic for 'first' in RDD API

commit f585929cc9edabb3098ed4460eac01237a500e6a
Author: cafreeman 
Date:   2015-03-02T22:02:35Z

Fix brackets

commit 1955a09f83a269d84139891bc29b41d0bcb9a1ae
Author: cafreeman 
Date:   2015-03-02T23:50:12Z

return object instead of a list of one object

commit 76cf2e0ded37175550362ea7474dc9f6866b337b
Author: Shivaram Venkataraman 
Date:   2015-03-03T00:02:26Z

Merge pull request #192 from cafreeman/sparkr-sql

define generic for 'first' in RDD API

commit 03402ebdef99be680c4d0c9c475fd08702d3eb9e
Author: Felix Cheung 
Date:   2015-03-03T00:17:17Z

Updates as per feedback on sparkR-submit

commit 1d0f2ae2097f0838d8c079b0bbcf89fe9805509f
Author: Davies Liu 
Date:   2015-03-03T00:42:34Z

Update DataFrame.R

commit f798402e5ae02853f0477369273c478f7090700a
Author: Davies Liu 
Date:   2015-03-03T00:43:01Z

Update column.R

commit 524c122b0b91ccd73a1eddce465a063d76bd3c47
Author: Davies Liu 
Date:   2015-03-03T00:44:47Z

Merge branch 'sparkr-sql' of github.com:amplab-extras/SparkR-pkg into column

commit 8a676b19475fcafbad925b7ee7fe91ea68e3f3a5
Author: Shivaram Venkataraman 
Date:   2015-03-03T00:59:46Z

Merge pull request #188 from davies/column

[SPARKR-189] [SPARKR-190] Column and expression

commit 06cbc2d233e6c0da062d0984e7cb95d3d9a5a1a1
Author: Davies Liu 
Date:   2015-03-03T01:26:14Z

launch R worker by a daemon

commit 3beadcf9d5ea3db893e469407d2723cfbe6687ef
Author: Davies Liu 
Date:   2015-03-03T01:39:06Z

Merge branch 'sparkr-sql' of github.com:amplab-extras/SparkR-pkg into api

Conflicts:
pkg/R/RDD.R

commit e2d144a798f8ef293467ed8a3eb20b6cf77dcb56
Author: Felix Cheung 
Date:   2015-03-03T01:52:10Z

Fixed small typos

commit 98cc97a7c94a61f290207e4a8481ae97203014c7
Author: Davies Liu 
Date:   2015-03-03T02:01:55Z

fix test and docs

commit 39c253d97224d41abeee52ec486aaed57af270eb
Author: Davies Liu 
Date:   2015-03-03T02:05:19Z

Merge branch 'sparkr-sql' of github.com:amplab-extras/SparkR-pkg into group

Conflicts:
pkg/NAMESPACE
pkg/R/DataFrame.R
pkg/R/utils.R
 

[GitHub] spark pull request: Implement missing methods for MultivariateStat...

2015-04-04 Thread Lewuathe
GitHub user Lewuathe opened a pull request:

https://github.com/apache/spark/pull/5359

Implement missing methods for MultivariateStatisticalSummary

Add the following methods to PySpark's MultivariateStatisticalSummary:
- normL1
- normL2
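As a reference for what these summary statistics compute (a plain-Python sketch, not the PySpark API): `normL1` is the column-wise sum of absolute values and `normL2` the column-wise Euclidean norm.

```python
import math

def col_norms(rows):
    """Compute per-column L1 and L2 norms of a list of equal-length rows,
    as MultivariateStatisticalSummary.normL1/normL2 do for an RDD of vectors."""
    cols = list(zip(*rows))                                  # transpose
    l1 = [sum(abs(v) for v in c) for c in cols]              # sum |x|
    l2 = [math.sqrt(sum(v * v for v in c)) for c in cols]    # sqrt(sum x^2)
    return l1, l2
```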

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Lewuathe/spark SPARK-6262

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5359.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5359


commit cbe439e4703bf5c2550d38b06cc4eada5bef6484
Author: lewuathe 
Date:   2015-04-04T13:34:34Z

Implement missing methods for MultivariateStatisticalSummary







[GitHub] spark pull request: [SQL] [WIP] Tries to skip row groups when read...

2015-04-04 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/5334#discussion_r27768822
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala ---
@@ -226,7 +224,7 @@ private[sql] case class ParquetRelation2(
 private var commonMetadataStatuses: Array[FileStatus] = _
 
 // Parquet footer cache.
-var footers: Map[FileStatus, Footer] = _
+var footers: Map[Path, Footer] = _
--- End diff --

`FileStatus` objects are also cached, so this should be OK. Bounding the 
cache size could be a good idea.





[GitHub] spark pull request: [SPARK-5338][MESOS] Add cluster mode support f...

2015-04-04 Thread dragos
Github user dragos commented on the pull request:

https://github.com/apache/spark/pull/5144#issuecomment-89547152
  
I still want to run this on a local cluster before I say LGTM, but the code 
looks good so far!





[GitHub] spark pull request: [SPARK-5338][MESOS] Add cluster mode support f...

2015-04-04 Thread dragos
Github user dragos commented on a diff in the pull request:

https://github.com/apache/spark/pull/5144#discussion_r27768214
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/DriverQueue.scala 
---
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster.mesos
+
+import scala.collection.mutable
+
+import org.apache.spark.deploy.mesos.MesosDriverDescription
+
+/**
+ * A request queue for launching drivers in Mesos cluster mode.
+ * This queue automatically stores the state after each pop/push
+ * so it can be recovered later.
+ * This queue is also bounded and rejects offers when it's full.
+ * @param state Mesos state abstraction to fetch persistent state.
+ */
+private[mesos] class DriverQueue(state: MesosClusterPersistenceEngine, capacity: Int) {
+  var queue: mutable.Queue[MesosDriverDescription] = new mutable.Queue[MesosDriverDescription]()
+  private var count = 0
+
+  initialize()
+
+  def initialize(): Unit = {
+    state.fetchAll[MesosDriverDescription]().foreach(d => queue.enqueue(d))
+
+    // This size might be larger than the passed in capacity, but we allow
+    // this so we don't lose queued drivers.
+    count = queue.size
+  }
+
+  def isFull = count >= capacity
+
+  def size: Int = count
+
+  def contains(submissionId: String): Boolean = {
+    queue.exists(s => s.submissionId.equals(submissionId))
--- End diff --

You are right, I missed the fact that the queue isn't storing 
`submissionId`s directly. Ignore this :)
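The point settled in the review above — that the queue holds full driver descriptions, so membership must be checked by extracting each element's `submissionId` — can be illustrated with a small standalone sketch. Note this is not Spark's actual class: `BoundedQueue` and `Submission` are hypothetical names invented for the example.

```scala
// Minimal sketch of a bounded queue that stores whole submission objects
// and checks membership by id, as in the DriverQueue diff above.
// BoundedQueue and Submission are illustrative names, not Spark APIs.
object BoundedQueueSketch {
  import scala.collection.mutable

  final case class Submission(submissionId: String, payload: String)

  class BoundedQueue(capacity: Int) {
    private val queue = mutable.Queue[Submission]()

    def isFull: Boolean = queue.size >= capacity

    // Reject the submission instead of growing past capacity.
    def offer(s: Submission): Boolean =
      if (isFull) false else { queue.enqueue(s); true }

    // The queue stores Submission objects, not raw ids, so we must
    // project out submissionId to test membership.
    def contains(submissionId: String): Boolean =
      queue.exists(_.submissionId == submissionId)
  }

  def main(args: Array[String]): Unit = {
    val q = new BoundedQueue(2)
    assert(q.offer(Submission("driver-001", "job A")))
    assert(q.offer(Submission("driver-002", "job B")))
    assert(!q.offer(Submission("driver-003", "job C"))) // rejected: full
    assert(q.contains("driver-001"))
    assert(!q.contains("driver-003"))
  }
}
```

Checking `queue.contains(submissionId)` directly would always return false here, since a `String` never equals a `Submission` — which is exactly why the original diff uses `exists` with a projection.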





[GitHub] spark pull request: [SQL] SPARK-6489: Optimize lateral view with e...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5358#issuecomment-89536841
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SQL] SPARK-6489: Optimize lateral view...

2015-04-04 Thread dreamquster
Github user dreamquster closed the pull request at:

https://github.com/apache/spark/pull/5346





[GitHub] spark pull request: [SQL] SPARK-6489: Optimize lateral view...

2015-04-04 Thread dreamquster
Github user dreamquster commented on the pull request:

https://github.com/apache/spark/pull/5346#issuecomment-89536605
  
OK, I split it into two pull requests.





[GitHub] spark pull request: [SQL] SPARK-6489: Optimize lateral view with e...

2015-04-04 Thread dreamquster
GitHub user dreamquster opened a pull request:

https://github.com/apache/spark/pull/5358

[SQL] SPARK-6489: Optimize lateral view with explode to not read unnecessary 
columns.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dreamquster/spark spark-explode-optimize

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5358.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5358


commit 1b29835b2beaba53f6cec3c02680002ad89802f5
Author: dreamquster 
Date:   2015-04-04T08:42:47Z

[SQL] SPARK-6489: Optimize lateral view with explode to not read 
unnecessary columns

commit 376d332462a3e2a21a28ecdeab14a3bd1f49ffbf
Author: dreamquster 
Date:   2015-04-04T09:14:04Z

[SQL] SPARK-6489: adding test files







[GitHub] spark pull request: [SPARK-6130] [SQL] support if not exists for i...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4865#issuecomment-89527734
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29706/
Test PASSed.





[GitHub] spark pull request: [SQL] SPARK-6548: Adding stddev to DataFrame f...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5357#issuecomment-89527728
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SQL] SPARK-6548: Adding stddev to DataFrame f...

2015-04-04 Thread dreamquster
GitHub user dreamquster opened a pull request:

https://github.com/apache/spark/pull/5357

[SQL] SPARK-6548: Adding stddev to DataFrame functions

remerge SPARK-6548
https://github.com/apache/spark/pull/5228


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dreamquster/spark spark-stddev

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5357.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5357


commit 58dec463b8280b425b5534bdeb28b013ae02eec4
Author: dreamquster 
Date:   2015-04-04T08:13:56Z

[SQL] SPARK-6548: Adding stddev to DataFrame functions







[GitHub] spark pull request: [SPARK-6705][MLLIB] Add fit intercept api to m...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5301#issuecomment-89516868
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29707/
Test FAILed.





[GitHub] spark pull request: [SPARK-6705][MLLIB] Add fit intercept api to m...

2015-04-04 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/5301#issuecomment-89516426
  
ok to test





[GitHub] spark pull request: [SPARK-6638] [SQL] Improve performance of Stri...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5350#issuecomment-89514738
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29703/
Test PASSed.





[GitHub] spark pull request: Bug fix for SPARK-5242: "ec2/spark_ec2.py lauc...

2015-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4038#issuecomment-89514490
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29705/
Test PASSed.

