[GitHub] spark pull request: [SPARK-6972][SQL] Add Coalesce to DataFrame

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5545#issuecomment-93866737
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30442/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [BUILD] Support building with SBT on encrypted...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5546#issuecomment-93877539
  
  [Test build #30452 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30452/consoleFull) for PR 5546 at commit [`031c602`](https://github.com/apache/spark/commit/031c6025113c064b6fc0b5895b1830f223f6cf55).





[GitHub] spark pull request: [SPARK-5623][GraphX] Replace an obsolete mapRe...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4402#issuecomment-93873823
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30450/
Test FAILed.





[GitHub] spark pull request: [SPARK-6963][CORE]Flaky test: o.a.s.ContextCle...

2015-04-17 Thread witgo
GitHub user witgo opened a pull request:

https://github.com/apache/spark/pull/5548

[SPARK-6963][CORE]Flaky test: o.a.s.ContextCleanerSuite automatically 
cleanup checkpoint

cc @andrewor14

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/witgo/spark SPARK-6963

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5548.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5548


commit b08b3c994902629b54808c27841335fb6ca2715d
Author: GuoQiang Li wi...@qq.com
Date:   2015-04-17T01:56:11Z

Flaky test: o.a.s.ContextCleanerSuite automatically cleanup checkpoint







[GitHub] spark pull request: [SPARK-6368][SQL] Build a specialized serializ...

2015-04-17 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/5497#discussion_r28563536
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala ---
@@ -139,6 +141,8 @@ private[sql] class SQLConf extends Serializable {
*/
   private[spark] def codegenEnabled: Boolean = getConf(CODEGEN_ENABLED, "false").toBoolean
 
+  private[spark] def useSqlSerializer2: Boolean = getConf(USE_SQL_SERIALIZER2, "false").toBoolean
--- End diff --

Also, do we want to turn it on by default? It's easy to turn off if we find bugs.
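For context, SQLConf stores every setting as a string and parses it at the accessor, so a flag like this flips with a single conf change. A toy Python analogue of that pattern (the key name below is hypothetical, and this is not Spark's API):

```python
class SQLConfSketch:
    """Toy analogue of SQLConf's string-keyed settings (not Spark's API)."""

    def __init__(self):
        self._settings = {}

    def set(self, key, value):
        self._settings[key] = value

    def get_conf(self, key, default):
        return self._settings.get(key, default)

    @property
    def use_sql_serializer2(self):
        # Stored as a string, parsed on access, like getConf(..., "false").toBoolean
        return self.get_conf("spark.sql.useSerializer2", "false") == "true"

conf = SQLConfSketch()
assert conf.use_sql_serializer2 is False
conf.set("spark.sql.useSerializer2", "true")
assert conf.use_sql_serializer2 is True
```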





[GitHub] spark pull request: [SPARK-6350][Mesos] Make mesosExecutorCores co...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5063#issuecomment-93882452
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30448/
Test PASSed.





[GitHub] spark pull request: [SPARK-6957] [SPARK-6958] [SQL] improve API co...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5544#issuecomment-93865283
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30440/
Test PASSed.





[GitHub] spark pull request: [SPARK-6065] [MLlib] Optimize word2vec.findSyn...

2015-04-17 Thread MechCoder
Github user MechCoder commented on a diff in the pull request:

https://github.com/apache/spark/pull/5467#discussion_r28573221
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala ---
@@ -479,9 +492,16 @@ class Word2VecModel private[mllib] (
*/
   def findSynonyms(vector: Vector, num: Int): Array[(String, Double)] = {
     require(num > 0, "Number of similar words should > 0")
-    // TODO: optimize top-k
-    val fVector = vector.toArray.map(_.toFloat)
-    model.mapValues(vec => cosineSimilarity(fVector, vec))
+
+    val numWords = wordVectors.numRows
+    val cosineVec = Vectors.zeros(numWords).asInstanceOf[DenseVector]
+    BLAS.gemv(1.0, wordVectors, vector.asInstanceOf[DenseVector], 0.0, cosineVec)
+
+    // Need not divide with the norm of the given vector since it is constant.
+    val updatedCosines = indexedModel.map { case (_, ind) =>
--- End diff --

Do you mean that when I do a `dict.map`, the ordering need not be the same 
as that of the dict?
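The gemv rewrite under review computes every word's dot product with the query in one matrix-vector multiply, then divides only by the per-word norms; the query's norm is constant across words, so skipping it leaves the top-k ranking unchanged. A rough pure-Python sketch of that idea (not Spark's code; names are illustrative):

```python
from math import sqrt

def find_synonyms(word_vectors, query, num):
    # word_vectors: list of embeddings, one per word.
    # A single pass of dot products stands in for the BLAS gemv call.
    cosines = [sum(w * q for w, q in zip(vec, query)) for vec in word_vectors]
    # Divide by each word's norm only; the query's norm scales every
    # entry equally, so it doesn't affect the top-k ordering.
    cosines = [c / sqrt(sum(w * w for w in vec))
               for c, vec in zip(cosines, word_vectors)]
    ranked = sorted(range(len(cosines)), key=lambda i: -cosines[i])
    return ranked[:num]

vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(find_synonyms(vecs, [1.0, 0.0], 2))  # [0, 2]
```

The remaining question in the thread is only about ordering: a top-k over the scores needs a stable word-to-index mapping, which is why the model is indexed explicitly rather than relying on map iteration order.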





[GitHub] spark pull request: [SPARK-6975][Yarn] Fix argument validation err...

2015-04-17 Thread jerryshao
GitHub user jerryshao opened a pull request:

https://github.com/apache/spark/pull/5551

[SPARK-6975][Yarn] Fix argument validation error

`numExecutors` validation fails when dynamic allocation is enabled with the default configuration. Details can be seen in 
[SPARK-6975](https://issues.apache.org/jira/browse/SPARK-6975). @sryza, please 
help review this; I'm not sure this is the correct way, I think you changed this part previously :)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jerryshao/apache-spark SPARK-6975

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5551.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5551


commit 77bdcbdc00522e76f9394c68d769f35c15af09a6
Author: jerryshao saisai.s...@intel.com
Date:   2015-04-17T07:08:16Z

Fix argument validation error







[GitHub] spark pull request: [SPARK-6065] [MLlib] Optimize word2vec.findSyn...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5467#issuecomment-93914210
  
  [Test build #30464 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30464/consoleFull) for PR 5467 at commit [`64575b0`](https://github.com/apache/spark/commit/64575b0282b350facc93340fbf653b38b0121b1a).





[GitHub] spark pull request: [SPARK-6953] [PySpark] speed up python tests

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5427#issuecomment-93897975
  
  [Test build #30458 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30458/consoleFull) for PR 5427 at commit [`2654bfd`](https://github.com/apache/spark/commit/2654bfda79da9d12c897bc144da2b2137a56c68c).





[GitHub] spark pull request: [Project Infra] SPARK-1684: Merge script shoul...

2015-04-17 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/5149#discussion_r28569271
  
--- Diff: dev/merge_spark_pr.py ---
@@ -286,68 +281,137 @@ def resolve_jira_issues(title, merge_branches, 
comment):
 resolve_jira_issue(merge_branches, comment, jira_id)
 
 
-branches = get_json("%s/branches" % GITHUB_API_BASE)
-branch_names = filter(lambda x: x.startswith("branch-"), [x['name'] for x in branches])
-# Assumes branch names can be sorted lexicographically
-latest_branch = sorted(branch_names, reverse=True)[0]
-
-pr_num = raw_input("Which pull request would you like to merge? (e.g. 34): ")
-pr = get_json("%s/pulls/%s" % (GITHUB_API_BASE, pr_num))
-pr_events = get_json("%s/issues/%s/events" % (GITHUB_API_BASE, pr_num))
-
-url = pr["url"]
-title = pr["title"]
-body = pr["body"]
-target_ref = pr["base"]["ref"]
-user_login = pr["user"]["login"]
-base_ref = pr["head"]["ref"]
-pr_repo_desc = "%s/%s" % (user_login, base_ref)
-
-# Merged pull requests don't appear as merged in the GitHub API;
-# Instead, they're closed by asfgit.
-merge_commits = \
-    [e for e in pr_events if e["actor"]["login"] == "asfgit" and e["event"] == "closed"]
-
-if merge_commits:
-    merge_hash = merge_commits[0]["commit_id"]
-    message = get_json("%s/commits/%s" % (GITHUB_API_BASE, merge_hash))["commit"]["message"]
-
-    print "Pull request %s has already been merged, assuming you want to backport" % pr_num
-    commit_is_downloaded = run_cmd(['git', 'rev-parse', '--quiet', '--verify',
+def standardize_jira_ref(text):
+    """
+    Standardize the [MODULE] SPARK-XXXXX prefix
+    Converts "[SPARK-XXX][mllib] Issue", "[MLLib] SPARK-XXX. Issue" or "SPARK XXX [MLLIB]: Issue" to "[MLLIB] SPARK-XXX: Issue"
+
+    >>> standardize_jira_ref("[SPARK-5821] [SQL] ParquetRelation2 CTAS should check if delete is successful")
+    '[SQL] SPARK-5821: ParquetRelation2 CTAS should check if delete is successful'
+    >>> standardize_jira_ref("[SPARK-4123][Project Infra][WIP]: Show new dependencies added in pull requests")
+    '[PROJECT INFRA] [WIP] SPARK-4123: Show new dependencies added in pull requests'
+    >>> standardize_jira_ref("[MLlib] Spark  5954: Top by key")
+    '[MLLIB] SPARK-5954: Top by key'
+    """
+    # If the string is compliant, no need to process any further
+    if (re.search(r'\[[A-Z0-9_]+\] SPARK-[0-9]{3,5}: \S+', text)):
+        return text
+
+    # Extract JIRA ref(s):
+    jira_refs = deque()
+    pattern = re.compile(r'(SPARK[-\s]*[0-9]{3,5})', re.IGNORECASE)
+    while (pattern.search(text) is not None):
+        ref = pattern.search(text).groups()[0]
+        # Replace any whitespace with a dash & convert to uppercase
+        jira_refs.append(re.sub(r'\s+', '-', ref.upper()))
+        text = text.replace(ref, '')
+
+    # Extract spark component(s):
+    components = deque()
+    # Look for alphanumeric chars, spaces, and/or commas
+    pattern = re.compile(r'(\[[\w\s,]+\])', re.IGNORECASE)
+    while (pattern.search(text) is not None):
+        component = pattern.search(text).groups()[0]
+        # Convert to uppercase
+        components.append(component.upper())
+        text = text.replace(component, '')
+
+    # Cleanup remaining symbols:
+    pattern = re.compile(r'^\W+(.*)', re.IGNORECASE)
+    if (pattern.search(text) is not None):
+        text = pattern.search(text).groups()[0]
+
+    # Assemble full text (module(s), JIRA ref(s), remaining text)
+    if (len(components) < 1):
+        components = ""
+    component_text = ' '.join(components).strip()
+    if (len(jira_refs) < 1):
+        jira_ref_text = ""
+    jira_ref_text = ' '.join(jira_refs).strip()
+
+    if (len(jira_ref_text) < 1 and len(component_text) < 1):
+        clean_text = text.strip()
+    elif (len(jira_ref_text) < 1):
+        clean_text = component_text + ' ' + text.strip()
+    elif (len(component_text) < 1):
+        clean_text = jira_ref_text + ': ' + text.strip()
+    else:
+        clean_text = component_text + ' ' + jira_ref_text + ': ' + text.strip()
+
+    return clean_text
+
+def main():
+    os.chdir(SPARK_HOME)
--- End diff --

Just to be sure since it's a bit tricky with the diff here - all of this is 
simply re-organization, correct?
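The core of the new helper is the JIRA-ref regex; its behavior on the doctest inputs from the diff can be checked in isolation with a small standalone sketch:

```python
import re

# The extraction pattern from the diff above: "SPARK", optional dash or
# whitespace, 3-5 digits, matched case-insensitively, then normalized.
PATTERN = re.compile(r'(SPARK[-\s]*[0-9]{3,5})', re.IGNORECASE)

def normalize_ref(title):
    m = PATTERN.search(title)
    if m is None:
        return None
    # Replace any whitespace with a dash and convert to uppercase
    return re.sub(r'\s+', '-', m.group(1).upper())

print(normalize_ref("[MLlib] Spark  5954: Top by key"))          # SPARK-5954
print(normalize_ref("[SPARK-5821] [SQL] ParquetRelation2 CTAS")) # SPARK-5821
```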



[GitHub] spark pull request: [SPARK-6113] [ml] Stabilize DecisionTree API

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5530#issuecomment-93864879
  
  [Test build #30439 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30439/consoleFull) for PR 5530 at commit [`6aae255`](https://github.com/apache/spark/commit/6aae25587cdcadc0e5d68078ca77d0cdee59e6e4).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class Params(`
  * `sealed abstract class Node extends Serializable `
  * `sealed trait Split extends Serializable `
  * `final class CategoricalSplit(`
  * `final class ContinuousSplit(override val featureIndex: Int, val threshold: Double) extends Split `
  * `trait DecisionTreeModel `

 * This patch does not change any dependencies.





[GitHub] spark pull request: [SQL] SPARK-6548: Adding stddev to DataFrame f...

2015-04-17 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/5357#issuecomment-93868139
  
/cc @rxin 





[GitHub] spark pull request: [SPARK-6368][SQL] Build a specialized serializ...

2015-04-17 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/5497#discussion_r28563376
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala ---
@@ -139,6 +141,8 @@ private[sql] class SQLConf extends Serializable {
*/
   private[spark] def codegenEnabled: Boolean = getConf(CODEGEN_ENABLED, "false").toBoolean
 
+  private[spark] def useSqlSerializer2: Boolean = getConf(USE_SQL_SERIALIZER2, "false").toBoolean
--- End diff --

This seems pretty hard as there is no standard interface to the serializer 
constructor.

Perhaps we should document this and say it is experimental?
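The concern here — no common constructor interface across serializer implementations — is a classic pluggability problem. One conventional workaround (a generic sketch, not Spark's design) is to register zero-argument factories and instantiate by name, hiding each implementation's real constructor behind a uniform call signature:

```python
from typing import Callable, Dict

class Serializer:
    """Minimal serializer interface (illustrative only)."""
    def serialize(self, obj) -> bytes:
        raise NotImplementedError

class ReprSerializer(Serializer):
    def serialize(self, obj) -> bytes:
        return repr(obj).encode()

# A registry of zero-argument factories sidesteps the fact that the
# concrete classes may each take different constructor arguments.
SERIALIZERS: Dict[str, Callable[[], Serializer]] = {
    "repr": ReprSerializer,
}

def make_serializer(name: str) -> Serializer:
    return SERIALIZERS[name]()

print(make_serializer("repr").serialize([1, 2]))  # b'[1, 2]'
```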





[GitHub] spark pull request: [SPARK-5213] [SQL] Pluggable SQL Parser Suppor...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4015#issuecomment-93900302
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30460/
Test FAILed.





[GitHub] spark pull request: [SPARK-3454] [WIP] separate json endpoints for...

2015-04-17 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/4435#issuecomment-93898160
  
I'd like to finish reviewing this, but I keep getting pre-empted by other 
work, so instead I'll leave a list of things that I would look at / check when 
reviewing this (to let other folks pick up and finish the review).  This looks 
like it's in pretty good shape overall, though, so hopefully it won't be too 
much work to finish this.

Here's what I'd look at in any final review passes:

- Has the visibility of new classes / methods / interfaces been restricted 
to the narrowest possible scope (i.e. are we unintentionally exposing internal 
functionality)?  If something _has_ to be public but is not intended to be 
stable / available to users, we should add a documentation comment to explain 
this.
- Have accesses to listeners been properly synchronized?
- Are there any code style nits that we should clean up?  I noticed a bunch 
of minor indentation problems, but don't really have time to comment 
individually.
- I'd take a look at how we handle timestamps in JSON, just to double-check 
that we're exposing them in an easy-to-consume format.
- Documentation-wise, are there any confusing parts of the code that need 
to be documented?
- Can we add a top-level Javadoc comment somewhere to explain our overall 
strategy for handling JSON compatibility, etc., and maybe a checklist / rules to 
follow when changing these classes? There's something similar to this in one 
of the JSONProtocol classes, which might be a nice model.

I'd also manually test this in a spark-shell.





[GitHub] spark pull request: [SPARK-5352][GraphX] Add getPartitionStrategy ...

2015-04-17 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/5549

[SPARK-5352][GraphX] Add getPartitionStrategy in Graph 

Graph remembers the partition strategy applied in partitionBy() and returns 
it via getPartitionStrategy().
This is useful in situations like the following:

val g1 = GraphLoader.edgeListFile(sc, "graph.txt")
val g2 = g1.partitionBy(EdgePartition2D, 2)

// Modify (e.g., add, contract, ...) edges in g2
val newEdges = ...

// Re-build a new graph based on g2
val g3 = Graph(g1.vertices, newEdges)

// Partition edges in a similar way to g2
val g4 = g3.partitionBy(g2.getPartitionStrategy, 2)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark PartitionStrategyInGraph

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5549.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5549


commit 084ae5a80c96cb481c2b7d3f5aced99b09619057
Author: Takeshi YAMAMURO linguin@gmail.com
Date:   2015-04-17T04:05:13Z

Add getPartitionStrategy

commit c46d126a044d089f70b1c38b3cdb4979b6ffe589
Author: Takeshi YAMAMURO linguin@gmail.com
Date:   2015-04-17T04:54:38Z

Add an new entry in MimaExlucdes.scala







[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-17 Thread maropu
Github user maropu commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-93917004
  
@viper-kun What's the status of this patch? If you don't make further 
updates, I'd like to brush up this patch.





[GitHub] spark pull request: [SPARK-5213] [SQL] Pluggable SQL Parser Suppor...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4015#issuecomment-93900300
  
  [Test build #30460 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30460/consoleFull) for PR 4015 at commit [`12249a2`](https://github.com/apache/spark/commit/12249a2ea065effc00c8ad67a3d2f9eef5e8878b).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-5352][GraphX] Add getPartitionStrategy ...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5549#issuecomment-93893522
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-5338][MESOS] Add cluster mode support f...

2015-04-17 Thread tnachen
Github user tnachen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5144#discussion_r28573550
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
 ---
@@ -0,0 +1,614 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster.mesos
+
+import java.io.File
+import java.util.concurrent.locks.ReentrantLock
+import java.util.{Collections, Date, List => JList}
+
+import org.apache.mesos.Protos.Environment.Variable
+import org.apache.mesos.Protos.TaskStatus.Reason
+import org.apache.mesos.Protos.{TaskState => MesosTaskState, _}
+import org.apache.mesos.{Scheduler, SchedulerDriver}
+import org.apache.spark.deploy.mesos.MesosDriverDescription
+import org.apache.spark.deploy.rest.{CreateSubmissionResponse, 
KillSubmissionResponse, SubmissionStatusResponse}
+import org.apache.spark.metrics.MetricsSystem
+import org.apache.spark.util.Utils
+import org.apache.spark.{SecurityManager, SparkConf, SparkException, 
TaskState}
+
+import scala.collection.JavaConversions._
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+
+/**
+ * Tracks the current state of a Mesos Task that runs a Spark driver.
+ * @param submission Submitted driver description from
+ *   [[org.apache.spark.deploy.rest.mesos.MesosRestServer]]
+ * @param taskId Mesos TaskID generated for the task
+ * @param slaveId Slave ID that the task is assigned to
+ * @param taskState The last known task status update.
+ * @param startDate The date the task was launched
+ */
+private[spark] class MesosClusterTaskState(
+val submission: MesosDriverDescription,
+val taskId: TaskID,
+val slaveId: SlaveID,
+var taskState: Option[TaskStatus],
+var startDate: Date)
+  extends Serializable {
+
+  def copy(): MesosClusterTaskState = {
+new MesosClusterTaskState(
+  submission, taskId, slaveId, taskState, startDate)
+  }
+}
+
+/**
+ * Tracks the retry state of a driver, which includes the next time it 
should be scheduled
+ * and necessary information to do exponential backoff.
+ * This class is not thread-safe, and we expect the caller to handle 
synchronizing state.
+ * @param lastFailureStatus Last Task status when it failed.
+ * @param retries Number of times it has retried.
+ * @param nextRetry Next retry time to be scheduled.
+ * @param waitTime The amount of time driver is scheduled to wait until 
next retry.
+ */
+private[spark] class RetryState(
+val lastFailureStatus: TaskStatus,
+val retries: Int,
+val nextRetry: Date,
+val waitTime: Int) extends Serializable {
+  def copy(): RetryState =
+new RetryState(lastFailureStatus, retries, nextRetry, waitTime)
+}
+
+/**
+ * The full state of the cluster scheduler, currently being used for 
displaying
+ * information on the UI.
+ * @param frameworkId Mesos Framework id for the cluster scheduler.
+ * @param masterUrl The Mesos master url
+ * @param queuedDrivers All drivers queued to be launched
+ * @param launchedDrivers All launched or running drivers
+ * @param finishedDrivers All terminated drivers
+ * @param retryList All drivers pending to be retried
+ */
+private[spark] class MesosClusterSchedulerState(
+val frameworkId: String,
+val masterUrl: Option[String],
+val queuedDrivers: Iterable[MesosDriverDescription],
+val launchedDrivers: Iterable[MesosClusterTaskState],
+val finishedDrivers: Iterable[MesosClusterTaskState],
+val retryList: Iterable[MesosDriverDescription])
+
+/**
+ * A Mesos scheduler that is responsible for launching submitted Spark 
drivers in cluster mode
+ * as Mesos tasks in a Mesos cluster.
+ * All drivers are launched asynchronously by the framework, which will 
eventually be launched
+ * by one of the 

[GitHub] spark pull request: [SPARK-6957] [SPARK-6958] [SQL] improve API co...

2015-04-17 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/5544#discussion_r28570802
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -999,6 +1017,13 @@ def _to_java_column(col):
 return jcol
 
 
+def _to_seq(sc, cols, converter=None):
--- End diff --

done





[GitHub] spark pull request: [SPARK-6966][SQL] Use correct ClassLoader for ...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5543#issuecomment-93862506
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30438/
Test PASSed.





[GitHub] spark pull request: [SPARK-6899][SQL] Fix type mismatch when using...

2015-04-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/5517





[GitHub] spark pull request: [SPARK-6156][CORE]Not cache in memory again wh...

2015-04-17 Thread suyanNone
Github user suyanNone commented on the pull request:

https://github.com/apache/spark/pull/4886#issuecomment-93885511
  
@srowen  
I forgot to update the description; it's refined now.

If the program reaches `if (!putLevel.useMemory) {`, it means we are putting a 
disk-level block, or a MEMORY_AND_DISK-level block whose put into memory failed 
and which we then try to put on disk (so `putLevel.useMemory` is false while the 
block's `level.useMemory` is true).
I'm not sure it is reasonable to put it twice within such a short time.

```
if (!putLevel.useMemory) {
  /*
   * This RDD is not to be cached in memory, so we can just pass the 
computed values as an
   * iterator directly to the BlockManager rather than first fully 
unrolling it in memory.
   */
  updatedBlocks ++=
blockManager.putIterator(key, values, level, tellMaster = true, 
effectiveStorageLevel)
  blockManager.getLocal(key, !level.useMemory) match {
case Some(v) => v.data.asInstanceOf[Iterator[T]]
```
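For readers following the flag combinations: the case described above — `putLevel.useMemory` false while the block's own `level.useMemory` is true — can be modeled with a minimal Python sketch (hypothetical names, not Spark's actual API):

```python
# Hypothetical model of the storage-level flags under discussion -- not
# Spark's actual API; Level, MEMORY_AND_DISK, DISK_ONLY, and
# effective_put_level are made-up names for illustration.
class Level:
    def __init__(self, use_memory, use_disk):
        self.use_memory = use_memory
        self.use_disk = use_disk

MEMORY_AND_DISK = Level(use_memory=True, use_disk=True)
DISK_ONLY = Level(use_memory=False, use_disk=True)

def effective_put_level(level, memory_put_failed):
    # A MEMORY_AND_DISK block whose memory put failed falls back to a
    # disk-only put level, so the put level's use_memory is False even
    # though the block's own level.use_memory stays True.
    if level.use_memory and memory_put_failed:
        return DISK_ONLY
    return level
```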






[GitHub] spark pull request: [BUILD] Support building with SBT on encrypted...

2015-04-17 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/5546#issuecomment-93872501
  
Talked with @pwendell off-line, and it seems that since we don't publish with 
SBT this is pretty safe.  He asked me to update the docs to make it clear why 
we don't do this for Maven (though if someone from the Scala side says this is 
safe, I'd argue for doing it there too).

@srowen any further objections to merging this?





[GitHub] spark pull request: [SPARK-6869][PySpark] Add pyspark archives pat...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5478#issuecomment-93902319
  
  [Test build #30462 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30462/consoleFull)
 for   PR 5478 at commit 
[`547fd95`](https://github.com/apache/spark/commit/547fd957ba224c86cf828890562b2eafde2b8ecb).





[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5498#issuecomment-93909594
  
  [Test build #30463 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30463/consoleFull)
 for   PR 5498 at commit 
[`22d65ca`](https://github.com/apache/spark/commit/22d65cac9bb22a9cdda5019042acca0c66e46270).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-6845] [MLlib] [PySpark] Add isTranposed...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5455#issuecomment-93896772
  
  [Test build #30457 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30457/consoleFull)
 for   PR 5455 at commit 
[`151f3b6`](https://github.com/apache/spark/commit/151f3b67dbdd07462b00125c696d987a3cebb6ad).





[GitHub] spark pull request: [SPARK-6806] [SparkR] [Docs] Fill in SparkR ex...

2015-04-17 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/5442#issuecomment-93897920
  
@shivaram Should we merge this or wait for API audit?





[GitHub] spark pull request: [SPARK-5338][MESOS] Add cluster mode support f...

2015-04-17 Thread tnachen
Github user tnachen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5144#discussion_r28570218
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
 ---
@@ -0,0 +1,614 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster.mesos
+
+import java.io.File
+import java.util.concurrent.locks.ReentrantLock
+import java.util.{Collections, Date, List => JList}
+
+import org.apache.mesos.Protos.Environment.Variable
+import org.apache.mesos.Protos.TaskStatus.Reason
+import org.apache.mesos.Protos.{TaskState => MesosTaskState, _}
+import org.apache.mesos.{Scheduler, SchedulerDriver}
+import org.apache.spark.deploy.mesos.MesosDriverDescription
+import org.apache.spark.deploy.rest.{CreateSubmissionResponse, 
KillSubmissionResponse, SubmissionStatusResponse}
+import org.apache.spark.metrics.MetricsSystem
+import org.apache.spark.util.Utils
+import org.apache.spark.{SecurityManager, SparkConf, SparkException, 
TaskState}
+
+import scala.collection.JavaConversions._
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+
+/**
+ * Tracks the current state of a Mesos Task that runs a Spark driver.
+ * @param submission Submitted driver description from
+ *   [[org.apache.spark.deploy.rest.mesos.MesosRestServer]]
+ * @param taskId Mesos TaskID generated for the task
+ * @param slaveId Slave ID that the task is assigned to
+ * @param taskState The last known task status update.
+ * @param startDate The date the task was launched
+ */
+private[spark] class MesosClusterTaskState(
+val submission: MesosDriverDescription,
+val taskId: TaskID,
+val slaveId: SlaveID,
+var taskState: Option[TaskStatus],
+var startDate: Date)
+  extends Serializable {
+
+  def copy(): MesosClusterTaskState = {
+new MesosClusterTaskState(
+  submission, taskId, slaveId, taskState, startDate)
+  }
+}
+
+/**
+ * Tracks the retry state of a driver, which includes the next time it 
should be scheduled
+ * and necessary information to do exponential backoff.
+ * This class is not thread-safe, and we expect the caller to handle 
synchronizing state.
+ * @param lastFailureStatus Last Task status when it failed.
+ * @param retries Number of times it has retried.
+ * @param nextRetry Next retry time to be scheduled.
+ * @param waitTime The amount of time driver is scheduled to wait until 
next retry.
+ */
+private[spark] class RetryState(
+val lastFailureStatus: TaskStatus,
+val retries: Int,
+val nextRetry: Date,
+val waitTime: Int) extends Serializable {
+  def copy(): RetryState =
+new RetryState(lastFailureStatus, retries, nextRetry, waitTime)
+}
+
+/**
+ * The full state of the cluster scheduler, currently being used for 
displaying
+ * information on the UI.
+ * @param frameworkId Mesos Framework id for the cluster scheduler.
+ * @param masterUrl The Mesos master url
+ * @param queuedDrivers All drivers queued to be launched
+ * @param launchedDrivers All launched or running drivers
+ * @param finishedDrivers All terminated drivers
+ * @param retryList All drivers pending to be retried
+ */
+private[spark] class MesosClusterSchedulerState(
+val frameworkId: String,
+val masterUrl: Option[String],
+val queuedDrivers: Iterable[MesosDriverDescription],
+val launchedDrivers: Iterable[MesosClusterTaskState],
+val finishedDrivers: Iterable[MesosClusterTaskState],
+val retryList: Iterable[MesosDriverDescription])
+
+/**
+ * A Mesos scheduler that is responsible for launching submitted Spark 
drivers in cluster mode
+ * as Mesos tasks in a Mesos cluster.
+ * All drivers are launched asynchronously by the framework, which will 
eventually be launched
+ * by one of the 
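The `RetryState` fields quoted above (`retries`, `nextRetry`, `waitTime`) imply a small piece of exponential-backoff bookkeeping. As a rough illustration only — the doubling rule and the function name `advance_retry` are assumptions for this sketch, not Spark's code — in Python:

```python
from datetime import datetime, timedelta

def advance_retry(retries, wait_time_secs, now):
    """Record one more failure: double the wait (exponential backoff)
    and schedule the next attempt that far in the future."""
    new_wait = wait_time_secs * 2
    next_retry = now + timedelta(seconds=new_wait)
    return retries + 1, new_wait, next_retry

# After a failure at t0 with a 1-second wait, the next attempt is at t0 + 2s.
retries, wait, when = advance_retry(1, 1, datetime(2015, 4, 17, 0, 0, 0))
```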

[GitHub] spark pull request: [SPARK-5338][MESOS] Add cluster mode support f...

2015-04-17 Thread tnachen
Github user tnachen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5144#discussion_r28573523
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
 ---

[GitHub] spark pull request: [SPARK-5623][GraphX] Replace an obsolete mapRe...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4402#issuecomment-93892862
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30454/
Test PASSed.





[GitHub] spark pull request: [SPARK-5213] [SQL] Pluggable SQL Parser Suppor...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4015#issuecomment-93922737
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30461/
Test FAILed.





[GitHub] spark pull request: [SPARK-6845] [MLlib] [PySpark] Add isTranposed...

2015-04-17 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/5455#issuecomment-93903354
  
@mengxr It would be really helpful if you could guide me on my two questions.





[GitHub] spark pull request: [SPARK-5213] [SQL] Pluggable SQL Parser Suppor...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4015#issuecomment-93899838
  
  [Test build #30460 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30460/consoleFull)
 for   PR 4015 at commit 
[`12249a2`](https://github.com/apache/spark/commit/12249a2ea065effc00c8ad67a3d2f9eef5e8878b).





[GitHub] spark pull request: [SPARK-6604][PySpark]Specify ip of python serv...

2015-04-17 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request:

https://github.com/apache/spark/pull/5256#issuecomment-93915251
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-6806] [SparkR] [Docs] Fill in SparkR ex...

2015-04-17 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/5442#discussion_r28571272
  
--- Diff: docs/programming-guide.md ---
@@ -576,6 +660,34 @@ before the `reduce`, which would cause `lineLengths` 
to be saved in memory after
 
 </div>
 
+<div data-lang="r" markdown="1">
+
+To illustrate RDD basics, consider the simple program below:
+
+{% highlight r %}
+lines <- textFile(sc, "data.txt")
+lineLengths <- map(lines, length)
+totalLength <- reduce(lineLengths, "+")
+{% endhighlight %}
+
+The first line defines a base RDD from an external file. This dataset is 
not loaded in memory or
+otherwise acted on: `lines` is merely a pointer to the file.
+The second line defines `lineLengths` as the result of a `map` 
transformation. Again, `lineLengths`
+is *not* immediately computed, due to laziness.
+Finally, we run `reduce`, which is an action. At this point Spark breaks 
the computation into tasks
+to run on separate machines, and each machine runs both its part of the 
map and a local reduction,
+returning only its answer to the driver program.
+
+If we also wanted to use `lineLengths` again later, we could add:
+
+{% highlight r %}
+persist(lineLengths)
--- End diff --

Added a default value for `newLevel` of `persist`





[GitHub] spark pull request: [SPARK-5623][GraphX] Replace an obsolete mapRe...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4402#issuecomment-93892854
  
  [Test build #30454 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30454/consoleFull)
 for   PR 4402 at commit 
[`182b39b`](https://github.com/apache/spark/commit/182b39bb6818c168fbc23d07f653d4af0ced3cd8).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-6973]modify total stages/tasks on the a...

2015-04-17 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/5550

[SPARK-6973]modify total stages/tasks on the allJobsPage

Though `totalStages = allStages - skippedStages` is understandable, considering 
the problem in [SPARK-6973] I think `totalStages = allStages` is more 
reasonable. An item like `2/1 (2 failed) (1 skipped)` still shows the skipped 
count, so it remains understandable.
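The two counting rules can be compared with a tiny sketch (`progress_label` is a hypothetical rendering helper for illustration, not the actual UI code):

```python
def progress_label(completed, all_stages, failed, skipped, subtract_skipped):
    """Render an all-jobs-page progress cell under either counting rule."""
    total = all_stages - skipped if subtract_skipped else all_stages
    parts = ["%d/%d" % (completed, total)]
    if failed:
        parts.append("(%d failed)" % failed)
    if skipped:
        parts.append("(%d skipped)" % skipped)
    return " ".join(parts)

# Current rule subtracts skipped stages; the proposal keeps them in the total.
current = progress_label(2, 2, 2, 1, subtract_skipped=True)    # "2/1 (2 failed) (1 skipped)"
proposed = progress_label(2, 2, 2, 1, subtract_skipped=False)  # "2/2 (2 failed) (1 skipped)"
```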

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark allJobsPage

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5550.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5550


commit 47525c6138597a01a6cd2408b95b0fdd4387e0c5
Author: Xu Tingjun xuting...@huawei.com
Date:   2015-04-17T06:29:41Z

modify total stages/tasks on the allJobsPage







[GitHub] spark pull request: [SPARK-5352][GraphX] Add getPartitionStrategy ...

2015-04-17 Thread maropu
Github user maropu commented on the pull request:

https://github.com/apache/spark/pull/4138#issuecomment-93893277
  
Sorry, I closed that one by mistake, so I've re-created the PR:
https://github.com/apache/spark/pull/5549





[GitHub] spark pull request: [SPARK-6845] [MLlib] [PySpark] Add isTranposed...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5455#issuecomment-93911463
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30457/
Test FAILed.





[GitHub] spark pull request: [SPARK-6957] [SPARK-6958] [SQL] improve API co...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5544#issuecomment-93896816
  
  [Test build #30456 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30456/consoleFull)
 for   PR 5544 at commit 
[`4944058`](https://github.com/apache/spark/commit/49440583911ccef250e96761de40d3d1605f28c9).





[GitHub] spark pull request: [SPARK-5213] [SQL] Pluggable SQL Parser Suppor...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4015#issuecomment-93901819
  
  [Test build #30461 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30461/consoleFull)
 for   PR 4015 at commit 
[`8117e14`](https://github.com/apache/spark/commit/8117e1438c1e771a16418ee655a7b0dbb891d1c9).





[GitHub] spark pull request: [SPARK-5213] [SQL] Pluggable SQL Parser Suppor...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4015#issuecomment-93934762
  
  [Test build #30467 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30467/consoleFull)
 for   PR 4015 at commit 
[`eb026cd`](https://github.com/apache/spark/commit/eb026cd589f6b5a75544ae6130f19dfc7903ea66).





[GitHub] spark pull request: [SPARK-6418] Add simple per-stage visualizatio...

2015-04-17 Thread kayousterhout
Github user kayousterhout commented on the pull request:

https://github.com/apache/spark/pull/5547#issuecomment-93864988
  
cc @andrewor14 @pwendell 





[GitHub] spark pull request: [BUILD] Support building with SBT on encrypted...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5546#issuecomment-93887188
  
  [Test build #30452 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30452/consoleFull)
 for   PR 5546 at commit 
[`031c602`](https://github.com/apache/spark/commit/031c6025113c064b6fc0b5895b1830f223f6cf55).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-5623][GraphX] Replace an obsolete mapRe...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4402#issuecomment-93873653
  
  [Test build #30450 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30450/consoleFull)
 for   PR 4402 at commit 
[`5810ff2`](https://github.com/apache/spark/commit/5810ff27aa16c183c4cb142f5c75d49f1e755e50).





[GitHub] spark pull request: [SPARK-6350][Mesos] Make mesosExecutorCores co...

2015-04-17 Thread jongyoul
Github user jongyoul commented on the pull request:

https://github.com/apache/spark/pull/5063#issuecomment-93898567
  
@andrewor14 I've fixed the issues you raised. Please review and merge this.





[GitHub] spark pull request: [SPARK-3454] [WIP] separate json endpoints for...

2015-04-17 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/4435#issuecomment-93889612
  
Hey @squito it looks like the automated dependency checking isn't working 
so well for this PR. Can you do a diff and list all of the dependencies this is 
adding to or updating in Spark? Creating conflicts with user applications seems 
like a concern with this patch. Right now the patch shades the asm dependency, 
is there any reason to shade that one in particular and not others?





[GitHub] spark pull request: [SPARK-6807] [SparkR] Merge recent SparkR-pkg ...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5436#issuecomment-93918020
  
  [Test build #30465 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30465/consoleFull)
 for   PR 5436 at commit 
[`c2b09be`](https://github.com/apache/spark/commit/c2b09be4a465a85ad4d362e9def8139e6b16a05f).





[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5498#issuecomment-93909599
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30463/
Test FAILed.





[GitHub] spark pull request: [SPARK-5623][GraphX] Replace an obsolete mapRe...

2015-04-17 Thread maropu
Github user maropu commented on the pull request:

https://github.com/apache/spark/pull/4402#issuecomment-93892969
  
ok, fixed.





[GitHub] spark pull request: [SPARK-6845] [MLlib] [PySpark] Add isTranposed...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5455#issuecomment-93911425
  
  [Test build #30457 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30457/consoleFull)
 for   PR 5455 at commit 
[`151f3b6`](https://github.com/apache/spark/commit/151f3b67dbdd07462b00125c696d987a3cebb6ad).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-5352][GraphX] Add getPartitionStrategy ...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4138#issuecomment-93902330
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30455/
Test FAILed.





[GitHub] spark pull request: [SPARK-6973]modify total stages/tasks on the a...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5550#issuecomment-93913097
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-5338][MESOS] Add cluster mode support f...

2015-04-17 Thread tnachen
Github user tnachen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5144#discussion_r28570148
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
 ---
@@ -0,0 +1,614 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster.mesos
+
+import java.io.File
+import java.util.concurrent.locks.ReentrantLock
+import java.util.{Collections, Date, List => JList}
+
+import org.apache.mesos.Protos.Environment.Variable
+import org.apache.mesos.Protos.TaskStatus.Reason
+import org.apache.mesos.Protos.{TaskState => MesosTaskState, _}
+import org.apache.mesos.{Scheduler, SchedulerDriver}
+import org.apache.spark.deploy.mesos.MesosDriverDescription
+import org.apache.spark.deploy.rest.{CreateSubmissionResponse, 
KillSubmissionResponse, SubmissionStatusResponse}
+import org.apache.spark.metrics.MetricsSystem
+import org.apache.spark.util.Utils
+import org.apache.spark.{SecurityManager, SparkConf, SparkException, 
TaskState}
+
+import scala.collection.JavaConversions._
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+
+/**
+ * Tracks the current state of a Mesos Task that runs a Spark driver.
+ * @param submission Submitted driver description from
+ *   [[org.apache.spark.deploy.rest.mesos.MesosRestServer]]
+ * @param taskId Mesos TaskID generated for the task
+ * @param slaveId Slave ID that the task is assigned to
+ * @param taskState The last known task status update.
+ * @param startDate The date the task was launched
+ */
+private[spark] class MesosClusterTaskState(
+val submission: MesosDriverDescription,
+val taskId: TaskID,
+val slaveId: SlaveID,
+var taskState: Option[TaskStatus],
+var startDate: Date)
+  extends Serializable {
+
+  def copy(): MesosClusterTaskState = {
+new MesosClusterTaskState(
+  submission, taskId, slaveId, taskState, startDate)
+  }
+}
+
+/**
+ * Tracks the retry state of a driver, which includes the next time it 
should be scheduled
+ * and necessary information to do exponential backoff.
+ * This class is not thread-safe, and we expect the caller to handle 
synchronizing state.
+ * @param lastFailureStatus Last Task status when it failed.
+ * @param retries Number of times it has retried.
+ * @param nextRetry Next retry time to be scheduled.
+ * @param waitTime The amount of time driver is scheduled to wait until 
next retry.
+ */
+private[spark] class RetryState(
+val lastFailureStatus: TaskStatus,
+val retries: Int,
+val nextRetry: Date,
+val waitTime: Int) extends Serializable {
+  def copy(): RetryState =
+new RetryState(lastFailureStatus, retries, nextRetry, waitTime)
+}
+
+/**
+ * The full state of the cluster scheduler, currently being used for 
displaying
+ * information on the UI.
+ * @param frameworkId Mesos Framework id for the cluster scheduler.
+ * @param masterUrl The Mesos master url
+ * @param queuedDrivers All drivers queued to be launched
+ * @param launchedDrivers All launched or running drivers
+ * @param finishedDrivers All terminated drivers
+ * @param retryList All drivers pending to be retried
+ */
+private[spark] class MesosClusterSchedulerState(
+val frameworkId: String,
+val masterUrl: Option[String],
+val queuedDrivers: Iterable[MesosDriverDescription],
+val launchedDrivers: Iterable[MesosClusterTaskState],
+val finishedDrivers: Iterable[MesosClusterTaskState],
+val retryList: Iterable[MesosDriverDescription])
+
+/**
+ * A Mesos scheduler that is responsible for launching submitted Spark 
drivers in cluster mode
+ * as Mesos tasks in a Mesos cluster.
+ * All drivers are launched asynchronously by the framework, which will 
eventually be launched
+ * by one of the 

[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-93899836
  
  [Test build #30459 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30459/consoleFull)
 for   PR 5538 at commit 
[`fad020e`](https://github.com/apache/spark/commit/fad020ebc9a1bd1a98a8c758d770d947205e89b1).





[GitHub] spark pull request: [SPARK-6604][PySpark]Specify ip of python serv...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5256#issuecomment-93917056
  
  [Test build #688 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/688/consoleFull)
 for   PR 5256 at commit 
[`7b3c633`](https://github.com/apache/spark/commit/7b3c6338db700ad6ba52b53d163dae69db6bd326).





[GitHub] spark pull request: [SPARK-6975][Yarn] Fix argument validation err...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5551#issuecomment-93928521
  
  [Test build #30466 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30466/consoleFull)
 for   PR 5551 at commit 
[`77bdcbd`](https://github.com/apache/spark/commit/77bdcbdc00522e76f9394c68d769f35c15af09a6).





[GitHub] spark pull request: [SPARK-6869][PySpark] Add pyspark archives pat...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5478#issuecomment-93930774
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30462/
Test FAILed.





[GitHub] spark pull request: [SPARK-6953] [PySpark] speed up python tests

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5427#issuecomment-93930521
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30458/
Test PASSed.





[GitHub] spark pull request: [Project Infra] SPARK-1684: Merge script shoul...

2015-04-17 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/5149#issuecomment-93891216
  
Hey @texasmichelle thanks for contributing this. It slipped off my radar but 
it will be nice to get something like this in. One thing though: even though I 
originally intended the format to be "SPARK XXX", in practice pretty much 
every contributor now puts brackets around that part /cc @srowen. So it has now 
sort of become the de-facto standard!

We should probably update this page to simply tell people to put brackets:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

So I think what we really want now is to coerce the presence of brackets 
rather than remove them! If you look at some recent titles, a few of them have 
this problem.
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog

Sorry for some delay in reviewing this, I can address any updates promptly 
in the next week. Maybe we can start with that pretty simple rule, and then we 
can expand in subsequent patches to do fancier stuff.

The broad organization here looks good.
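
A rule like the one described above could be sketched in the merge script's language (Python). The function name and regex below are purely illustrative, not Spark's actual merge-script implementation:

```python
import re

def coerce_brackets(title):
    """Illustrative sketch: if a PR title starts with a bare 'SPARK-NNNN'
    prefix, wrap it in brackets; titles that already use brackets (or have
    no ticket at all) pass through unchanged."""
    m = re.match(r'^(SPARK-\d+)\s+(.*)$', title)
    if m:
        return "[%s] %s" % (m.group(1), m.group(2))
    return title
```

For example, `coerce_brackets("SPARK-1684 Merge script should standardize titles")` yields `"[SPARK-1684] Merge script should standardize titles"`, while an already-bracketed title is left alone.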






[GitHub] spark pull request: [SPARK-6953] [PySpark] speed up python tests

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5427#issuecomment-93930478
  
  [Test build #30458 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30458/consoleFull)
 for   PR 5427 at commit 
[`2654bfd`](https://github.com/apache/spark/commit/2654bfda79da9d12c897bc144da2b2137a56c68c).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-6869][PySpark] Add pyspark archives pat...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5478#issuecomment-93930705
  
  [Test build #30462 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30462/consoleFull)
 for   PR 5478 at commit 
[`547fd95`](https://github.com/apache/spark/commit/547fd957ba224c86cf828890562b2eafde2b8ecb).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-6957] [SPARK-6958] [SQL] improve API co...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5544#issuecomment-93931726
  
  [Test build #30456 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30456/consoleFull)
 for   PR 5544 at commit 
[`4944058`](https://github.com/apache/spark/commit/49440583911ccef250e96761de40d3d1605f28c9).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-6957] [SPARK-6958] [SQL] improve API co...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5544#issuecomment-93931780
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30456/
Test PASSed.





[GitHub] spark pull request: [SPARK-6418] Add simple per-stage visualizatio...

2015-04-17 Thread kayousterhout
Github user kayousterhout commented on the pull request:

https://github.com/apache/spark/pull/5547#issuecomment-93865113
  
Jenkins, this is ok to test





[GitHub] spark pull request: [SPARK-5338][MESOS] Add cluster mode support f...

2015-04-17 Thread tnachen
Github user tnachen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5144#discussion_r28573613
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
 ---
@@ -0,0 +1,614 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster.mesos
+
+import java.io.File
+import java.util.concurrent.locks.ReentrantLock
+import java.util.{Collections, Date, List => JList}
+
+import org.apache.mesos.Protos.Environment.Variable
+import org.apache.mesos.Protos.TaskStatus.Reason
+import org.apache.mesos.Protos.{TaskState => MesosTaskState, _}
+import org.apache.mesos.{Scheduler, SchedulerDriver}
+import org.apache.spark.deploy.mesos.MesosDriverDescription
+import org.apache.spark.deploy.rest.{CreateSubmissionResponse, 
KillSubmissionResponse, SubmissionStatusResponse}
+import org.apache.spark.metrics.MetricsSystem
+import org.apache.spark.util.Utils
+import org.apache.spark.{SecurityManager, SparkConf, SparkException, 
TaskState}
+
+import scala.collection.JavaConversions._
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+
+/**
+ * Tracks the current state of a Mesos Task that runs a Spark driver.
+ * @param submission Submitted driver description from
+ *   [[org.apache.spark.deploy.rest.mesos.MesosRestServer]]
+ * @param taskId Mesos TaskID generated for the task
+ * @param slaveId Slave ID that the task is assigned to
+ * @param taskState The last known task status update.
+ * @param startDate The date the task was launched
+ */
+private[spark] class MesosClusterTaskState(
+val submission: MesosDriverDescription,
+val taskId: TaskID,
+val slaveId: SlaveID,
+var taskState: Option[TaskStatus],
+var startDate: Date)
+  extends Serializable {
+
+  def copy(): MesosClusterTaskState = {
+new MesosClusterTaskState(
+  submission, taskId, slaveId, taskState, startDate)
+  }
+}
+
+/**
+ * Tracks the retry state of a driver, which includes the next time it 
should be scheduled
+ * and necessary information to do exponential backoff.
+ * This class is not thread-safe, and we expect the caller to handle 
synchronizing state.
+ * @param lastFailureStatus Last Task status when it failed.
+ * @param retries Number of times it has retried.
+ * @param nextRetry Next retry time to be scheduled.
+ * @param waitTime The amount of time driver is scheduled to wait until 
next retry.
+ */
+private[spark] class RetryState(
+val lastFailureStatus: TaskStatus,
+val retries: Int,
+val nextRetry: Date,
+val waitTime: Int) extends Serializable {
+  def copy(): RetryState =
+new RetryState(lastFailureStatus, retries, nextRetry, waitTime)
+}
+
+/**
+ * The full state of the cluster scheduler, currently being used for 
displaying
+ * information on the UI.
+ * @param frameworkId Mesos Framework id for the cluster scheduler.
+ * @param masterUrl The Mesos master url
+ * @param queuedDrivers All drivers queued to be launched
+ * @param launchedDrivers All launched or running drivers
+ * @param finishedDrivers All terminated drivers
+ * @param retryList All drivers pending to be retried
+ */
+private[spark] class MesosClusterSchedulerState(
+val frameworkId: String,
+val masterUrl: Option[String],
+val queuedDrivers: Iterable[MesosDriverDescription],
+val launchedDrivers: Iterable[MesosClusterTaskState],
+val finishedDrivers: Iterable[MesosClusterTaskState],
+val retryList: Iterable[MesosDriverDescription])
+
+/**
+ * A Mesos scheduler that is responsible for launching submitted Spark 
drivers in cluster mode
+ * as Mesos tasks in a Mesos cluster.
+ * All drivers are launched asynchronously by the framework, which will 
eventually be launched
+ * by one of the 

[GitHub] spark pull request: [SPARK-5352][GraphX] Add getPartitionStrategy ...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4138#issuecomment-93902325
  
**[Test build #30455 timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30455/consoleFull)**
 for PR 4138 at commit 
[`f72c058`](https://github.com/apache/spark/commit/f72c05811d89c08fe9f189e9866a1b7bce19d554)
 after a configured wait of `150m`.





[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5498#issuecomment-93909418
  
  [Test build #30463 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30463/consoleFull)
 for   PR 5498 at commit 
[`22d65ca`](https://github.com/apache/spark/commit/22d65cac9bb22a9cdda5019042acca0c66e46270).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-6807] [SparkR] Merge recent SparkR-pkg ...

2015-04-17 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/5436#issuecomment-93917450
  
Jenkins, retest this please





[GitHub] spark pull request: [SPARK-5213] [SQL] Pluggable SQL Parser Suppor...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4015#issuecomment-93922706
  
  [Test build #30461 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30461/consoleFull)
 for   PR 4015 at commit 
[`8117e14`](https://github.com/apache/spark/commit/8117e1438c1e771a16418ee655a7b0dbb891d1c9).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `abstract class Dialect `

 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-5563][mllib] online lda initial checkin

2015-04-17 Thread hhbyyh
Github user hhbyyh commented on the pull request:

https://github.com/apache/spark/pull/4419#issuecomment-93939245
  
@jkbradley Providing an update on the correctness test. 
I have tested the current PR against https://github.com/Blei-Lab/onlineldavb and 
the results are identical. I've uploaded the results and code to 
https://github.com/hhbyyh/LDACrossValidation. 

I made some changes to get rid of randomness, such as initializing the matrix 
with fixed numbers from a file and replacing random batch sampling with an even split.





[GitHub] spark pull request: [SPARK-6975][Yarn] Fix argument validation err...

2015-04-17 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/5551#discussion_r28577855
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala ---
@@ -103,9 +103,14 @@ private[spark] class ClientArguments(args: 
Array[String], sparkConf: SparkConf)
* This is intended to be called only after the provided arguments have 
been parsed.
*/
   private def validateArgs(): Unit = {
-    if (numExecutors <= 0) {
+    if (numExecutors < 0 || (!isDynamicAllocationEnabled && numExecutors == 0)) {
       throw new IllegalArgumentException(
-        "You must specify at least 1 executor!\n" + getUsageMessage())
+        s"""
+           |Number of executors $numExecutors is not legal.
+           |If dynamic allocation is enable, number of executors should at least be 0.
--- End diff --

OK, I will change the statement.





[GitHub] spark pull request: [SPARK-6973]modify total stages/tasks on the a...

2015-04-17 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5550#issuecomment-93946383
  
Yeah, that result will occur. But considering the bug described in the 
JIRA, I think this approach is more reasonable.





[GitHub] spark pull request: [SPARK-6955][NETWORK]Do not let Yarn Shuffle S...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5537#issuecomment-93949870
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30473/
Test FAILed.





[GitHub] spark pull request: [SPARK-6955][NETWORK]Do not let Yarn Shuffle S...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5537#issuecomment-93949861
  
  [Test build #30473 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30473/consoleFull)
 for   PR 5537 at commit 
[`962770c`](https://github.com/apache/spark/commit/962770c914a1a1928dccbf14a26df735ba4f77f3).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-6635][SQL] DataFrame.withColumn should ...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5541#issuecomment-93956221
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30469/
Test PASSed.





[GitHub] spark pull request: [SPARK-6635][SQL] DataFrame.withColumn should ...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5541#issuecomment-93956208
  
  [Test build #30469 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30469/consoleFull)
 for   PR 5541 at commit 
[`b539c7b`](https://github.com/apache/spark/commit/b539c7b7aa55c095163d06bac525d1bb90c0b734).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-6046] [core] Reorganize deprecated conf...

2015-04-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/5514





[GitHub] spark pull request: [SPARK-5817] [SQL] Fix bug of udtf with column...

2015-04-17 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4602#discussion_r28575468
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
 ---
@@ -107,6 +113,12 @@ trait CheckAnalysis {
         failAnalysis(
           s"unresolved operator ${operator.simpleString}")
 
+      case p @ Project(exprs, _) if containsMultipleGenerators(exprs) =>
+        failAnalysis(
+          s"""Only a single table generating function is allowed in a SELECT clause, found:
+             | ${exprs.map(_.prettyString).mkString(",")}""".stripMargin)
--- End diff --

Yea, I added in the unit test. see `HiveQuerySuite.scala`.
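The rule under review can be illustrated with a self-contained sketch. The `Expression` hierarchy below is a simplified stand-in for Catalyst's, not the real API; only the shape of the check mirrors the diff above:

```scala
// Simplified stand-ins for Catalyst expressions (illustrative only).
sealed trait Expression { def prettyString: String }
case class Generator(prettyString: String) extends Expression
case class Column(prettyString: String) extends Expression

// A projection list is invalid once it contains more than one generator.
def containsMultipleGenerators(exprs: Seq[Expression]): Boolean =
  exprs.count(_.isInstanceOf[Generator]) > 1

// Mirrors the failAnalysis call in the diff above.
def checkProject(exprs: Seq[Expression]): Unit =
  if (containsMultipleGenerators(exprs)) {
    throw new IllegalArgumentException(
      s"""Only a single table generating function is allowed in a SELECT clause, found:
         | ${exprs.map(_.prettyString).mkString(",")}""".stripMargin)
  }
```

Under this rule, something like `SELECT explode(a), explode(b) FROM t` fails analysis, while `SELECT explode(a), b FROM t` passes.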





[GitHub] spark pull request: [SPARK-5817] [SQL] Fix bug of udtf with column...

2015-04-17 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4602#discussion_r28575507
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -473,10 +473,47 @@ class Analyzer(
*/
   object ImplicitGenerate extends Rule[LogicalPlan] {
 def apply(plan: LogicalPlan): LogicalPlan = plan transform {
-      case Project(Seq(Alias(g: Generator, _)), child) =>
-        Generate(g, join = false, outer = false, None, child)
+      case Project(Seq(Alias(g: Generator, name)), child) =>
+        Generate(g, join = false, outer = false, child, qualifier = None, name :: Nil, Nil)
+      case Project(Seq(MultiAlias(g: Generator, names)), child) =>
+        Generate(g, join = false, outer = false, child, qualifier = None, names, Nil)
 }
   }
+
+  object ResolveGenerate extends Rule[LogicalPlan] {
+// Construct the output attributes for the generator,
+// The output attribute names can be either specified or
+// auto generated.
+private def makeGeneratorOutput(
+generator: Generator,
+attributeNames: Seq[String],
+qualifier: Option[String]): Array[Attribute] = {
+  val elementTypes = generator.elementTypes
+
+  val raw = if (attributeNames.size == elementTypes.size) {
--- End diff --

Hive does exactly the same as you listed.





[GitHub] spark pull request: [SPARK-6635][SQL] DataFrame.withColumn should ...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5541#issuecomment-93939754
  
  [Test build #30469 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30469/consoleFull)
 for   PR 5541 at commit 
[`b539c7b`](https://github.com/apache/spark/commit/b539c7b7aa55c095163d06bac525d1bb90c0b734).





[GitHub] spark pull request: [SPARK-5338][MESOS] Add cluster mode support f...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5144#issuecomment-93942077
  
  [Test build #30470 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30470/consoleFull)
 for   PR 5144 at commit 
[`61e5dba`](https://github.com/apache/spark/commit/61e5dbabc0e4ef1c1bd80c838991e15bc1e40f4e).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-5338][MESOS] Add cluster mode support f...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5144#issuecomment-93942082
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30470/
Test FAILed.





[GitHub] spark pull request: [SPARK-6975][Yarn] Fix argument validation err...

2015-04-17 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5551#discussion_r28577520
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala ---
@@ -103,9 +103,14 @@ private[spark] class ClientArguments(args: 
Array[String], sparkConf: SparkConf)
* This is intended to be called only after the provided arguments have 
been parsed.
*/
   private def validateArgs(): Unit = {
-    if (numExecutors <= 0) {
+    if (numExecutors < 0 || (!isDynamicAllocationEnabled && numExecutors == 0)) {
       throw new IllegalArgumentException(
-        "You must specify at least 1 executor!\n" + getUsageMessage())
+        s"""
+           |Number of executors $numExecutors is not legal.
+           |If dynamic allocation is enable, number of executors should at least be 0.
--- End diff --

enable -> enabled. I think this is simpler to state as "Number of 
executors was $numExecutors, but must be at least 1 (or 0 if dynamic executor 
allocation is enabled)."
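The condition being reviewed can be sketched as a small standalone check. The method name and the message wording below follow the review suggestion and are illustrative, not the merged `ClientArguments` code:

```scala
// Illustrative sketch of the validation rule discussed above: at least one
// executor is required normally, but zero is acceptable when dynamic
// executor allocation is enabled. Negative counts are always rejected.
def validateNumExecutors(numExecutors: Int, isDynamicAllocationEnabled: Boolean): Unit = {
  if (numExecutors < 0 || (!isDynamicAllocationEnabled && numExecutors == 0)) {
    throw new IllegalArgumentException(
      s"Number of executors was $numExecutors, but must be at least 1" +
        " (or 0 if dynamic executor allocation is enabled).")
  }
}
```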





[GitHub] spark pull request: [SPARK-6955][NETWORK]Do not let Yarn Shuffle S...

2015-04-17 Thread SaintBacchus
Github user SaintBacchus commented on the pull request:

https://github.com/apache/spark/pull/5537#issuecomment-93947685
  
@andrewor14 `TransportServer#bindRightPort` will be used by the `Netty` network 
layer; in that case, having a retry mechanism is the better approach.
@vanzin I have cloned the configuration, updated the javadoc, and also set 
no retry in `StandaloneWorkerShuffleService`. Please take a look.
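The bind-with-retry idea can be sketched with plain `java.net` sockets. `bindWithRetry` and its signature are hypothetical stand-ins for the actual Netty-based `TransportServer` logic:

```scala
import java.net.{BindException, InetSocketAddress, ServerSocket}
import scala.util.{Success, Try}

// Try startPort, startPort + 1, ... until a bind succeeds or retries run out.
// maxRetries = 0 means "no retry": bind the given port or fail immediately.
def bindWithRetry(startPort: Int, maxRetries: Int): ServerSocket = {
  val attempts = (0 to maxRetries).iterator.map { offset =>
    Try {
      val socket = new ServerSocket()
      try {
        socket.bind(new InetSocketAddress(startPort + offset))
        socket
      } catch {
        case e: Throwable => socket.close(); throw e  // don't leak failed sockets
      }
    }
  }
  attempts.collectFirst { case Success(s) => s }.getOrElse(
    throw new BindException(
      s"Could not bind to any port in [$startPort, ${startPort + maxRetries}]"))
}
```

With `maxRetries = 0`, a service such as the standalone worker shuffle service fails fast on its configured port instead of silently drifting to a neighboring one.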





[GitHub] spark pull request: [SPARK-6807] [SparkR] Merge recent SparkR-pkg ...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5436#issuecomment-93949224
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30465/
Test PASSed.





[GitHub] spark pull request: [SPARK-6604][PySpark]Specify ip of python serv...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5256#issuecomment-93949537
  
  [Test build #30474 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30474/consoleFull)
 for   PR 5256 at commit 
[`7b3c633`](https://github.com/apache/spark/commit/7b3c6338db700ad6ba52b53d163dae69db6bd326).





[GitHub] spark pull request: [SQL] There are three tests of sql are failed ...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5552#issuecomment-93949334
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-6807] [SparkR] Merge recent SparkR-pkg ...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5436#issuecomment-93949211
  
  [Test build #30465 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30465/consoleFull)
 for   PR 5436 at commit 
[`c2b09be`](https://github.com/apache/spark/commit/c2b09be4a465a85ad4d362e9def8139e6b16a05f).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-6976][SQL] drop table if exists src p...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5553#issuecomment-93950877
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-6976][SQL] drop table if exists src p...

2015-04-17 Thread DoingDone9
GitHub user DoingDone9 opened a pull request:

https://github.com/apache/spark/pull/5553

[SPARK-6976][SQL] "drop table if exists src" prints ERROR info that should 
not be printed when src does not exist.

If table "src" does not exist and you run the SQL "drop table if exists src", then some 
ERROR info will be printed, like this:
```
15/04/17 17:09:53 ERROR Hive: NoSuchObjectException(message:default.src 
table not found)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
at $Proxy10.get_table(Unknown Source)
```
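The fix amounts to asking the catalog whether the table exists before issuing the metastore drop, so the missing-table case never reaches Hive. A minimal sketch, where the `Catalog` trait is hypothetical and stands in for the actual `HiveMetastoreCatalog` API:

```scala
// Hypothetical catalog interface, standing in for the Hive metastore client.
trait Catalog {
  def tableExists(name: String): Boolean
  def dropTable(name: String): Unit  // logs an ERROR if the table is missing
}

// DROP TABLE IF EXISTS: only call dropTable when the table is present,
// so a missing table produces no spurious NoSuchObjectException log.
def dropTableIfExists(catalog: Catalog, name: String): Unit =
  if (catalog.tableExists(name)) catalog.dropTable(name)
```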

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/DoingDone9/spark drop_table_exists

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5553.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5553


commit c3f046f8de7c418d4aa7e74afea9968a8baf9231
Author: DoingDone9 799203...@qq.com
Date:   2015-03-02T02:11:18Z

Merge pull request #1 from apache/master

merge lastest spark

commit cb1852d14f62adbd194b1edda4ec639ba942a8ba
Author: DoingDone9 799203...@qq.com
Date:   2015-03-05T07:05:10Z

Merge pull request #2 from apache/master

merge lastest spark

commit c87e8b6d8cb433376a7d14778915006c31f6c01c
Author: DoingDone9 799203...@qq.com
Date:   2015-03-10T07:46:12Z

Merge pull request #3 from apache/master

merge lastest spark

commit 161cae3a29951d793ce721f9904888bd9529de72
Author: DoingDone9 799203...@qq.com
Date:   2015-03-12T06:46:28Z

Merge pull request #4 from apache/master

merge lastest spark

commit 98b134f39ca57f11a5b761c7b9e5f8a7477bd069
Author: DoingDone9 799203...@qq.com
Date:   2015-03-19T09:00:07Z

Merge pull request #5 from apache/master

merge lastest spark

commit d00303b7af9436b9bd6d6d27d411a5c8a2e2294d
Author: DoingDone9 799203...@qq.com
Date:   2015-03-24T08:43:44Z

Merge pull request #6 from apache/master

merge lastest spark

commit 802261c043f56bd5ebe9e46b15e33cdc7c212176
Author: DoingDone9 799203...@qq.com
Date:   2015-03-26T02:21:24Z

Merge pull request #7 from apache/master

merge lastest spark

commit 34b1a9a8a30f689b41fd52b8a10c08666c2ff2b5
Author: Zhongshuai Pei 799203...@qq.com
Date:   2015-04-08T07:55:24Z

Merge pull request #8 from apache/master

merge lastest spark

commit f61210c03f693a266969e06c52c23ccd1bfe3e1b
Author: Zhongshuai Pei 799203...@qq.com
Date:   2015-04-17T09:10:48Z

Merge pull request #9 from apache/master

merge lastest spark

commit c783d02f5fdc44b894d4e8010d3c26c4cde7850c
Author: Zhongshuai Pei 799203...@qq.com
Date:   2015-04-17T09:13:44Z

Update HiveMetastoreCatalog.scala







[GitHub] spark pull request: [SPARK-6065] [MLlib] Optimize word2vec.findSyn...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5467#issuecomment-93951771
  
  [Test build #30477 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30477/consoleFull)
 for   PR 5467 at commit 
[`da1642d`](https://github.com/apache/spark/commit/da1642deb67dde65bb55b08ae47bd5ce0d29d545).





[GitHub] spark pull request: [SPARK-5338][MESOS] Add cluster mode support f...

2015-04-17 Thread tnachen
Github user tnachen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5144#discussion_r28575245
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
 ---
@@ -0,0 +1,614 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster.mesos
+
+import java.io.File
+import java.util.concurrent.locks.ReentrantLock
+import java.util.{Collections, Date, List => JList}
+
+import org.apache.mesos.Protos.Environment.Variable
+import org.apache.mesos.Protos.TaskStatus.Reason
+import org.apache.mesos.Protos.{TaskState => MesosTaskState, _}
+import org.apache.mesos.{Scheduler, SchedulerDriver}
+import org.apache.spark.deploy.mesos.MesosDriverDescription
+import org.apache.spark.deploy.rest.{CreateSubmissionResponse, 
KillSubmissionResponse, SubmissionStatusResponse}
+import org.apache.spark.metrics.MetricsSystem
+import org.apache.spark.util.Utils
+import org.apache.spark.{SecurityManager, SparkConf, SparkException, 
TaskState}
+
+import scala.collection.JavaConversions._
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+
+/**
+ * Tracks the current state of a Mesos Task that runs a Spark driver.
+ * @param submission Submitted driver description from
+ *   [[org.apache.spark.deploy.rest.mesos.MesosRestServer]]
+ * @param taskId Mesos TaskID generated for the task
+ * @param slaveId Slave ID that the task is assigned to
+ * @param taskState The last known task status update.
+ * @param startDate The date the task was launched
+ */
+private[spark] class MesosClusterTaskState(
+val submission: MesosDriverDescription,
+val taskId: TaskID,
+val slaveId: SlaveID,
+var taskState: Option[TaskStatus],
+var startDate: Date)
+  extends Serializable {
+
+  def copy(): MesosClusterTaskState = {
+new MesosClusterTaskState(
+  submission, taskId, slaveId, taskState, startDate)
+  }
+}
+
+/**
+ * Tracks the retry state of a driver, which includes the next time it 
should be scheduled
+ * and necessary information to do exponential backoff.
+ * This class is not thread-safe, and we expect the caller to handle 
synchronizing state.
+ * @param lastFailureStatus Last Task status when it failed.
+ * @param retries Number of times it has retried.
+ * @param nextRetry Next retry time to be scheduled.
+ * @param waitTime The amount of time driver is scheduled to wait until 
next retry.
+ */
+private[spark] class RetryState(
+val lastFailureStatus: TaskStatus,
+val retries: Int,
+val nextRetry: Date,
+val waitTime: Int) extends Serializable {
+  def copy(): RetryState =
+new RetryState(lastFailureStatus, retries, nextRetry, waitTime)
+}
+
+/**
+ * The full state of the cluster scheduler, currently being used for 
displaying
+ * information on the UI.
+ * @param frameworkId Mesos Framework id for the cluster scheduler.
+ * @param masterUrl The Mesos master url
+ * @param queuedDrivers All drivers queued to be launched
+ * @param launchedDrivers All launched or running drivers
+ * @param finishedDrivers All terminated drivers
+ * @param retryList All drivers pending to be retried
+ */
+private[spark] class MesosClusterSchedulerState(
+val frameworkId: String,
+val masterUrl: Option[String],
+val queuedDrivers: Iterable[MesosDriverDescription],
+val launchedDrivers: Iterable[MesosClusterTaskState],
+val finishedDrivers: Iterable[MesosClusterTaskState],
+val retryList: Iterable[MesosDriverDescription])
+
+/**
+ * A Mesos scheduler that is responsible for launching submitted Spark 
drivers in cluster mode
+ * as Mesos tasks in a Mesos cluster.
+ * All drivers are launched asynchronously by the framework, which will 
eventually be launched
+ * by one of the 

[GitHub] spark pull request: [BUILD] Support building with SBT on encrypted...

2015-04-17 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/5546#issuecomment-93937847
  
Hm, doesn't the class file name affect how it's found? That's how the 
classloader finds the class. I also don't know of a specific instance where 
this created a problem, but the fact that it needs to be set means something 
about the output will change. What was the additional bit of info you mention 
that says this is safe? If it really is verifiably never going to change the 
linking result, then it doesn't matter, but it seems like it would by changing 
file names.

Is building on an encrypted file system common? Yes, the SBT build would 
only be for developers, though the inconsistency is a little worrying, so I 
think it best to do it in both places or neither.





[GitHub] spark pull request: [SPARK-6065] [MLlib] Optimize word2vec.findSyn...

2015-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5467#issuecomment-93942492
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30464/
Test PASSed.





[GitHub] spark pull request: [SPARK-6065] [MLlib] Optimize word2vec.findSyn...

2015-04-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5467#issuecomment-93942477
  
  [Test build #30464 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30464/consoleFull)
 for   PR 5467 at commit 
[`64575b0`](https://github.com/apache/spark/commit/64575b0282b350facc93340fbf653b38b0121b1a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.




