[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3222#issuecomment-65889555
  
  [Test build #24209 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24209/consoleFull)
 for   PR 3222 at commit 
[`5b2ef49`](https://github.com/apache/spark/commit/5b2ef49ab42a0cdcb309495dd47f8f436b139ed7).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3623][GraphX] GraphX should support the...

2014-12-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2631#issuecomment-65889881
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24208/





[GitHub] spark pull request: [SPARK-3623][GraphX] GraphX should support the...

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2631#issuecomment-65889877
  
  [Test build #24208 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24208/consoleFull)
 for   PR 2631 at commit 
[`a70c500`](https://github.com/apache/spark/commit/a70c5001977b7ab0a10716f69190ed0a6a797d5d).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-3623][GraphX] GraphX should support the...

2014-12-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2631





[GitHub] spark pull request: [SPARK-3623][GraphX] GraphX should support the...

2014-12-06 Thread ankurdave
Github user ankurdave commented on the pull request:

https://github.com/apache/spark/pull/2631#issuecomment-65890469
  
Thanks! Merged into master and branch-1.2.





[GitHub] spark pull request: [SPARK-4620] Add unpersist in Graph and GraphI...

2014-12-06 Thread ankurdave
Github user ankurdave commented on the pull request:

https://github.com/apache/spark/pull/3476#issuecomment-65890609
  
ok to test





[GitHub] spark pull request: [SPARK-2374] Path pattern matching for GraphX

2014-12-06 Thread ankurdave
GitHub user ankurdave reopened a pull request:

https://github.com/apache/spark/pull/1307

[SPARK-2374] Path pattern matching for GraphX

Based on a 
[request](http://apache-spark-user-list.1001560.n3.nabble.com/Graphx-traversal-and-merge-interesting-edges-td8788.html)
 on the mailing list, I wrote a simple implementation of path pattern matching 
for GraphX. It accepts patterns in the form of sequences of edge matchers, then 
iteratively propagates partial pattern matches to find all matching paths in 
the graph.

Though this is only an initial implementation and there are many 
opportunities for optimization, having this algorithm in the library expands 
the scope of GraphX beyond ML-like algorithms into graph traversal.
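
To make the propagation idea concrete, here is a minimal sketch on plain Scala collections rather than GraphX. It is not the code in this pull request: `Edge`, `EdgeMatcher`, and `findPaths` are hypothetical names, and a pattern step is assumed to be a simple predicate on the edge attribute. Each fold step extends every partial match by one edge that leaves the partial match's end vertex and satisfies the next matcher.

```scala
// Hypothetical, simplified stand-ins; not the PR's GraphX implementation.
object PathPatternSketch {
  type VertexId = Long

  final case class Edge[ED](srcId: VertexId, dstId: VertexId, attr: ED)

  // Assumption: one pattern step is a predicate on the edge attribute.
  type EdgeMatcher[ED] = ED => Boolean

  /** Returns every path whose i-th edge satisfies the i-th matcher. */
  def findPaths[ED](
      edges: Seq[Edge[ED]],
      pattern: Seq[EdgeMatcher[ED]]): Seq[List[Edge[ED]]] = {
    pattern match {
      case Seq() => Seq.empty
      case first +: rest =>
        // Seed: each edge matching the first step is a partial match of length 1.
        // Partial matches are kept reversed, so the most recent edge is the head.
        val seeds: Seq[List[Edge[ED]]] =
          edges.filter(e => first(e.attr)).map(List(_))
        // Propagate: extend each partial match by one edge per remaining step.
        val full = rest.foldLeft(seeds) { (partials, matcher) =>
          for {
            partial <- partials
            next <- edges
            if next.srcId == partial.head.dstId && matcher(next.attr)
          } yield next :: partial
        }
        full.map(_.reverse)
    }
  }
}
```

A distributed version would exchange partial matches along edges instead of scanning an in-memory edge list, but the shape of the iteration is the same.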

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ankurdave/spark PatternMatching

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1307.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1307


commit bc546a22066365122b2cbf4402946b88ee81de7b
Author: Ankur Dave ankurd...@gmail.com
Date:   2014-07-05T09:46:34Z

Add graphx.lib.PatternMatching

commit 9332a4927a065e2e5217a4256a5bc12a127ca97b
Author: Ankur Dave ankurd...@gmail.com
Date:   2014-07-05T10:16:20Z

Fix comment error







[GitHub] spark pull request: [SPARK-4620] Add unpersist in Graph and GraphI...

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3476#issuecomment-65890689
  
  [Test build #24210 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24210/consoleFull)
 for   PR 3476 at commit 
[`77a006a`](https://github.com/apache/spark/commit/77a006a77889a2f847dc0a6ad2e8e15e329b9137).
 * This patch **does not merge cleanly**.





[GitHub] spark pull request: [SPARK-2374] Path pattern matching for GraphX

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1307#issuecomment-65890780
  
  [Test build #24211 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24211/consoleFull)
 for   PR 1307 at commit 
[`9332a49`](https://github.com/apache/spark/commit/9332a4927a065e2e5217a4256a5bc12a127ca97b).
 * This patch merges cleanly.





[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3222#issuecomment-65891033
  
  [Test build #24209 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24209/consoleFull)
 for   PR 3222 at commit 
[`5b2ef49`](https://github.com/apache/spark/commit/5b2ef49ab42a0cdcb309495dd47f8f436b139ed7).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class DBN(val stackedRBM: StackedRBM, val nn: MLP)`
  * `class MLP(`
  * `class RBM(`
  * `class StackedRBM(val innerRBMs: Array[RBM])`
  * `case class MinstItem(label: Int, data: Array[Int]) `
  * `class MinstDatasetReader(labelsFile: String, imagesFile: String)`






[GitHub] spark pull request: [WIP][SPARK-4251][SPARK-2352][MLLIB]Add RBM, A...

2014-12-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3222#issuecomment-65891034
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24209/





[GitHub] spark pull request: [SPARK-4721][CORE] Improve logic while first t...

2014-12-06 Thread suyanNone
Github user suyanNone commented on the pull request:

https://github.com/apache/spark/pull/3582#issuecomment-65891778
  
Sorry for my unclear comments and English.

In summary:
1. Put threads run one at a time until one of them succeeds.
2. Multiple doGetLocal threads, and only one dropFromMemory thread, wait a 
single time, whether the put succeeds or fails. If the wait in doGetLocal 
fails, it returns None. If the wait in dropFromMemory fails, it also returns 
None.

There are three places that call info.waitForReady():
1. doGetLocal
2. dropFromMemory
3. doPut

Suppose many threads try to put the same block:
For 1 (doGetLocal), I think it should wait just once (Wait1Condition, now 
renamed OtherCondition), whether the put succeeds or fails.
For 2 (dropFromMemory), in practice dropFromMemory should never be called on a 
block that is not ready. However, the current code has an info.waitForReady() 
call in dropFromMemory, so for compatibility let's wait only once 
(Wait1Condition) for the block put to succeed or fail. Also, if we find a 
thread running dropFromMemory, we should cancel all put threads.
For 3 (doPut), the put threads run one by one until one succeeds or until a 
thread wants to drop the block from memory, as described in 2. A put may fail 
many times, hence WaitNCondition (now named PutCondition).

All I want from WaitType (now renamed BlockWaitCondition) is the convenience 
of an enum for calling the method, plus a variable that records the number of 
threads waiting for that block to finish its put. Each block object has its 
own wait count, so I extend Enumeration.




[GitHub] spark pull request: [SPARK-4699][SQL] make caseSensitive configura...

2014-12-06 Thread jackylk
Github user jackylk commented on the pull request:

https://github.com/apache/spark/pull/3558#issuecomment-65892057
  
If we go with the second approach, it will create a cyclic dependency between 
the spark-catalyst and spark-sql sub-projects, because SQLConf and SQLContext 
are in spark-sql while Analyzer is in spark-catalyst.
I think the only drawback of the current approach is that caseSensitive can be 
set only while initializing SQLContext, not afterwards. If a client wants to 
use a case-insensitive analyzer, they need to create a new SQLContext, which I 
think is probably OK.
I tested this approach locally and it passes SQLQuerySuite. I do not know why 
the test case failed, as I cannot access the Jenkins test report right now. Can 
anyone trigger Jenkins again? Thanks.
Are there any more suggestions for a better solution?





[GitHub] spark pull request: [SPARK-2374] Path pattern matching for GraphX

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1307#issuecomment-65892529
  
  [Test build #24211 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24211/consoleFull)
 for   PR 1307 at commit 
[`9332a49`](https://github.com/apache/spark/commit/9332a4927a065e2e5217a4256a5bc12a127ca97b).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class EdgePattern[ED](attr: ED, matchDstFirst: Boolean = false) 
extends Serializable `
  * `case class EdgeMatch[ED](srcId: VertexId, dstId: VertexId, attr: ED) 
extends Serializable `
  * `case class Match[ED](path: List[EdgeMatch[ED]]) extends Serializable`






[GitHub] spark pull request: [SPARK-2374] Path pattern matching for GraphX

2014-12-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1307#issuecomment-65892530
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24211/





[GitHub] spark pull request: [SPARK-4620] Add unpersist in Graph and GraphI...

2014-12-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3476#issuecomment-65892978
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24210/





[GitHub] spark pull request: [SPARK-4620] Add unpersist in Graph and GraphI...

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3476#issuecomment-65892975
  
  [Test build #24210 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24210/consoleFull)
 for   PR 3476 at commit 
[`77a006a`](https://github.com/apache/spark/commit/77a006a77889a2f847dc0a6ad2e8e15e329b9137).
 * This patch **passes all tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4777][CORE] Some block memory after unr...

2014-12-06 Thread suyanNone
GitHub user suyanNone opened a pull request:

https://github.com/apache/spark/pull/3629

[SPARK-4777][CORE] Some block memory after unrollSafely not count into used 
memory(memoryStore.entrys or unrollMemory)

Some memory is not counted as used by either memoryStore or unrollMemory.
After Thread A finishes unrollSafely, it releases 40MB of unrollMemory (which 
other threads may then use). Thread A then waits for accountingLock in order 
to tryToPut blockA (30MB). Until Thread A gets accountingLock, blockA's memory 
is counted in neither unrollMemory nor memoryStore.currentMemory.
IIUC, freeMemory should subtract that block's memory.

So this patch keeps the released memory as pending, and releases it in 
tryToPut before ensureSpace.
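
The accounting gap described above can be illustrated with a small model. This is a sketch under stated assumptions, not Spark's MemoryStore and not the code in this patch: the class `UnrollAccountingSketch` and its methods `reserveUnroll` and `finishUnroll` are hypothetical, and only the counters are modelled. The point it shows is that a block which has been unrolled but not yet stored stays subtracted from free memory until `tryToPut` takes over.

```scala
// Hypothetical model of the accounting only; this is not Spark's MemoryStore.
class UnrollAccountingSketch(maxMemory: Long) {
  private val accountingLock = new Object
  private var currentMemory = 0L       // bytes held by blocks already stored
  private var unrollMemory = 0L        // bytes reserved by in-progress unrolls
  private var pendingUnrollMemory = 0L // bytes of unrolled blocks awaiting tryToPut

  /** Free memory, with pending blocks subtracted so other threads cannot claim them. */
  def freeMemory: Long = accountingLock.synchronized {
    maxMemory - currentMemory - unrollMemory - pendingUnrollMemory
  }

  /** Reserves unroll memory if enough is free. */
  def reserveUnroll(bytes: Long): Boolean = accountingLock.synchronized {
    if (bytes <= freeMemory) { unrollMemory += bytes; true } else false
  }

  /** Unrolling is done: release the reservation but keep the block's size as pending. */
  def finishUnroll(reserved: Long, blockSize: Long): Unit = accountingLock.synchronized {
    unrollMemory -= reserved
    pendingUnrollMemory += blockSize
  }

  /** Converts a pending block into a stored block, or drops it if it no longer fits. */
  def tryToPut(blockSize: Long): Boolean = accountingLock.synchronized {
    pendingUnrollMemory -= blockSize
    if (blockSize <= freeMemory) { currentMemory += blockSize; true } else false
  }
}
```

With a 100-unit limit, reserving 40 and then finishing the unroll with a 30-unit block leaves 70 units free rather than 100, so a competing thread cannot take the space that blockA is about to occupy.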

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/suyanNone/spark unroll-memory

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3629.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3629


commit 072e43d49226f1ae660d9b2ad53dc43ee78481e9
Author: hushan[胡珊] hus...@xiaomi.com
Date:   2014-12-05T02:56:20Z

Pending unroll memory for this block untill tryToPut







[GitHub] spark pull request: [SPARK-4777][CORE] Some block memory after unr...

2014-12-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3629#issuecomment-65894783
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SQL] remove unnecessary import in spark-sql

2014-12-06 Thread jackylk
GitHub user jackylk opened a pull request:

https://github.com/apache/spark/pull/3630

[SQL] remove unnecessary import in spark-sql



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jackylk/spark remove

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3630.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3630


commit 150e7e0f4b0ec0eaa39736262d69c81d4ee83486
Author: Jacky Li jacky.li...@huawei.com
Date:   2014-12-06T16:16:41Z

remove unnecessary import







[GitHub] spark pull request: [SQL] remove unnecessary import in spark-sql

2014-12-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3630#issuecomment-65903938
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-4691][Minor] Rewrite a few lines in shu...

2014-12-06 Thread maji2014
Github user maji2014 commented on the pull request:

https://github.com/apache/spark/pull/3553#issuecomment-65904987
  
NP, done for title change





[GitHub] spark pull request: [SPARK-4714][CORE]: Add checking info is null ...

2014-12-06 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3574#issuecomment-65910884
  
Ah, I think I see your concern: let's say that we have a block and two 
threads racing to perform operations on it. To use your example, thread A 
wants to call `removeBlock()` and thread B wants to call `dropFromMemory()`.  
For this code to work correctly, we want it to operate correctly for all 
possible interleavings of those threads.

If we consider the case where _all_ of thread A's steps execute before 
_any_ of thread B's, then things work okay: thread A will have removed the 
entry from `blockInfo` before thread B runs, so `B` will see that `info == 
null` and log a warning that the block has already been removed.  The same is 
true for B before A.

In another execution, though, both threads could find the `BlockInfo` 
instance in the `blockInfo` map but race on acquiring its lock 
(`info.synchronized`), so `info != null` will be true for both threads.  I 
agree that this could be a problem, but it might not be if the operations 
performed in those threads are idempotent.  Let's take a look and see if that's 
the case:

 - `removeBlock`: all of the operations here are idempotent, so at worst we 
get a warning if we run this on a block that's removed by another racing thread.

- `dropOldBlocks`: similarly, this just consists of calls to 
`*Store.remove()`, which are idempotent.

- `dropFromMemory`: this case might actually be problematic, since I think 
that this method calls data store operations that don't handle missing blocks.  
I'm going to look at this case in a little more detail, but I think that your 
fix for this might be a good idea.
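
As an illustration of the idempotence argument above, here is a minimal sketch with hypothetical stand-ins for the `BlockManager` state (a single in-memory map plays the role of the stores, and `RemovalRaceSketch` is not a Spark class). Both racing threads may find the info object and then serialize on its lock; because the store removal tolerates a missing block, the loser of the race simply observes `false`.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-ins for BlockManager state; not Spark's actual classes.
class RemovalRaceSketch {
  private val blockInfo = new ConcurrentHashMap[String, AnyRef]()
  private val store = new ConcurrentHashMap[String, Array[Byte]]()

  def put(id: String, data: Array[Byte]): Unit = {
    store.put(id, data)
    blockInfo.put(id, new Object)
  }

  /** Idempotent: removing an absent block just returns false. */
  private def removeFromStore(id: String): Boolean = store.remove(id) != null

  /** Returns true if this call removed the data, false if it was already gone. */
  def removeBlock(id: String): Boolean = {
    val info = blockInfo.get(id)
    if (info == null) {
      false // another thread finished removing before we looked
    } else {
      info.synchronized {
        // A racing thread may have removed the block while we waited for the
        // lock; the idempotent store removal makes that harmless.
        val removed = removeFromStore(id)
        blockInfo.remove(id)
        removed
      }
    }
  }
}
```

An operation that is not idempotent, such as one that reads the block's data after acquiring the lock, would need to re-check the map inside the synchronized section instead.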





[GitHub] spark pull request: [SPARK-4714][CORE]: Add checking info is null ...

2014-12-06 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3574#discussion_r21418593
  
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -1089,15 +1089,17 @@ private[spark] class BlockManager(
     val info = blockInfo.get(blockId).orNull
     if (info != null) {
       info.synchronized {
-        // Removals are idempotent in disk store and memory store. At worst, we get a warning.
-        val removedFromMemory = memoryStore.remove(blockId)
-        val removedFromDisk = diskStore.remove(blockId)
-        val removedFromTachyon = if (tachyonInitialized) tachyonStore.remove(blockId) else false
-        if (!removedFromMemory && !removedFromDisk && !removedFromTachyon) {
-          logWarning(s"Block $blockId could not be removed as it was not found in either " +
-            "the disk, memory, or tachyon store")
+        if (blockInfo.get(blockId).isEmpty) {
--- End diff --

I don't think that this extra check is necessary; check out my comment on 
the main pull request and see if you agree.

Even if we did want to add a check here, I think we want to check for 
`if(blockInfo.get(blockId).nonEmpty)`, since this branch handles the case where 
blocks have _not_ been removed already.





[GitHub] spark pull request: [SPARK-4714][CORE]: Add checking info is null ...

2014-12-06 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3574#discussion_r21418607
  
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -1126,12 +1128,14 @@ private[spark] class BlockManager(
       val (id, info, time) = (entry.getKey, entry.getValue.value, entry.getValue.timestamp)
       if (time < cleanupTime && shouldDrop(id)) {
         info.synchronized {
-          val level = info.level
-          if (level.useMemory) { memoryStore.remove(id) }
-          if (level.useDisk) { diskStore.remove(id) }
-          if (level.useOffHeap) { tachyonStore.remove(id) }
-          iterator.remove()
-          logInfo(s"Dropped block $id")
+          if (blockInfo.get(id).isEmpty) {
--- End diff --

Similarly, I don't think that we strictly _need_ a check here since the 
`remove(id)` operations are idempotent.  It might be nice to log a warning if 
the block has already been removed, but that might not be necessary since this 
is a background cleanup call.





[GitHub] spark pull request: [SPARK-4714][CORE]: Add checking info is null ...

2014-12-06 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3574#discussion_r21418614
  
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -1010,9 +1010,9 @@ private[spark] class BlockManager(
       info.synchronized {
         // required ? As of now, this will be invoked only for blocks which are ready
         // But in case this changes in future, adding for consistency sake.
-        if (!info.waitForReady()) {
+        if (blockInfo.get(blockId).isEmpty || !info.waitForReady()) {
--- End diff --

It might be nice to split this into an `if` and `else if` case so that we 
can log specific / accurate error messages in each of the cases.
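
A rough sketch of the suggested split, assuming a hypothetical helper `skipReason` and illustrative messages (neither is part of Spark). Each failure case gets its own branch, so the caller can log exactly what happened:

```scala
// Hypothetical helper; names and messages are illustrative, not Spark's.
object DropCheckSketch {
  /** Returns Some(reason) when the drop should be skipped, None when it can proceed. */
  def skipReason(
      blockStillRegistered: Boolean,
      waitForReady: () => Boolean): Option[String] = {
    if (!blockStillRegistered) {
      Some("block was already removed by another thread")
    } else if (!waitForReady()) {
      Some("block did not become ready")
    } else {
      None
    }
  }
}
```

As with the original `||` condition, `waitForReady` is only evaluated when the block is still registered.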





[GitHub] spark pull request: [SPARK-4721][CORE] Improve logic while first t...

2014-12-06 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3582#issuecomment-65911726
  
To clarify a bit further: I think that `BlockInfo.waitForReady()` is 
designed to allow callers to block until a block write has completed.  If we 
have many threads (readers) waiting for a block to be written, then I think we 
should be okay because `notifyAll()` will wake all of those threads when the 
block becomes ready.
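
For reference, the wait/notify pattern described here looks roughly like the following. This is a simplified sketch, not Spark's `BlockInfo`: the class name `BlockInfoSketch` and the fields `pending` and `failed` are assumptions made for illustration.

```scala
// Hypothetical simplification; Spark's BlockInfo differs in detail.
class BlockInfoSketch {
  private var pending = true
  private var failed = false

  /** Blocks until the write finishes; returns true on success, false on failure. */
  def waitForReady(): Boolean = synchronized {
    while (pending) {
      wait()
    }
    !failed
  }

  /** Called by the writer on success. */
  def markReady(): Unit = synchronized {
    pending = false
    notifyAll() // wakes every waiting reader, not just one
  }

  /** Called by the writer on failure. */
  def markFailure(): Unit = synchronized {
    failed = true
    pending = false
    notifyAll()
  }
}
```

With a single writer, every reader sees the same final outcome. The question of several concurrent writers only arises if more than one thread can call `markReady()` or `markFailure()` on the same instance.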

From your description, it sounds like you're worried about a multiple 
writer-threads case, where we have many threads attempting to write the same 
block and a failed write attempt from _one_ of them wakes up the waiting 
threads and notifies them of a failure even though there's another write in 
progress which might succeed.  Is your goal to wait until _all_ of the pending 
writes have failed before notifying a reader that the write has failed and to 
wait for _one_ of them to succeed before notifying the reader that the write 
succeeded?

I'll have to dig into the BlockManager internals to see whether we can ever 
have multiple in-progress writes for the same block.  Do you have an example of 
when this can happen?

I'd be happy to look over the code and provide more feedback / suggestions, 
but I want to make sure that I understand the motivation and confirm that this 
is fixing an actual bug, since it seems like this adds a moderate amount of 
complexity.





[GitHub] spark pull request: [SQL] remove unnecessary import in spark-sql

2014-12-06 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3630#issuecomment-65911873
  
Jenkins, this is ok to test.





[GitHub] spark pull request: [SQL] remove unnecessary import in spark-sql

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3630#issuecomment-65911912
  
[Test build #24212 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24212/consoleFull) for PR 3630 at commit [`150e7e0`](https://github.com/apache/spark/commit/150e7e0f4b0ec0eaa39736262d69c81d4ee83486).
 * This patch merges cleanly.





[GitHub] spark pull request: SPARK-2624 add datanucleus jars to the contain...

2014-12-06 Thread jimjh
Github user jimjh commented on the pull request:

https://github.com/apache/spark/pull/3238#issuecomment-65913114
  
Yeah, I should have been more careful. I agree that we should figure out a 
proper solution.





[GitHub] spark pull request: [SQL] remove unnecessary import in spark-sql

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3630#issuecomment-65914284
  
[Test build #24212 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24212/consoleFull) for PR 3630 at commit [`150e7e0`](https://github.com/apache/spark/commit/150e7e0f4b0ec0eaa39736262d69c81d4ee83486).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SQL] remove unnecessary import in spark-sql

2014-12-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3630#issuecomment-65914287
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24212/





[GitHub] spark pull request: Add mesos specific configurations into doc

2014-12-06 Thread tnachen
Github user tnachen commented on the pull request:

https://github.com/apache/spark/pull/3349#issuecomment-65921947
  
@ash211 I didn't know there was already a built-in one, so I updated this PR 
to use that instead. Please take a look, and if it looks good, please let me 
know who I should ping to push this.





[GitHub] spark pull request: Add mesos specific configurations into doc

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3349#issuecomment-65922015
  
[Test build #24213 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24213/consoleFull) for PR 3349 at commit [`737ef49`](https://github.com/apache/spark/commit/737ef4983d4e7f5221d49f132d4a17c9b999c71a).
 * This patch merges cleanly.





[GitHub] spark pull request: Add mesos specific configurations into doc

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3349#issuecomment-65923795
  
[Test build #24213 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24213/consoleFull) for PR 3349 at commit [`737ef49`](https://github.com/apache/spark/commit/737ef4983d4e7f5221d49f132d4a17c9b999c71a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: Add mesos specific configurations into doc

2014-12-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3349#issuecomment-65923798
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24213/





[GitHub] spark pull request: [SPARK-4469] [SQL] Move the SemanticAnalyzer f...

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3336#issuecomment-65926034
  
[Test build #24214 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24214/consoleFull) for PR 3336 at commit [`b85b620`](https://github.com/apache/spark/commit/b85b6204736855f0380d675494329e85d8a3948a).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4769] [SQL] CTAS does not work when rea...

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3336#issuecomment-65926083
  
[Test build #24215 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24215/consoleFull) for PR 3336 at commit [`5d58812`](https://github.com/apache/spark/commit/5d5881214ecc7223bfc6e2fef04a46a457eb66f5).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4769] [SQL] CTAS does not work when rea...

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3336#issuecomment-65926374
  
[Test build #24216 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24216/consoleFull) for PR 3336 at commit [`4f97f14`](https://github.com/apache/spark/commit/4f97f144fa70be4ea13fe36dc22f868c455243f4).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4769] [SQL] CTAS does not work when rea...

2014-12-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3336#issuecomment-65927167
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24215/





[GitHub] spark pull request: [SPARK-4769] [SQL] CTAS does not work when rea...

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3336#issuecomment-65927166
  
[Test build #24215 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24215/consoleFull) for PR 3336 at commit [`5d58812`](https://github.com/apache/spark/commit/5d5881214ecc7223bfc6e2fef04a46a457eb66f5).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4769] [SQL] CTAS does not work when rea...

2014-12-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3336#issuecomment-65927251
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24214/





[GitHub] spark pull request: [SPARK-4769] [SQL] CTAS does not work when rea...

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3336#issuecomment-65927249
  
[Test build #24214 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24214/consoleFull) for PR 3336 at commit [`b85b620`](https://github.com/apache/spark/commit/b85b6204736855f0380d675494329e85d8a3948a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4769] [SQL] CTAS does not work when rea...

2014-12-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3336#issuecomment-65927454
  
[Test build #24216 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24216/consoleFull) for PR 3336 at commit [`4f97f14`](https://github.com/apache/spark/commit/4f97f144fa70be4ea13fe36dc22f868c455243f4).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4769] [SQL] CTAS does not work when rea...

2014-12-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3336#issuecomment-65927457
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24216/

