[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4783


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-02 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98350827
  
LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98349631
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98349632
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31663/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98349628
  
  [Test build #31663 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31663/consoleFull)
 for   PR 4783 at commit 
[`c4dcb41`](https://github.com/apache/spark/commit/c4dcb411ce38eff25204dff487335e39b986cb05).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98342259
  
  [Test build #31663 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31663/consoleFull)
 for   PR 4783 at commit 
[`c4dcb41`](https://github.com/apache/spark/commit/c4dcb411ce38eff25204dff487335e39b986cb05).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-02 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98342214
  
Jenkins, retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98342225
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-9834
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98259159
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98259160
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31624/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98259151
  
  [Test build #31624 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31624/consoleFull)
 for   PR 4783 at commit 
[`c4dcb41`](https://github.com/apache/spark/commit/c4dcb411ce38eff25204dff487335e39b986cb05).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98257964
  
  [Test build #31624 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31624/consoleFull)
 for   PR 4783 at commit 
[`c4dcb41`](https://github.com/apache/spark/commit/c4dcb411ce38eff25204dff487335e39b986cb05).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98257790
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98257798
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread advancedxy
Github user advancedxy commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98257433
  
@JoshRosen @srowen, I already tested in the test suite. Just want to make 
sure if I don't miss anything obvious.
Here is the standalone reproduce sample.  
https://gist.github.com/advancedxy/08a76fb618a3f804e108
If that's the case, I believe there is the BlockManagerSuite to be changed 
too. @JoshRosen when would you have time to look at that? I can send a pr for 
that if you are busy.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98202386
  
@advancedxy I can't answer the `ResetSystemPrroperties` question off the 
top of my head and am a bit busy with other stuff right now, so why not try 
writing a simple standalone example with some printlns or something to figure 
out the proper override order, etc?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98132414
  
  [Test build #31548 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31548/consoleFull)
 for   PR 4783 at commit 
[`3f80640`](https://github.com/apache/spark/commit/3f80640932c159ef3a92c83527a83f400f5ebc51).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class ExecutorUIData(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98132424
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31548/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98132423
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread advancedxy
Github user advancedxy commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98126019
  
A closer look at the `ResetSystemProperties` and `SizeEstimatorSuite`, the 
`beforeEach` in `ResetSystemProperties` is overriden by `SizeEstimatorSuite`. 
Should call `super.beforeEach()` in `SizeEatimatorSuite` to be stackable. 

@JoshRosen, you added the `ResetSystemProperties` trait, should we add the 
` super.beforeEach()` in the `beforeEach()` method in the `SizeEstimatorSuite`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98121489
  
OK that all looks good. Let's see what Jenkins says.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread advancedxy
Github user advancedxy commented on a diff in the pull request:

https://github.com/apache/spark/pull/4783#discussion_r29498601
  
--- Diff: 
core/src/test/scala/org/apache/spark/util/SizeEstimatorSuite.scala ---
@@ -135,5 +160,20 @@ class SizeEstimatorSuite
 assertResult(64)(SizeEstimator.estimate(DummyString("a")))
 assertResult(64)(SizeEstimator.estimate(DummyString("ab")))
 assertResult(72)(SizeEstimator.estimate(DummyString("abcdefgh")))
+
--- End diff --

I didn't consider that. But after a look at the code, It seems the 
`ResetSystemProperties` trait already takes care of that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread advancedxy
Github user advancedxy commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98116848
  
`initialize()` is invoked multiple times in the test suite. Maybe it's a 
hook for tests


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread advancedxy
Github user advancedxy commented on a diff in the pull request:

https://github.com/apache/spark/pull/4783#discussion_r29498301
  
--- Diff: 
core/src/test/scala/org/apache/spark/util/SizeEstimatorSuite.scala ---
@@ -59,6 +68,22 @@ class SizeEstimatorSuite
 assertResult(24)(SizeEstimator.estimate(new DummyClass4(null)))
 assertResult(48)(SizeEstimator.estimate(new DummyClass4(new 
DummyClass3)))
   }
+  
+  test("primitive wrapper objects") {
+assertResult(16)(SizeEstimator.estimate(new java.lang.Boolean(true)))
+assertResult(16)(SizeEstimator.estimate(new java.lang.Byte("1")))
+assertResult(16)(SizeEstimator.estimate(new java.lang.Character('1')))
+assertResult(16)(SizeEstimator.estimate(new java.lang.Short("1")))
+assertResult(16)(SizeEstimator.estimate(new java.lang.Integer(1)))
+assertResult(24)(SizeEstimator.estimate(new java.lang.Long(1)))
+assertResult(16)(SizeEstimator.estimate(new java.lang.Float(1.0)))
--- End diff --

Thanks.  I do know 1.0 is a double. Maybe I didn't think clearly when I was 
writing the test cases. Just leave it? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98112201
  
  [Test build #31548 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31548/consoleFull)
 for   PR 4783 at commit 
[`3f80640`](https://github.com/apache/spark/commit/3f80640932c159ef3a92c83527a83f400f5ebc51).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98111075
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-9806
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98099024
  
Needs a rebase.

I know it's not changed in this PR, but why are all those fields `var`? 
they can be initialized once if the `initialize()` method is removed. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/4783#discussion_r29496820
  
--- Diff: 
core/src/test/scala/org/apache/spark/util/SizeEstimatorSuite.scala ---
@@ -135,5 +160,20 @@ class SizeEstimatorSuite
 assertResult(64)(SizeEstimator.estimate(DummyString("a")))
 assertResult(64)(SizeEstimator.estimate(DummyString("ab")))
 assertResult(72)(SizeEstimator.estimate(DummyString("abcdefgh")))
+
--- End diff --

I'm worried that setting the system props above like `os.arch` is going to 
pollute the JVM state for other tests. It should be restored.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/4783#discussion_r29496696
  
--- Diff: 
core/src/test/scala/org/apache/spark/util/SizeEstimatorSuite.scala ---
@@ -59,6 +68,22 @@ class SizeEstimatorSuite
 assertResult(24)(SizeEstimator.estimate(new DummyClass4(null)))
 assertResult(48)(SizeEstimator.estimate(new DummyClass4(new 
DummyClass3)))
   }
+  
+  test("primitive wrapper objects") {
+assertResult(16)(SizeEstimator.estimate(new java.lang.Boolean(true)))
+assertResult(16)(SizeEstimator.estimate(new java.lang.Byte("1")))
+assertResult(16)(SizeEstimator.estimate(new java.lang.Character('1')))
+assertResult(16)(SizeEstimator.estimate(new java.lang.Short("1")))
+assertResult(16)(SizeEstimator.estimate(new java.lang.Integer(1)))
+assertResult(24)(SizeEstimator.estimate(new java.lang.Long(1)))
+assertResult(16)(SizeEstimator.estimate(new java.lang.Float(1.0)))
--- End diff --

Don't change this, but FYI 1.0f is a float and 1.0 is a double.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread advancedxy
Github user advancedxy commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-98093859
  
ping @rxin or @srowen .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-04-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-96815195
  
Thanks @advancedxy for fixing the tests. This change LGTM. @rxin @srowen 
Could you also take a final look at this ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-04-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-96800849
  
  [Test build #31061 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31061/consoleFull)
 for   PR 4783 at commit 
[`db1e948`](https://github.com/apache/spark/commit/db1e948097b202573cac16a23a8bf22f1d6e2a5b).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-04-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-96775081
  
  [Test build #31061 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31061/consoleFull)
 for   PR 4783 at commit 
[`db1e948`](https://github.com/apache/spark/commit/db1e948097b202573cac16a23a8bf22f1d6e2a5b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-04-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-96773741
  
Jenkins, ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-04-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-96769562
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-04-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-96759699
  
Jenkins, test this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-04-27 Thread advancedxy
Github user advancedxy commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-96685117
  
Ping @shivaram, @rxin. I finally got some time to look at the code.
The ExternalSorterSuite kept failing because there was no spilling when we 
just insert 10 elements. 
I don't know how 10 was calculated, maybe you guys can give some 
related info. Before this pr, The size of  an integer instance is 24 bytes on a 
64-bit JVM with UseCompressedOops on. Now, It's a correct 16 bytes. The 
ExternalSorterSuit inserts integers into buffers, so a reduced size of integer 
class results a smaller total estimated size, hence no spilling. 

Another Note: When I was digging the ExternalSorterSuite and 
ExternalSorter, I noticed something wrong. The ExternalSorter uses the 
SizeTrackingPairBuffer, SizeTrackingPairBuffer[(Int, Int), Int] to be 
specifically in the test suite. The SizeTrackingPairBuffer[(Int, Int), Int] 
increases only 40 bytes per insertion(estimated by SizeEstimator). The 40 bytes 
 consisted of 24 bytes of Tuple2 and 16 bytes Integer. I changes the insertion 
number based on this fact (24 + 24 per insertion to 24 + 16 per insertion, 48 * 
10 = 40 * 12). However 24 bytes of Tuple2 is not right, It should be 
the specialized version of Tuple2[Int, Int] which is 32 bytes. It's not 
SizeEstimator's fault as It can only get the Tuple2 class rather than 
scala.Tuple2$mcII$sp. I doubt it may be something related to the scala 
compiler, I will look into this problem or someone more familiar with scala 
compiler should.

As for this pr, I think it's good to go. Hope we can get it merged before 
1.4


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-19 Thread advancedxy
Github user advancedxy commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-83878165
  
@rxin as I mentioned in the previous comment, I use the method in this 
article.

http://www.javaworld.com/article/2077496/testing-debugging/java-tip-130--do-you-know-your-data-size-.html

The related code is listed below. The scala library should be included in 
the classpath
```
// 

/**
 * A simple class to experiment with your JVM's garbage collector
 * and memory sizes for various data types.
 *
 * @author mailto:v...@trilogy.com";>Vladimir Roubtsov
 */
import scala.Tuple2;
import scala.Tuple2$mcII$sp;
public class Sizeof
{
public static class DummyClass1 {}
public static class DummyClass2 extends DummyClass1 {
public boolean x;
}
public static class DummyClass3 extends DummyClass2 {
public boolean y;
}

public static void main (String [] args) throws Exception
{
// "warm up" all classes/methods that we are going to use:
runGC ();
usedMemory ();

// array to keep strong references to allocated objects:
final int count = 1; // 1 or so is enough for small ojects
Object [] objects = new Object [count];


long heap1 = 0;

// allocate count+1 objects, discard the first one:
for (int i = -1; i < count; ++ i)
{
Object object;

// INSTANTIATE YOUR DATA HERE AND ASSIGN IT TO 'object':

object = new Tuple2(1, 2); // This gives 24 bytes. 64 bit vm
// object = new Tuple2$mcII$sp(1, 2); // This gives 32 bytes. 
64 bit vm

// This gives 56 bytes per object.
// 24B(Tuple2) + 16B(Integer)+ 16B(Integer) = 56B
// object = new Tuple2(2 * i + 1, 2 * i + 2);

if (i >= 0)
objects [i] = object;
else
{
object = null; // discard the "warmup" object
runGC ();
heap1 = usedMemory (); // take a "before" heap snapshot
}
}

runGC ();
long heap2 = usedMemory (); // take an "after" heap snapshot:

final int size = Math.round (((float)(heap2 - heap1))/count);
System.out.println ("'before' heap: " + heap1 +
", 'after' heap: " + heap2);
System.out.println ("heap delta: " + (heap2 - heap1) +
", {" + objects [0].getClass () + "} size = " + size + " 
bytes");
}

// a helper method for creating Strings of desired length
// and avoiding getting tricked by String interning:
public static String createString (final int length)
{
final char [] result = new char [length];
for (int i = 0; i < length; ++ i) result [i] = (char) i;

return new String (result);
}

// this is our way of requesting garbage collection to be run:
// [how aggressive it is depends on the JVM to a large degree, but
// it is almost always better than a single Runtime.gc() call]
private static void runGC () throws Exception
{
// for whatever reason it helps to call Runtime.gc()
// using several method calls:
for (int r = 0; r < 4; ++ r) _runGC ();
}

private static void _runGC () throws Exception
{
long usedMem1 = usedMemory (), usedMem2 = Long.MAX_VALUE;

for (int i = 0; (usedMem1 < usedMem2) && (i < 1000); ++ i)
{
s_runtime.runFinalization ();
s_runtime.gc ();
Thread.currentThread ().yield ();

usedMem2 = usedMem1;
usedMem1 = usedMemory ();
}
}

private static long usedMemory ()
{
return s_runtime.totalMemory () - s_runtime.freeMemory ();
}

private static final Runtime s_runtime = Runtime.getRuntime ();

} // end of class
// 

```

I took another look at the result. I think I got fooled by the result of 
Tuple2(1,2);  It gives 24 bytes as result, but the javac don't pick the 
specialized version class. The Tuple2$mcII$sp(1, 2) does give 32 bytes as 
result. So, The (1, 2) in Scala on 64 bit JVM do takes 32 bytes size. We get 
that right

The last thing left is to figure out why the ExternalSort keeping failing 
tests.


---
If your project is set up for it, you can reply to this email and have your
rep

[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-19 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-83749071
  
What do you mean in practice Tuple2 is 24 bytes? How did you measure that? 
If the object layout tool shows it takes 32 bytes, doesn't it just take 32 
bytes? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-19 Thread advancedxy
Github user advancedxy commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-83410489
  
@shivaram @srowen Tuple2(Int, Int) got specialized to Tuple2$mcII$sp class. 
But the Tuple2$mcII$sp is a subclass of Tuple2. So in our implementation, the 
specialized class will get two additional object references. (_1, _2 in 
superclass Tuple2, in our case). So, for Tuple2(Int, Int), SizeEstimator will 
give 32 bytes rather than 24 bytes. In theory, the Tuple2(1,2) class filed 
layout should be something like below.
```
scala.Tuple2$mcII$sp object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 00 00 00 ( 0001      )
4 4 (object header) 00 00 00 00 (       )
8 4 (object header) 05 c3 00 f8 ( 0101 1100 0011    1000)
12 4 Object Tuple2._1 null
16 4 Object Tuple2._2 null
20 4 int Tuple2$mcII$sp._1$mcI$sp 1
24 4 int Tuple2$mcII$sp._2$mcI$sp 2
28 4 (loss due to the next object alignment)
Instance size: 32 bytes (reported by Instrumentation API)
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
```

But in practice, the size of Tuple2(1, 2) is 24 bytes. So is there any 
scala expert we can ping? I really want to know why Tuple2(1, 2) can be 24 
bytes when the specialized version is involved.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-12 Thread advancedxy
Github user advancedxy commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-78435831
  
@shivaram The reason why ExternalSorter failed is that It doesn't spilling 
files for these two failure tests. 
( If we increase the input from `0 until 10` to `0 until 20`, It 
will spilling files to disks, which passes the tests) However the input type 
for sorter is Iterator[(Int, Int)], and the older SizeEstimator gives 32 bytes 
and the new SizeEstimator gives the same 32 bytes for (Int, Int) (64 bit JVM 
with UseCompressedOops on assumed). So, It's very wried to see different 
results.
Seems @mateiz is busy. @jerryshao cloud you take a look at the failed 
tests, since you wrote some of the tests.

And, I believe there is another bug in the current SizeEstimator, Scala 
specialized Int, Long, Float, Double for Tuples, So, The size of (Int, Int) 
should be 24 bytes, rather than 32bytes. Verified by the method introduced by 
this article 
http://www.javaworld.com/article/2077496/testing-debugging/java-tip-130--do-you-know-your-data-size-.html.

I will take a look at the size problem.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-11 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-78411597
  
I don't know much about the ExternalSorter. You could try to walk through 
the code and try to see how the size estimator changes are affecting it.

@mateiz @rxin might know how it depends on the SizeEstimator


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-11 Thread advancedxy
Github user advancedxy commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-78410311
  
@shivaram ExternalSorterSuite failed on my local machine too. Do you have 
any thought why ExternalSortedSuite is failing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-11 Thread advancedxy
Github user advancedxy commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-78407837
  
I just run test-only for SizeEstimatorSuite. It do pass the tests. I 
thought the ExternalSorter was unrelated. I will test the ExternalSorter.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-11 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-78407056
  
@advancedxy Do the tests pass on your local machine ? It looks the 
ExternalSorter uses a size-tracking hash map which in turn depends on the 
SizeEstimator. So I am wondering if the test failure is actually related to 
this PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-78406725
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28486/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-78406716
  
  [Test build #28486 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28486/consoleFull)
 for   PR 4783 at commit 
[`9f92469`](https://github.com/apache/spark/commit/9f924696b3ad536f474d24657302e01e96eacf47).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-78401317
  
  [Test build #28486 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28486/consoleFull)
 for   PR 4783 at commit 
[`9f92469`](https://github.com/apache/spark/commit/9f924696b3ad536f474d24657302e01e96eacf47).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-11 Thread advancedxy
Github user advancedxy commented on a diff in the pull request:

https://github.com/apache/spark/pull/4783#discussion_r26209062
  
--- Diff: core/src/main/scala/org/apache/spark/util/SizeEstimator.scala ---
@@ -279,8 +315,8 @@ private[spark] object SizeEstimator extends Logging {
 newInfo
   }
 
-  private def alignSize(size: Long): Long = {
-val rem = size % ALIGN_SIZE
-if (rem == 0) size else (size + ALIGN_SIZE - rem)
-  }
+  private def alignSize(size: Long): Long = alignSizeUp(size, ALIGN_SIZE)
+
+  private def alignSizeUp(size: Long, alignSize: Int): Long =
+(size + alignSize - 1) & ~(alignSize - 1)
--- End diff --

I noticed this code when I looked at the HotSpot JVM classloader code. It's 
a c macro. And I think it's quite elegant. So I ported this computation. I will 
add some comment if needed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-11 Thread advancedxy
Github user advancedxy commented on a diff in the pull request:

https://github.com/apache/spark/pull/4783#discussion_r26208786
  
--- Diff: core/src/main/scala/org/apache/spark/util/SizeEstimator.scala ---
@@ -257,21 +262,52 @@ private[spark] object SizeEstimator extends Logging {
 val parent = getClassInfo(cls.getSuperclass)
 var shellSize = parent.shellSize
 var pointerFields = parent.pointerFields
+val sizeCount = Array.fill(fieldSizes.max + 1)(0)
 
+// iterate through the fields of this class and gather information.
 for (field <- cls.getDeclaredFields) {
   if (!Modifier.isStatic(field.getModifiers)) {
 val fieldClass = field.getType
 if (fieldClass.isPrimitive) {
-  shellSize += primitiveSize(fieldClass)
+  sizeCount(primitiveSize(fieldClass)) += 1
 } else {
   field.setAccessible(true) // Enable future get()'s on this field
-  shellSize += pointerSize
+  sizeCount(pointerSize) += 1
   pointerFields = field :: pointerFields
 }
   }
 }
 
-shellSize = alignSize(shellSize)
+// Based on the simulated field layout code in Aleksey Shipilev's 
report:
+// 
http://cr.openjdk.java.net/~shade/papers/2013-shipilev-fieldlayout-latest.pdf
+// The code is in Figure 9.
+// The simplified idea of field layout consists of 4 parts (see more 
details in the report):
+//
+// 1. field alignment: HotSpot lays out the fields aligned by their 
size.
+// 2. object alignment: HotSpot rounds instance size up to 8 bytes
+// 3. consistent fields layouts throughout the hierarchy: This means 
we should layout
+// superclass first. And we can use superclass's shellSize as a 
starting point to layout the
+// other fields in this class.
+// 4. class alignment: HotSpot rounds field blocks up to to 
HeapOopSize not 4 bytes, confirmed
+// with Aleksey. see 
https://bugs.openjdk.java.net/browse/CODETOOLS-7901322
+//
+// The real world field layout is much more complicated. There are 
three kinds of fields
+// order in Java 8. And we don't consider the @contended annotation 
introduced by Java 8.
+// see the HotSpot classloader code, layout_fields method for more 
details.
+// 
hg.openjdk.java.net/jdk8/jdk8/hotspot/file/tip/src/share/vm/classfile/classFileParser.cpp
+var alignedSize = shellSize
+for (size <- fieldSizes if sizeCount(size) > 0) {
+  val count = sizeCount(size)
+  // If there are internal gaps, smaller field can fit in.
+  alignedSize = math.max(alignedSize, alignSizeUp(shellSize, size) + 
size * count)
+  shellSize += size * count
+}
+
+// we should choose a larger size and clearly alignedSize >= shellSize.
+shellSize = alignedSize
+
+// round up instance filed blocks
+shellSize = alignSizeUp(shellSize, pointerSize)
--- End diff --

Yes. I thought just using the alignedSize in the first arg before. But like 
the comment said, It has two steps. The first assignment is to get the internal 
aligned size. The second assignment is to round up instance field blocks. I 
thought it wold be more clear to assigned twice. I will change it if you think 
it's funny.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-11 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/4783#discussion_r26208393
  
--- Diff: core/src/main/scala/org/apache/spark/util/SizeEstimator.scala ---
@@ -257,21 +262,52 @@ private[spark] object SizeEstimator extends Logging {
 val parent = getClassInfo(cls.getSuperclass)
 var shellSize = parent.shellSize
 var pointerFields = parent.pointerFields
+val sizeCount = Array.fill(fieldSizes.max + 1)(0)
 
+// iterate through the fields of this class and gather information.
 for (field <- cls.getDeclaredFields) {
   if (!Modifier.isStatic(field.getModifiers)) {
 val fieldClass = field.getType
 if (fieldClass.isPrimitive) {
-  shellSize += primitiveSize(fieldClass)
+  sizeCount(primitiveSize(fieldClass)) += 1
 } else {
   field.setAccessible(true) // Enable future get()'s on this field
-  shellSize += pointerSize
+  sizeCount(pointerSize) += 1
   pointerFields = field :: pointerFields
 }
   }
 }
 
-shellSize = alignSize(shellSize)
+// Based on the simulated field layout code in Aleksey Shipilev's 
report:
+// 
http://cr.openjdk.java.net/~shade/papers/2013-shipilev-fieldlayout-latest.pdf
+// The code is in Figure 9.
+// The simplified idea of field layout consists of 4 parts (see more 
details in the report):
+//
+// 1. field alignment: HotSpot lays out the fields aligned by their 
size.
+// 2. object alignment: HotSpot rounds instance size up to 8 bytes
+// 3. consistent fields layouts throughout the hierarchy: This means 
we should layout
+// superclass first. And we can use superclass's shellSize as a 
starting point to layout the
+// other fields in this class.
+// 4. class alignment: HotSpot rounds field blocks up to to 
HeapOopSize not 4 bytes, confirmed
+// with Aleksey. see 
https://bugs.openjdk.java.net/browse/CODETOOLS-7901322
+//
+// The real world field layout is much more complicated. There are 
three kinds of fields
+// order in Java 8. And we don't consider the @contended annotation 
introduced by Java 8.
+// see the HotSpot classloader code, layout_fields method for more 
details.
+// 
hg.openjdk.java.net/jdk8/jdk8/hotspot/file/tip/src/share/vm/classfile/classFileParser.cpp
+var alignedSize = shellSize
+for (size <- fieldSizes if sizeCount(size) > 0) {
+  val count = sizeCount(size)
+  // If there are internal gaps, smaller field can fit in.
+  alignedSize = math.max(alignedSize, alignSizeUp(shellSize, size) + 
size * count)
+  shellSize += size * count
+}
+
+// we should choose a larger size and clearly alignedSize >= shellSize.
+shellSize = alignedSize
+
+// round up instance filed blocks
+shellSize = alignSizeUp(shellSize, pointerSize)
--- End diff --

Can the first arg just be `alignedSize`? I found it slightly funny to see 
the variable assigned twice.
Otherwise looking good to me though I'm not so familiar with this change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-11 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/4783#discussion_r26208364
  
--- Diff: core/src/main/scala/org/apache/spark/util/SizeEstimator.scala ---
@@ -279,8 +315,8 @@ private[spark] object SizeEstimator extends Logging {
 newInfo
   }
 
-  private def alignSize(size: Long): Long = {
-val rem = size % ALIGN_SIZE
-if (rem == 0) size else (size + ALIGN_SIZE - rem)
-  }
+  private def alignSize(size: Long): Long = alignSizeUp(size, ALIGN_SIZE)
+
+  private def alignSizeUp(size: Long, alignSize: Int): Long =
+(size + alignSize - 1) & ~(alignSize - 1)
--- End diff --

Can you perhaps explain this computation? I get what it does, but it's not 
obvious.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-11 Thread advancedxy
Github user advancedxy commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-78223640
  
ping @shivaram, @srowen and @mateiz 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-77672792
  
  [Test build #28363 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28363/consoleFull)
 for   PR 4783 at commit 
[`cb0af95`](https://github.com/apache/spark/commit/cb0af956daba1ccd9d2633ccdde20163def1ec68).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4783#issuecomment-77672798
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28363/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org