[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-09 Thread wangjiaochun
Github user wangjiaochun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239990587
  
--- Diff: 
core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java ---
@@ -161,6 +161,10 @@ private void writeSortedFile(boolean isLastFile) {
 final ShuffleInMemorySorter.ShuffleSorterIterator sortedRecords =
   inMemSorter.getSortedIterator();
 
+// If there are no sorted records, so we don't need to create an empty 
spill file.
+if (!sortedRecords.hasNext()) {
+  return;
+}
--- End diff --

Okay, I will make the changes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-09 Thread wangjiaochun
Github user wangjiaochun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239990083
  
--- Diff: 
core/src/test/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriterSuite.java 
---
@@ -235,6 +235,7 @@ public void writeEmptyIterator() throws Exception {
 final Option mapStatus = writer.stop(true);
 assertTrue(mapStatus.isDefined());
 assertTrue(mergedOutputFile.exists());
+assertEquals(0, spillFilesCreated.size());
--- End diff --

I think it's unnecessary to add  a new test case, and it can delete  line 
239~242 of this test writeEmptyIterator because they're always right.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23225


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r240014126
  
--- Diff: 
core/src/test/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriterSuite.java 
---
@@ -235,11 +235,8 @@ public void writeEmptyIterator() throws Exception {
 final Option mapStatus = writer.stop(true);
 assertTrue(mapStatus.isDefined());
 assertTrue(mergedOutputFile.exists());
+assertEquals(0, spillFilesCreated.size());
 assertArrayEquals(new long[NUM_PARTITITONS], 
partitionSizesInMergedFile);
-assertEquals(0, taskMetrics.shuffleWriteMetrics().recordsWritten());
-assertEquals(0, taskMetrics.shuffleWriteMetrics().bytesWritten());
-assertEquals(0, taskMetrics.diskBytesSpilled());
-assertEquals(0, taskMetrics.memoryBytesSpilled());
--- End diff --

We need to keep these test coverage guaranteeing that the task metrics 
remained an untouched state.
Please revert this removal.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-07 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239990036
  
--- Diff: 
core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java ---
@@ -161,6 +161,10 @@ private void writeSortedFile(boolean isLastFile) {
 final ShuffleInMemorySorter.ShuffleSorterIterator sortedRecords =
   inMemSorter.getSortedIterator();
 
+// If there are no sorted records, so we don't need to create an empty 
spill file.
+if (!sortedRecords.hasNext()) {
+  return;
+}
--- End diff --

If you're going to short-circuit, why not do it at the start of the 
function and save the rest of the work done above?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-07 Thread wangjiaochun
Github user wangjiaochun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239989805
  
--- Diff: 
core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java ---
@@ -161,6 +161,10 @@ private void writeSortedFile(boolean isLastFile) {
 final ShuffleInMemorySorter.ShuffleSorterIterator sortedRecords =
   inMemSorter.getSortedIterator();
 
+// If there are no sorted records, so we don't need to create an empty 
spill file.
+if (!sortedRecords.hasNext()) {
+  return;
+}
--- End diff --

I think it's better not to do that.Because change the original code style 
and it don't makes an appreciable difference in readability.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-07 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239872296
  
--- Diff: 
core/src/test/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriterSuite.java 
---
@@ -235,6 +235,7 @@ public void writeEmptyIterator() throws Exception {
 final Option mapStatus = writer.stop(true);
 assertTrue(mapStatus.isDefined());
 assertTrue(mergedOutputFile.exists());
+assertEquals(0, spillFilesCreated.size());
--- End diff --

BTW, since we are touching `ShuffleExternalSorter.java`, it seems that we 
need to add a test case to `ShuffleExternalSorterSuite`, too.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-07 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239860885
  
--- Diff: 
core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java ---
@@ -161,6 +161,10 @@ private void writeSortedFile(boolean isLastFile) {
 final ShuffleInMemorySorter.ShuffleSorterIterator sortedRecords =
   inMemSorter.getSortedIterator();
 
+// If there are no sorted records, so we don't need to create an empty 
spill file.
+if (!sortedRecords.hasNext()) {
+  return;
+}
--- End diff --

@wangjiaochun . Shall we move line 160 ~ 167 to the beginning of this 
function (line 147)?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-07 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239859088
  
--- Diff: 
core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java ---
@@ -161,6 +161,10 @@ private void writeSortedFile(boolean isLastFile) {
 final ShuffleInMemorySorter.ShuffleSorterIterator sortedRecords =
   inMemSorter.getSortedIterator();
 
+// If there are no sorted records, so we don't need to create an empty 
spill file.
+if (!sortedRecords.hasNext()) {
+  return;
+}
--- End diff --

@wangjiaochun . Can we move this line 160 ~ 167 to line 147? If then, we 
don't need to do `new SuffleWriteMetrics()` at all.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-06 Thread wangjiaochun
Github user wangjiaochun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239711919
  
--- Diff: 
core/src/test/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriterSuite.java 
---
@@ -562,4 +562,18 @@ public void testPeakMemoryUsed() throws Exception {
 }
   }
 
+  @Test
+  public void writeEmptyIteratorNotCreateEmptySpillFile() throws Exception 
{
+final UnsafeShuffleWriter writer = createWriter(true);
+writer.write(Iterators.emptyIterator());
+final Option mapStatus = writer.stop(true);
+assertTrue(mapStatus.isDefined());
+assertTrue(mergedOutputFile.exists());
+assertEquals(0, spillFilesCreated.size());
--- End diff --

That's it.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239707130
  
--- Diff: 
core/src/test/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriterSuite.java 
---
@@ -562,4 +562,18 @@ public void testPeakMemoryUsed() throws Exception {
 }
   }
 
+  @Test
+  public void writeEmptyIteratorNotCreateEmptySpillFile() throws Exception 
{
+final UnsafeShuffleWriter writer = createWriter(true);
+writer.write(Iterators.emptyIterator());
+final Option mapStatus = writer.stop(true);
+assertTrue(mapStatus.isDefined());
+assertTrue(mergedOutputFile.exists());
+assertEquals(0, spillFilesCreated.size());
--- End diff --

Ya. That's the same what I observed. There is no need to keep both test 
cases because there is no code path like that.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-06 Thread wangjiaochun
Github user wangjiaochun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239704796
  
--- Diff: 
core/src/test/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriterSuite.java 
---
@@ -562,4 +562,18 @@ public void testPeakMemoryUsed() throws Exception {
 }
   }
 
+  @Test
+  public void writeEmptyIteratorNotCreateEmptySpillFile() throws Exception 
{
+final UnsafeShuffleWriter writer = createWriter(true);
+writer.write(Iterators.emptyIterator());
+final Option mapStatus = writer.stop(true);
+assertTrue(mapStatus.isDefined());
+assertTrue(mergedOutputFile.exists());
+assertEquals(0, spillFilesCreated.size());
--- End diff --

I mean that before add code "if (sortedRecords.hasNext()) { return }" it 
will fail. now add assertEquals(0, spillFilesCreated.size()) to 
writeEmptyIterator seems good. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239702595
  
--- Diff: 
core/src/test/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriterSuite.java 
---
@@ -562,4 +562,18 @@ public void testPeakMemoryUsed() throws Exception {
 }
   }
 
+  @Test
+  public void writeEmptyIteratorNotCreateEmptySpillFile() throws Exception 
{
+final UnsafeShuffleWriter writer = createWriter(true);
+writer.write(Iterators.emptyIterator());
+final Option mapStatus = writer.stop(true);
+assertTrue(mapStatus.isDefined());
+assertTrue(mergedOutputFile.exists());
+assertEquals(0, spillFilesCreated.size());
--- End diff --

Do you mean that `writeEmptyIterator` will fail if we add this line 
`assertEquals(0, spillFilesCreated.size());` there?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-06 Thread wangjiaochun
Github user wangjiaochun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239700999
  
--- Diff: 
core/src/test/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriterSuite.java 
---
@@ -562,4 +562,18 @@ public void testPeakMemoryUsed() throws Exception {
 }
   }
 
+  @Test
+  public void writeEmptyIteratorNotCreateEmptySpillFile() throws Exception 
{
+final UnsafeShuffleWriter writer = createWriter(true);
+writer.write(Iterators.emptyIterator());
+final Option mapStatus = writer.stop(true);
+assertTrue(mapStatus.isDefined());
+assertTrue(mergedOutputFile.exists());
+assertEquals(0, spillFilesCreated.size());
--- End diff --

Test writeEmptyIterator() has create spill file although write  empty 
iterator,but the writeEmptyIteratorNotCreateEmptySpillFile test not create 
spill file while there has not records in memroy


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239697971
  
--- Diff: 
core/src/test/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriterSuite.java 
---
@@ -562,4 +562,18 @@ public void testPeakMemoryUsed() throws Exception {
 }
   }
 
+  @Test
+  public void writeEmptyIteratorNotCreateEmptySpillFile() throws Exception 
{
+final UnsafeShuffleWriter writer = createWriter(true);
+writer.write(Iterators.emptyIterator());
+final Option mapStatus = writer.stop(true);
+assertTrue(mapStatus.isDefined());
+assertTrue(mergedOutputFile.exists());
+assertEquals(0, spillFilesCreated.size());
--- End diff --

It seems that we can move this line to `writeEmptyIterator()` instead of 
making a new test case.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23225: [SPARK-26287][CORE]Don't need to create an empty ...

2018-12-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23225#discussion_r239697860
  
--- Diff: 
core/src/test/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriterSuite.java 
---
@@ -562,4 +562,18 @@ public void testPeakMemoryUsed() throws Exception {
 }
   }
 
+  @Test
+  public void writeEmptyIteratorNotCreateEmptySpillFile() throws Exception 
{
--- End diff --

This looks like mostly a duplication of `writeEmptyIterator`. Logically, do 
we need to keep both `writeEmptyIterator` and 
`writeEmptyIteratorNotCreateEmptySpillFile` seperately?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org