subject:"\[GitHub\] spark pull request #18654\: \[SPARK\-21435\]\[SQL\] Empty files should be skipped ..."

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-18 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18654


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-18 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r128134922
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
 ---
@@ -236,7 +236,10 @@ object FileFormatWriter extends Logging {
 committer.setupTask(taskAttemptContext)
 
 val writeTask =
-  if (description.partitionColumns.isEmpty && 
description.bucketIdExpression.isEmpty) {
+  if (sparkPartitionId != 0 && !iterator.hasNext) {
--- End diff --

cc @yhuai too who reviewed my similar PR before.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127888746
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempPath { dir =>
--- End diff --

More clear :) No need to create source files in real.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127876852
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempPath { dir =>
--- End diff --

Could we maybe just do as below?

```scala
withTempPath { path =>
  spark.range(100).repartition(10).where("id = 50").write.parquet(path)
  val partFiles = path.listFiles()
.filter(f => f.isFile && !f.getName.startsWith(".") && 
!f.getName.startsWith("_"))
  assert(partFiles.length === 2)
}
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127872988
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
+override def accept(dir: File, name: String): Boolean = {
+  !name.startsWith(".") && !name.startsWith("_")
+}
+  })
+  assert(allFiles.length == 10)
+
+  withTempDir { dst_dir =>
+dst_dir.delete()
+df.where("id = 50").write.parquet(dst_dir.toString)
+val allFiles = dst_dir.listFiles(new FilenameFilter {
+  override def accept(dir: File, name: String): Boolean = {
+!name.startsWith(".") && !name.startsWith("_")
+  }
+})
+// First partition file and the data file
--- End diff --

Can't agree more,  firstly I try to implement like this but the 
`FileFormatWriter.write` can only see the iterator of each task self.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127869754
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
+override def accept(dir: File, name: String): Boolean = {
+  !name.startsWith(".") && !name.startsWith("_")
+}
+  })
+  assert(allFiles.length == 10)
+
+  withTempDir { dst_dir =>
+dst_dir.delete()
+df.where("id = 50").write.parquet(dst_dir.toString)
--- End diff --

I mean..  for example, if we happen to have a single partition in the `df` 
in any event, I guess this test will become invalid ...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127868378
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
+override def accept(dir: File, name: String): Boolean = {
+  !name.startsWith(".") && !name.startsWith("_")
+}
+  })
+  assert(allFiles.length == 10)
+
+  withTempDir { dst_dir =>
+dst_dir.delete()
+df.where("id = 50").write.parquet(dst_dir.toString)
--- End diff --

I was thinking just in order to make sure the (previous) number of files 
written out.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127867549
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
+override def accept(dir: File, name: String): Boolean = {
+  !name.startsWith(".") && !name.startsWith("_")
+}
+  })
+  assert(allFiles.length == 10)
+
+  withTempDir { dst_dir =>
+dst_dir.delete()
+df.where("id = 50").write.parquet(dst_dir.toString)
+val allFiles = dst_dir.listFiles(new FilenameFilter {
+  override def accept(dir: File, name: String): Boolean = {
+!name.startsWith(".") && !name.startsWith("_")
+  }
+})
+// First partition file and the data file
--- End diff --

Ideally we only need the first partition file if all other partitions are 
empty, but this is hard to do right now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127867486
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
+override def accept(dir: File, name: String): Boolean = {
+  !name.startsWith(".") && !name.startsWith("_")
+}
+  })
+  assert(allFiles.length == 10)
+
+  withTempDir { dst_dir =>
+dst_dir.delete()
+df.where("id = 50").write.parquet(dst_dir.toString)
--- End diff --

why we need repartition?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127867380
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
--- End diff --

+1 for the shorter one


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127867341
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
--- End diff --

+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127867290
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
 ---
@@ -236,7 +236,10 @@ object FileFormatWriter extends Logging {
 committer.setupTask(taskAttemptContext)
 
 val writeTask =
-  if (description.partitionColumns.isEmpty && 
description.bucketIdExpression.isEmpty) {
+  if (sparkPartitionId != 0 && !iterator.hasNext) {
--- End diff --

cc @hvanhovell 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127867254
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
 ---
@@ -236,7 +236,10 @@ object FileFormatWriter extends Logging {
 committer.setupTask(taskAttemptContext)
 
 val writeTask =
-  if (description.partitionColumns.isEmpty && 
description.bucketIdExpression.isEmpty) {
+  if (sparkPartitionId != 0 && !iterator.hasNext) {
--- End diff --

This is a little hacky but is the simplest fix I think.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127865091
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
+override def accept(dir: File, name: String): Boolean = {
+  !name.startsWith(".") && !name.startsWith("_")
+}
+  })
+  assert(allFiles.length == 10)
--- End diff --

OK, I'll remove this assert and leave a note.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127860788
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
+override def accept(dir: File, name: String): Boolean = {
+  !name.startsWith(".") && !name.startsWith("_")
+}
+  })
+  assert(allFiles.length == 10)
--- End diff --

but I guess this one (the latter) does not test this change? If this test 
passes regardless of this PR change, I would rather remove this one.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127860327
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
--- End diff --

Yea. If both are okay, let's go for the shorter one.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127856720
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
+override def accept(dir: File, name: String): Boolean = {
+  !name.startsWith(".") && !name.startsWith("_")
+}
+  })
+  assert(allFiles.length == 10)
--- End diff --

Just make sure the source dir have many files, and the output dir only have 
2 files.
If this make people confuse, just leave a notes and delete the assert?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127856498
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
--- End diff --

Both is ok I think, just copy this from `HadoopFsRelationSuite`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127724518
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
+override def accept(dir: File, name: String): Boolean = {
+  !name.startsWith(".") && !name.startsWith("_")
+}
+  })
+  assert(allFiles.length == 10)
--- End diff --

Could I ask what this test targets? I think I am lost around here ...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127722941
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
--- End diff --

`withTempPath` can be used instead I believe.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127724839
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
+override def accept(dir: File, name: String): Boolean = {
+  !name.startsWith(".") && !name.startsWith("_")
+}
+  })
+  assert(allFiles.length == 10)
+
+  withTempDir { dst_dir =>
+dst_dir.delete()
+df.where("id = 50").write.parquet(dst_dir.toString)
--- End diff --

I would explicitly repartition here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127723973
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
+
+  test("empty file should be skipped while write to file") {
+withTempDir { dir =>
+  dir.delete()
+  spark.range(1).repartition(10).write.parquet(dir.toString)
+  val df = spark.read.parquet(dir.toString)
+  val allFiles = dir.listFiles(new FilenameFilter {
--- End diff --

Can we just do this simpler? for example,

```
.listFiles().filter { f =>
  f.isFile && !f.getName.startsWith(".") && !f.getName.startsWith("_")
}
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18654#discussion_r127722734
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
 ---
@@ -236,7 +236,10 @@ object FileFormatWriter extends Logging {
 committer.setupTask(taskAttemptContext)
 
 val writeTask =
-  if (description.partitionColumns.isEmpty && 
description.bucketIdExpression.isEmpty) {
+  if (sparkPartitionId != 0 && !iterator.hasNext) {
--- End diff --

I guess this might be okay in that sense we are guaranteed partitions to be 
started from 0 up to my knowledge. Could you take a look and see if it makes 
sense to you cc @cloud-fan if you don't mind? I am not confident enough to 
proceed reviewing and leave a sign-off.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

2017-07-17 Thread xuanyuanking

GitHub user xuanyuanking opened a pull request:

https://github.com/apache/spark/pull/18654

[SPARK-21435][SQL] Empty files should be skipped while write to file

## What changes were proposed in this pull request?

Add EmptyDirectoryWriteTask for empty task while writing files.

## How was this patch tested?

Add new test in `FileFormatWriterSuite `

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xuanyuanking/spark SPARK-21435

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18654.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18654


commit ff92ba3ae3abaae0d8ce8bb8c08465f43c14906f
Author: xuanyuanking 
Date:   2017-07-17T07:38:56Z

empty files should be skipped while write to file

commit e08fb1939d5267a38f3318af7506b6ed8628ebbf
Author: xuanyuanking 
Date:   2017-07-17T10:20:34Z

Handle the empty result of parquet

commit 6153001bc42deee197030ad91fbb4f72bd1aa5d3
Author: xuanyuanking 
Date:   2017-07-17T12:09:08Z

Modify UT and add more notes




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

[GitHub] spark pull request #18654: [SPARK-21435][SQL] Empty files should be skipped ...

24 matches

Site Navigation

Mail list logo

Footer information