[GitHub] spark pull request #20435: [SPARK-23268][SQL]Reorganize packages in data sou...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20435 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20435: [SPARK-23268][SQL]Reorganize packages in data sou...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20435#discussion_r165048049 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala --- @@ -403,7 +403,7 @@ class MicroBatchExecution( val current = committedOffsets.get(reader).map(off => reader.deserializeOffset(off.json)) reader.setOffsetRange( toJava(current), -Optional.of(available.asInstanceOf[OffsetV2])) +Optional.of(available.asInstanceOf[streaming.Offset])) --- End diff -- shall we still use `OffsetV2`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20435: [SPARK-23268][SQL]Reorganize packages in data sou...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20435#discussion_r165047744 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceOffset.scala --- @@ -20,14 +20,15 @@ package org.apache.spark.sql.kafka010 import org.apache.kafka.common.TopicPartition import org.apache.spark.sql.execution.streaming.{Offset, SerializedOffset} -import org.apache.spark.sql.sources.v2.streaming.reader.{Offset => OffsetV2, PartitionOffset} +import org.apache.spark.sql.sources.v2.reader.streaming.{Offset => OffsetV2, PartitionOffset} /** * An [[Offset]] for the [[KafkaSource]]. This one tracks all partitions of subscribed topics and * their offsets. */ private[kafka010] -case class KafkaSourceOffset(partitionToOffsets: Map[TopicPartition, Long]) extends OffsetV2 { +case class KafkaSourceOffset(partitionToOffsets: Map[TopicPartition, Long]) + extends OffsetV2 { --- End diff -- unnecessary change? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20435: [SPARK-23268][SQL]Reorganize packages in data sou...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20435#discussion_r164953225 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceOffset.scala --- @@ -20,14 +20,16 @@ package org.apache.spark.sql.kafka010 import org.apache.kafka.common.TopicPartition import org.apache.spark.sql.execution.streaming.{Offset, SerializedOffset} -import org.apache.spark.sql.sources.v2.streaming.reader.{Offset => OffsetV2, PartitionOffset} +import org.apache.spark.sql.sources.v2.reader.streaming +import org.apache.spark.sql.sources.v2.reader.streaming.PartitionOffset /** * An [[Offset]] for the [[KafkaSource]]. This one tracks all partitions of subscribed topics and --- End diff -- Should this `Offset` be `streaming.Offset`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20435: [SPARK-23268][SQL]Reorganize packages in data sou...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20435#discussion_r164953380 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/memoryV2.scala --- @@ -30,9 +30,8 @@ import org.apache.spark.sql.catalyst.plans.logical.{LeafNode, Statistics} import org.apache.spark.sql.catalyst.streaming.InternalOutputModes.{Append, Complete, Update} import org.apache.spark.sql.execution.streaming.Sink import org.apache.spark.sql.sources.v2.{DataSourceOptions, DataSourceV2} -import org.apache.spark.sql.sources.v2.streaming.StreamWriteSupport -import org.apache.spark.sql.sources.v2.streaming.writer.StreamWriter -import org.apache.spark.sql.sources.v2.writer._ +import org.apache.spark.sql.sources.v2.writer.{StreamWriteSupport, _} --- End diff -- `import org.apache.spark.sql.sources.v2.writer._`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20435: [SPARK-23268][SQL]Reorganize packages in data sou...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20435#discussion_r164943009 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceOffset.scala --- @@ -20,14 +20,16 @@ package org.apache.spark.sql.kafka010 import org.apache.kafka.common.TopicPartition import org.apache.spark.sql.execution.streaming.{Offset, SerializedOffset} -import org.apache.spark.sql.sources.v2.streaming.reader.{Offset => OffsetV2, PartitionOffset} +import org.apache.spark.sql.sources.v2.reader.streaming +import org.apache.spark.sql.sources.v2.reader.streaming.PartitionOffset --- End diff -- can we keep it same as before? ``` import org.apache.spark.sql.sources.v2.reader.streaming.{Offset => OffsetV2, PartitionOffset} ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20435: [SPARK-23268][SQL]Reorganize packages in data sou...
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/20435 [SPARK-23268][SQL]Reorganize packages in data source V2 ## What changes were proposed in this pull request? 1. create a new package for partitioning/distribution related classes. As Spark will add new concrete implementations of `Distribution` in new releases, it is good to have a new package for partitioning/distribution related classes. 2. move streaming related class to package `org.apache.spark.sql.sources.v2.reader/writer.streaming`, instead of `org.apache.spark.sql.sources.v2.streaming.reader/writer`. So that the there won't be package reader/writer inside package streaming, which is quite confusing. ## How was this patch tested? Unit test. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gengliangwang/spark new_pkg Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20435.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20435 commit da29b863970b8ca5039d547ae5240e5c34f9f11a Author: Wang Gengliang Date: 2018-01-30T08:31:10Z create a new package for partitioning/distribution related classes commit 3dc56226ed17637c808df9d8d03f60fcbbae9e55 Author: Wang Gengliang Date: 2018-01-30T09:16:01Z re-org streaming --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org