Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r180933380
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala
---
@@ -53,32 +53,24 @@ class ContinuousSuiteBase extends
GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/21048
[SPARK-23966][SS] Refactoring all checkpoint file writing logic in a common
CheckpointFileManager interface
## What changes were proposed in this pull request?
Checkpoint files (offset log
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20983#discussion_r180553748
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/EpochCoordinatorSuite.scala
---
@@ -0,0 +1,225 @@
+/*
+ * Licensed
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179596647
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousMemoryStream.scala
---
@@ -0,0 +1,212 @@
+/*
--- End
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179603916
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousMemoryStream.scala
---
@@ -0,0 +1,212
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179594450
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala
---
@@ -43,8 +45,39 @@ object MemoryStream {
protected val
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179603105
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousMemoryStream.scala
---
@@ -0,0 +1,212
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179604298
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousMemoryStream.scala
---
@@ -0,0 +1,212
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179602477
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousMemoryStream.scala
---
@@ -0,0 +1,212
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179601495
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousMemoryStream.scala
---
@@ -0,0 +1,212 @@
+/*
--- End
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179617487
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousMemoryStream.scala
---
@@ -0,0 +1,212
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179617590
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousMemoryStream.scala
---
@@ -0,0 +1,212
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179599245
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala
---
@@ -53,32 +53,24 @@ class ContinuousSuiteBase extends
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179617149
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousMemoryStream.scala
---
@@ -0,0 +1,212
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179605302
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousMemoryStream.scala
---
@@ -0,0 +1,212
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179618466
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousMemoryStream.scala
---
@@ -0,0 +1,212
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20828#discussion_r179606525
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousMemoryStream.scala
---
@@ -0,0 +1,212
Repository: spark
Updated Branches:
refs/heads/master 7cf9fab33 -> 66a3a5a2d
[SPARK-23099][SS] Migrate foreach sink to DataSourceV2
## What changes were proposed in this pull request?
Migrate foreach sink to DataSourceV2.
Since the previous attempt at this PR #20552, we've changed and
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20951
LGTM.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20951#discussion_r178648761
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/ForeachSinkSuite.scala
---
@@ -141,7 +141,7 @@ class ForeachSinkSuite extends
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20951#discussion_r178648616
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/ForeachSinkSuite.scala
---
@@ -131,7 +131,7 @@ class ForeachSinkSuite extends
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20951#discussion_r178648434
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/ForeachSinkSuite.scala
---
@@ -131,7 +131,7 @@ class ForeachSinkSuite extends
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20958
We have made it clear that sockets is ONLY for testing and will not recover
data from checkpoints. So I see no problem that it throws errors when
attempting to recover. May we can improve the error
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20958#discussion_r178646725
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala
---
@@ -238,6 +238,10 @@ final class DataStreamWriter[T] private[sql
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20951#discussion_r178645775
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ForeachSink.scala
---
@@ -17,52 +17,81 @@
package
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20951#discussion_r178645279
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ForeachSink.scala
---
@@ -17,52 +17,81 @@
package
dant.
Author: Tathagata Das <tathagata.das1...@gmail.com>
Closes #20941 from tdas/SPARK-23827.
(cherry picked from commit 15298b99ac8944e781328423289586176cf824d7)
Signed-off-by: Tathagata Das <tathagata.das1...@gmail.com>
Project: http://git-wip-us.apache.org/repos/asf/spark/re
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20941
Started more tests to test for flakiness.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20941
@brkyvz @zsxwing can one of you take a look?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/20941
Spark 23827
## What changes were proposed in this pull request?
Currently, the requiredChildDistribution does not specify the partitions.
This can cause the weird corner cases where
Repository: spark
Updated Branches:
refs/heads/master 35997b59f -> c68ec4e6a
[SPARK-23096][SS] Migrate rate source to V2
## What changes were proposed in this pull request?
This PR migrate micro batch rate source to V2 API and rewrite UTs to suite V2
test.
## How was this patch tested?
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20906#discussion_r177244722
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousWriteExec.scala
---
@@ -0,0 +1,122
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20906#discussion_r177245022
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousWriteExec.scala
---
@@ -0,0 +1,122
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20906#discussion_r177275489
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousWriteExec.scala
---
@@ -0,0 +1,122
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20906#discussion_r177243246
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousWriteExec.scala
---
@@ -0,0 +1,122
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20906
jenkins, retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20859#discussion_r175580404
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala
---
@@ -160,6 +160,19 @@ object
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20859#discussion_r175580451
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala
---
@@ -140,6 +140,21 @@ class
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20859#discussion_r175579879
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala
---
@@ -160,6 +160,19 @@ object
Github user tdas closed the pull request at:
https://github.com/apache/spark/pull/20848
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20767
Just to be clear, I am not saying that we *have to* move to this pool
stuff. I am just saying that if we want to make this more robust, then we
should try to use existing tools (after careful
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20767
@tedyu It was indeed hard to find :) But apache commons pool does expose
metrics on idle/active counts. See
https://commons.apache.org/proper/commons-pool/apidocs/org/apache/commons/pool2/impl
GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/20848
[SPARK-23623][SS] Avoid concurrent use of cached consumers in
CachedKafkaConsumer (branch-2.3)
This is a backport of #20767 to branch 2.3
## What changes were proposed in this pull request
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20767
@tedyu @zsxwing My thoughts on this is that we should consider migrating to
something like Apache Common Pool (assuming it does not require additional
maven libraries), which might be less maintenance
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20767
The idea is good. But how do you propose exposing that information?
Periodic print out in the log?
From a different angle, I would rather not do feature creep in this PR
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20767
@tedyu @zsxwing thank you very much for catching the bugs. I have
simplified the logic quite a bit. Note that I removed the invariant that I had
introduced earlier. Additionally, I locally ran
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20767#discussion_r174973494
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala
---
@@ -342,80 +415,103 @@ private[kafka010] object
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20767#discussion_r174968594
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala
---
@@ -342,80 +415,103 @@ private[kafka010] object
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20767#discussion_r174968294
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala
---
@@ -342,80 +415,103 @@ private[kafka010] object
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20767
@koeninger good question Cody! I think we should fix this limitation
eventually. The only reason I am not doing that in this PR is to keep the
changes minimum for backporting to 2.3.x. Eventually, we
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20767#discussion_r173602790
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala
---
@@ -342,80 +415,103 @@ private[kafka010] object
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20767#discussion_r173602735
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala
---
@@ -342,80 +415,103 @@ private[kafka010] object
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20767#discussion_r173602480
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala
---
@@ -342,80 +415,103 @@ private[kafka010] object
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20767#discussion_r173602442
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala
---
@@ -27,30 +27,73 @@ import
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20767#discussion_r173351789
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala
---
@@ -342,80 +401,65 @@ private[kafka010] object
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20767#discussion_r173341089
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala
---
@@ -342,80 +401,65 @@ private[kafka010] object
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20767#discussion_r173340037
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala
---
@@ -342,80 +401,65 @@ private[kafka010] object
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20767#discussion_r173338056
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala
---
@@ -342,80 +401,65 @@ private[kafka010] object
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20767
jenkins retest this
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h
Github user tdas closed the pull request at:
https://github.com/apache/spark/pull/20765
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Relation [value#66]
+- Project [(value#12 % 5) AS key#9, value#12]
+- Project [value#66 AS value#12]// solution: project with aliases
+- LocalRelation [value#66]
```
## How was this patch tested?
New unit test
Author: Tathagata Das <tathagata.das1...@gmail.com>
Closes #207
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20767
@zsxwing @brkyvz PTAL.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews
GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/20767
Fixed
## What changes were proposed in this pull request?
CacheKafkaConsumer in the project `kafka-0-10-sql` is designed to maintain
a pool of KafkaConsumers that can be reused. However
GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/20765
[SPARK-23406][SS] Enable stream-stream self-joins for branch-2.3
This is a backport of #20598.
## What changes were proposed in this pull request?
Solved two bugs to enable stream
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20688#discussion_r172730994
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala
---
@@ -24,8 +24,8 @@ import
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20688#discussion_r172730333
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/sources/RateSourceSuite.scala
---
@@ -0,0 +1,344 @@
+/*
+ * Licensed
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20688#discussion_r172729894
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateSourceProvider.scala
---
@@ -0,0 +1,291 @@
+/*
+ * Licensed
Github user tdas closed the pull request at:
https://github.com/apache/spark/pull/20755
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/20755
[SPARK-23406][SS] Enable stream-stream self-joins for branch-2.3
## What changes were proposed in this pull request?
This is limited but safe-to-backport version of self-join-fix made in
#20598
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20710
@rdblue @jose-torres arrgh... i didnt notice that you guys were still
commenting before i merged it.
feel free to continue discussion and if any change is needed we will deal
Repository: spark
Updated Branches:
refs/heads/master ba622f45c -> b0f422c38
[SPARK-23559][SS] Add epoch ID to DataWriterFactory.
## What changes were proposed in this pull request?
Add an epoch ID argument to DataWriterFactory for use in streaming. As a side
effect of passing in this
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20710#discussion_r172081660
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataWriter.java
---
@@ -31,13 +31,17 @@
* the {@link #write(Object)}, {@link
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20698
Thank you. Merging to master only as this is a new feature touching
production code paths.
---
-
To unsubscribe, e-mail: reviews
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20710#discussion_r171993983
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/streaming/StreamWriter.java
---
@@ -39,21 +36,21 @@
* If this method fails
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20710#discussion_r171993716
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataWriterFactory.java
---
@@ -48,6 +48,9 @@
* same task
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20710#discussion_r171993622
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataWriterFactory.java
---
@@ -48,6 +48,9 @@
* same task
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20710#discussion_r171993559
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataWriter.java
---
@@ -36,8 +36,9 @@
* {@link DataSourceWriter#commit
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20710#discussion_r171993519
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataWriter.java
---
@@ -36,8 +36,9 @@
* {@link DataSourceWriter#commit
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20710#discussion_r171993467
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataWriter.java
---
@@ -36,8 +36,9 @@
* {@link DataSourceWriter#commit
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20710#discussion_r171993276
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala
---
@@ -172,17 +173,19 @@ object
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20710#discussion_r171993101
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala
---
@@ -198,7 +201,7 @@ object DataWritingSparkTask
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20710#discussion_r171993066
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataWriterFactory.java
---
@@ -48,6 +48,9 @@
* same task
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20698#discussion_r171989658
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetRangeCalculator.scala
---
@@ -0,0 +1,106
Repository: spark
Updated Branches:
refs/heads/master 3a4d15e5d -> 707e6506d
[SPARK-23097][SQL][SS] Migrate text socket source to V2
## What changes were proposed in this pull request?
This PR moves structured streaming text socket source to V2.
Questions: do we need to remove old "socket"
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20382
LGTM. Merging to master.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20382#discussion_r171954250
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/sources/TextSocketStreamSuite.scala
---
@@ -0,0 +1,300 @@
+/*
+ * Licensed
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20698#discussion_r171750758
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetRangeCalculator.scala
---
@@ -0,0 +1,105
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20698#discussion_r171750580
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetRangeCalculator.scala
---
@@ -0,0 +1,105
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20698#discussion_r171750437
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetRangeCalculator.scala
---
@@ -0,0 +1,105
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20698#discussion_r171741765
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetRangeCalculator.scala
---
@@ -0,0 +1,105
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20698#discussion_r171732015
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchReader.scala
---
@@ -199,10 +179,10 @@ private[kafka010] class
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20703
I completely agree with @zsxwing, let understand what the issue is rather
than covering it up with a workaround. We should not run into such issue at
all
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20382
relevant test failed. please make sure that there is no flakiness in the
tests.
---
-
To unsubscribe, e-mail: reviews-unsubscr
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20382#discussion_r171510614
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/sources/TextSocketStreamSuite.scala
---
@@ -0,0 +1,300 @@
+/*
+ * Licensed
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20382
@jerryshao please address the above comment, then we are good to merge!
---
-
To unsubscribe, e-mail: reviews-unsubscr
Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20698
@zsxwing @jose-torres
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20698#discussion_r171441133
--- Diff:
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetRangeCalculator.scala
---
@@ -0,0 +1,105
GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/20698
[SPARK-23541][SS] Allow Kafka source to read data with greater parallelism
than the number of topic-partitions
## What changes were proposed in this pull request?
Currently, when the Kafka
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20382#discussion_r171226477
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/socket.scala
---
@@ -61,13 +68,13 @@ class TextSocketSource(host: String
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/20382#discussion_r171225853
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/sources/TextSocketStreamSuite.scala
---
@@ -0,0 +1,300 @@
+/*
+ * Licensed
501 - 600 of 7484 matches
Mail list logo