aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-745147943
Closing this one in favor of smaller PRs.
This is an automated message from the Apache Git Service.
aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-742477298
The first PR with interfaces only is out.
aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-741711690
It is a bit hard to keep this large PR up-to-date since it touches many
places. As it seems like a reasonable approach, I am going to split the work
and submit smaller PRs.
aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-739978109
Seems like there is consensus about evolving this API alongside the
interfaces in the `read` package. I am not sure whether we need to move the new
interfaces to `write`, though.
aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-739974758
I've updated this PR and I am ready to split it into smaller mergeable
parts. It would be great if everyone could take another look to make sure we
are on the same page.
aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-736320032
I know deprecating and then removing is usually a better idea, and I will be
okay evolving the read and write paths separately. The only concern I have is that
while we use these
aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-736316990
@dbtsai, I will rebase this one once PR #30558 is in.
aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-735220811
We should agree on the future of the existing `Distribution` and
`ClusteredDistribution` interfaces used in `Partitioning`.
Here is a quote from the design doc:
aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-735044798
Also cc @dbtsai @dongjoon-hyun; it would be great to get your input on this
one after the holidays.
aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-735044653
I also have a prototype for this logic in micro-batch streaming. I added
dedicated plans, which I think we have been missing for a while. Right now,
`MicroBatchExecution`
aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-733842184
Tests failed as I overlooked recent changes around caching. Should be fixed
now.
aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-733427234
I'd like to emphasize that all changes are in one place to simplify the
review. I'll split the work into smaller PRs later.
aokolnychyi commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-733425796
Okay, I went through the comments and I think they are all resolved except for
the points related to tests. This PR is no longer WIP and is ready for a detailed
review.