Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-04 Thread Terry Kim
+1 (non-binding). Thanks John!

Terry

On Fri, Feb 4, 2022 at 4:13 PM Yufei Gu  wrote:

> +1 (non-binding)
> Best,
>
> Yufei
>
> `This is not a contribution`
>
>
> On Fri, Feb 4, 2022 at 11:54 AM huaxin gao  wrote:
>
>> +1 (non-binding)
>>
>> On Fri, Feb 4, 2022 at 11:40 AM L. C. Hsieh  wrote:
>>
>>> +1
>>>
>>> On Thu, Feb 3, 2022 at 7:25 PM Chao Sun  wrote:
>>> >
>>> > +1 (non-binding). Looking forward to this feature!
>>> >
>>> > On Thu, Feb 3, 2022 at 2:32 PM Ryan Blue  wrote:
>>> >>
>>> >> +1 for the SPIP. I think it's well designed and it has worked quite
>>> well at Netflix for a long time.
>>> >>
>>> >> On Thu, Feb 3, 2022 at 2:04 PM John Zhuge  wrote:
>>> >>>
>>> >>> Hi Spark community,
>>> >>>
>>> >>> I’d like to restart the vote for the ViewCatalog design proposal
>>> (SPIP).
>>> >>>
>>> >>> The proposal is to add a ViewCatalog interface that can be used to
>>> load, create, alter, and drop views in DataSourceV2.
>>> >>>
>>> >>> Please vote on the SPIP until Feb. 9th (Wednesday).
>>> >>>
>>> >>> [ ] +1: Accept the proposal as an official SPIP
>>> >>> [ ] +0
>>> >>> [ ] -1: I don’t think this is a good idea because …
>>> >>>
>>> >>> Thanks!
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Ryan Blue
>>> >> Tabular
>>>
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>
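
As a rough illustration of the contract the proposal describes (load, create, alter, and drop views), here is a toy in-memory sketch in Python. The real ViewCatalog is a JVM interface whose exact signatures are given in the SPIP document; the method names below are assumptions, not the proposed API.

```python
# Toy illustration only: the actual ViewCatalog is a DataSourceV2 JVM
# interface defined in the SPIP; these names are illustrative.
class InMemoryViewCatalog:
    """A toy catalog mapping view name -> SQL text."""

    def __init__(self):
        self._views = {}

    def load_view(self, name):
        # Returns the view's SQL text, or None if it does not exist.
        return self._views.get(name)

    def create_view(self, name, sql):
        self._views[name] = sql
        return sql

    def alter_view(self, name, sql):
        # Replaces the view definition.
        self._views[name] = sql
        return sql

    def drop_view(self, name):
        # Returns True if a view was actually dropped.
        return self._views.pop(name, None) is not None
```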


Re: [VOTE] Release Spark 3.2.0 (RC7)

2021-10-11 Thread Terry Kim
+1 (non-binding)

Thanks,
Terry

On Wed, Oct 6, 2021 at 9:49 AM Gengliang Wang  wrote:

> Please vote on releasing the following candidate as
> Apache Spark version 3.2.0.
>
> The vote is open until 11:59pm Pacific time October 11 and passes if a
> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.2.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v3.2.0-rc7 (commit
> 5d45a415f3a29898d92380380cfd82bfc7f579ea):
> https://github.com/apache/spark/tree/v3.2.0-rc7
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.2.0-rc7-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1394
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.2.0-rc7-docs/
>
> The list of bug fixes going into 3.2.0 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12349407
>
> This release is using the release script of the tag v3.2.0-rc7.
>
>
> FAQ
>
> =
> How can I help test this release?
> =
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install
> the current RC and see if anything important breaks; in Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
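
For sbt projects, pointing at the staging repository mentioned above might look like the following build.sbt fragment (the resolver name is arbitrary; the URL is the orgapachespark-1394 staging repository from this thread):

```scala
// build.sbt sketch: resolve Spark 3.2.0 RC7 artifacts from the staging repo.
resolvers += "Spark 3.2.0 RC7 staging" at
  "https://repository.apache.org/content/repositories/orgapachespark-1394"

// Then depend on the RC version as usual.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.2.0" % "provided"
```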
>
> ===
> What should happen to JIRA tickets still targeting 3.2.0?
> ===
> The current list of open tickets targeted at 3.2.0 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 3.2.0
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>


Re: [VOTE] Release Spark 3.1.1 (RC3)

2021-02-24 Thread Terry Kim
+1 (non-binding)

Tested against .NET for Apache Spark.

Thanks,
Terry

On Wed, Feb 24, 2021 at 8:05 AM Dongjoon Hyun 
wrote:

> +1
>
> Bests,
> Dongjoon
>
> On Wed, Feb 24, 2021 at 5:46 AM Gabor Somogyi 
> wrote:
>
>> +1 (non-binding)
>>
>> Tested my added security-related features; found an issue, but not a
>> blocker.
>>
>> On Wed, 24 Feb 2021, 09:47 Hyukjin Kwon,  wrote:
>>
>>> I remember HiveExternalCatalogVersionsSuite was flaky for a while which
>>> is fixed in
>>> https://github.com/apache/spark/commit/0d5d248bdc4cdc71627162a3d20c42ad19f24ef4
>>> and .. KafkaDelegationTokenSuite is flaky (
>>> https://issues.apache.org/jira/browse/SPARK-31250).
>>>
>>> On Wed, Feb 24, 2021 at 5:19 PM, Mridul Muralidharan wrote:
>>>

 Signatures, digests, etc check out fine.
 Checked out tag and build/tested with -Pyarn -Phadoop-2.7 -Phive
 -Phive-thriftserver -Pmesos -Pkubernetes

 I keep getting test failures with
 * org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite
 * org.apache.spark.sql.kafka010.KafkaDelegationTokenSuite.
 (Note: I remove $HOME/.m2 and $HOME/.ivy2 paths before building.)

 Removing these suites gets the build through, though. Does anyone have
 suggestions on how to fix this? I did not face this with RC1.

 Regards,
 Mridul


 On Mon, Feb 22, 2021 at 12:57 AM Hyukjin Kwon 
 wrote:

> Please vote on releasing the following candidate as Apache Spark
> version 3.1.1.
>
> The vote is open until February 24th 11PM PST and passes if a majority
> +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.1.1
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v3.1.1-rc3 (commit
> 1d550c4e90275ab418b9161925049239227f3dc9):
> https://github.com/apache/spark/tree/v3.1.1-rc3
>
> The release files, including signatures, digests, etc. can be found at:
> 
> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1367
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-docs/
>
> The list of bug fixes going into 3.1.1 can be found at the following
> URL:
> https://s.apache.org/41kf2
>
> This release is using the release script of the tag v3.1.1-rc3.
>
> FAQ
>
> ===
> What happened to 3.1.0?
> ===
>
> There was a technical issue during Apache Spark 3.1.0 preparation, and
> it was discussed and decided to skip 3.1.0.
> Please see
> https://spark.apache.org/news/next-official-release-spark-3.1.1.html for
> more details.
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install
> the current RC via "pip install
> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-bin/pyspark-3.1.1.tar.gz
> "
> and see if anything important breaks.
> In Java/Scala, you can add the staging repository to your project's
> resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out of date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 3.1.1?
> ===
>
> The current list of open tickets targeted at 3.1.1 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 3.1.1
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.

Re: dataFrame.na.fill() fails for column with dot

2021-02-09 Thread Terry Kim
You probably need to update f.name here
<https://github.com/apache/spark/blob/18b30107adb37d3c7a767a20cc02813f0fdb86da/sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala#L413>
as well, but we can discuss further when you create a JIRA/PR.

Thanks,
Terry

On Tue, Feb 9, 2021 at 9:53 AM Terry Kim  wrote:

> Thanks Amandeep. This seems like a valid bug to me, as quoted columns are
> not handled properly for na.fill(). I think the better place to fix this
> is in DataFrameNaFunctions.scala
> <https://github.com/apache/spark/blob/18b30107adb37d3c7a767a20cc02813f0fdb86da/sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala#L422>,
> where "f.name" should be quoted.
>
> Could you create a JIRA
> <https://issues.apache.org/jira/projects/SPARK/issues>?
>
> Thanks,
> Terry
>
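
The fix suggested above (quoting "f.name") comes down to wrapping a column name in backticks so that a dot is read as part of the name rather than as a nested-field separator. A standalone sketch of that idea, not the exact Spark-internal helper:

```python
# Sketch: wrap a column name in backticks when it contains a dot (or a
# backtick), so the parser treats the dot as part of the name instead of
# a nested-field separator. Embedded backticks are escaped by doubling.
def quote_if_needed(part):
    if "." in part or "`" in part:
        return "`" + part.replace("`", "``") + "`"
    return part
```

With this, a column literally named `a.b` round-trips through the parser instead of being read as field `b` of struct `a`.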



Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-26 Thread Terry Kim
Hi,

Please check if the following regression should be included:
https://github.com/apache/spark/pull/31352

Thanks,
Terry

On Tue, Jan 26, 2021 at 7:54 AM Holden Karau  wrote:

> If we're ok waiting for it, I'd like to get
> https://github.com/apache/spark/pull/31298 in as well (it's not a
> regression, but it is a bug fix).
>
> On Tue, Jan 26, 2021 at 6:38 AM Hyukjin Kwon  wrote:
>
>> It looks like a cool one but it's a pretty big one and affects the plans
>> considerably ... maybe it's best to avoid adding it into 3.1.1 in
>> particular during the RC period if this isn't a clear regression that
>> affects many users.
>>
>> On Tue, Jan 26, 2021 at 11:23 PM, Peter Toth wrote:
>>
>>> Hey,
>>>
>>> Sorry for chiming in a bit late, but I would like to suggest my PR (
>>> https://github.com/apache/spark/pull/28885) for review and inclusion
>>> into 3.1.1.
>>>
>>> Currently, invalid reuse reference nodes appear in many queries, causing
>>> performance issues and incorrect explain plans. Now that
>>> https://github.com/apache/spark/pull/31243 got merged these invalid
>>> references can be easily found in many of our golden files on master:
>>> https://github.com/apache/spark/pull/28885#issuecomment-767530441.
>>> But the issue isn't specific to master (3.2); it has actually been there
>>> since 3.0, when Dynamic Partition Pruning was added.
>>> So it is not a regression from 3.0 to 3.1.1, but in some cases (like
>>> TPCDS q23b) it is causing performance regression from 2.4 to 3.x.
>>>
>>> Thanks,
>>> Peter
>>>
>>> On Tue, Jan 26, 2021 at 6:30 AM Hyukjin Kwon 
>>> wrote:
>>>
 Guys, I plan to make an RC as soon as we have no visible issues. I have
 merged a few correctness fixes. Here is the current status:
 - https://github.com/apache/spark/pull/31319 waiting for a review (I
 will do it too soon).
 - https://github.com/apache/spark/pull/31336
 - I know Max is investigating the perf regression, which hopefully
 will be fixed soon.

 Are there any more blockers or correctness issues? Please ping me or
 say it out here.
 I would like to avoid making an RC when there are clearly some issues
 to be fixed.
 If you're investigating something suspicious, that's fine too. It's
 better to make sure we're safe instead of rushing an RC without finishing
 the investigation.

 Thanks all.


 On Fri, Jan 22, 2021 at 6:19 PM, Hyukjin Kwon wrote:

> Sure, thanks guys. I'll start another RC after the fixes. Looks like
> we're almost there.
>
> On Fri, 22 Jan 2021, 17:47 Wenchen Fan,  wrote:
>
>> BTW, there is a correctness bug being fixed at
>> https://github.com/apache/spark/pull/30788 . It's not a regression,
>> but the fix is very simple and it would be better to start the next RC
>> after merging that fix.
>>
>> On Fri, Jan 22, 2021 at 3:54 PM Maxim Gekk 
>> wrote:
>>
>>> Also I am investigating a performance regression in some TPC-DS
>>> queries (q88 for instance) that is caused by a recent commit in 3.1, 
>>> highly
>>> likely in the period from 19th November, 2020 to 18th December, 2020.
>>>
>>> Maxim Gekk
>>>
>>> Software Engineer
>>>
>>> Databricks, Inc.
>>>
>>>
>>> On Fri, Jan 22, 2021 at 10:45 AM Wenchen Fan 
>>> wrote:
>>>
 -1 as I just found a regression in 3.1. A self-join query works
 well in 3.0 but fails in 3.1. It's being fixed at
 https://github.com/apache/spark/pull/31287

 On Fri, Jan 22, 2021 at 4:34 AM Tom Graves
  wrote:

> +1
>
> built from tarball, verified sha and regular CI and tests all pass.
>
> Tom
>
> On Monday, January 18, 2021, 06:06:42 AM CST, Hyukjin Kwon <
> gurwls...@gmail.com> wrote:
>
>
> Please vote on releasing the following candidate as Apache Spark
> version 3.1.1.
>
> The vote is open until January 22nd 4PM PST and passes if a
> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.1.1
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see
> http://spark.apache.org/
>
> The tag to be voted on is v3.1.1-rc1 (commit
> 53fe365edb948d0e05a5ccb62f349cd9fcb4bb5d):
> https://github.com/apache/spark/tree/v3.1.1-rc1
>
> The release files, including signatures, digests, etc. can be
> found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
>
> 

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-20 Thread Terry Kim
+1 (non-binding)

(Also ran .NET for Apache Spark E2E tests, which touch many of DataFrame,
Function APIs)

Thanks,
Terry

On Wed, Jan 20, 2021 at 6:01 AM Jacek Laskowski  wrote:

> Hi,
>
> +1 (non-binding)
>
> 1. Built locally using AdoptOpenJDK (build 11.0.9+11) with
> -Pyarn,kubernetes,hive-thriftserver,scala-2.12 -DskipTests
> 2. Ran batch and streaming demos using Spark on Kubernetes (minikube)
> using spark-shell (client deploy mode) and spark-submit --deploy-mode
> cluster
>
> I reported a non-blocking issue with "the only developer Matei" (
> https://issues.apache.org/jira/browse/SPARK-34158)
>
> Found a minor, non-blocking (but annoying) issue in Spark on k8s that is
> new relative to 3.0.1: the following message should really be silenced,
> like the other debug messages in ExecutorPodsAllocator:
>
> 21/01/19 12:23:26 DEBUG ExecutorPodsAllocator: ResourceProfile Id: 0 pod
> allocation status: 2 running, 0 pending. 0 unacknowledged.
> 21/01/19 12:23:27 DEBUG ExecutorPodsAllocator: ResourceProfile Id: 0 pod
> allocation status: 2 running, 0 pending. 0 unacknowledged.
> 21/01/19 12:23:28 DEBUG ExecutorPodsAllocator: ResourceProfile Id: 0 pod
> allocation status: 2 running, 0 pending. 0 unacknowledged.
> 21/01/19 12:23:29 DEBUG ExecutorPodsAllocator: ResourceProfile Id: 0 pod
> allocation status: 2 running, 0 pending. 0 unacknowledged.
>
> Best regards,
> Jacek Laskowski
> 
> https://about.me/JacekLaskowski
> "The Internals Of" Online Books 
> Follow me on https://twitter.com/jaceklaskowski
>
> 
>
>
> On Mon, Jan 18, 2021 at 1:06 PM Hyukjin Kwon  wrote:
>
>> Please vote on releasing the following candidate as Apache Spark version
>> 3.1.1.
>>
>> The vote is open until January 22nd 4PM PST and passes if a majority +1
>> PMC votes are cast, with a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 3.1.1
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> The tag to be voted on is v3.1.1-rc1 (commit
>> 53fe365edb948d0e05a5ccb62f349cd9fcb4bb5d):
>> https://github.com/apache/spark/tree/v3.1.1-rc1
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1364
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/
>>
>> The list of bug fixes going into 3.1.1 can be found at the following URL:
>> https://s.apache.org/41kf2
>>
>> This release is using the release script of the tag v3.1.1-rc1.
>>
>> FAQ
>>
>> ===
>> What happened to 3.1.0?
>> ===
>>
>> There was a technical issue during Apache Spark 3.1.0 preparation, and it
>> was discussed and decided to skip 3.1.0.
>> Please see
>> https://spark.apache.org/news/next-official-release-spark-3.1.1.html for
>> more details.
>>
>> =
>> How can I help test this release?
>> =
>>
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> If you're working in PySpark you can set up a virtual env and install
>> the current RC via "pip install
>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-bin/pyspark-3.1.1.tar.gz
>> "
>> and see if anything important breaks.
>> In Java/Scala, you can add the staging repository to your project's
>> resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with an out of date RC going forward).
>>
>> ===
>> What should happen to JIRA tickets still targeting 3.1.1?
>> ===
>>
>> The current list of open tickets targeted at 3.1.1 can be found at:
>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>> Version/s" = 3.1.1
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should
>> be worked on immediately. Everything else please retarget to an
>> appropriate release.
>>
>> ==
>> But my bug isn't fixed?
>> ==
>>
>> In order to make timely releases, we will typically not hold the
>> release unless the bug in question is a regression from the previous
>> release. That being said, if there is something which is a regression
>> that has not been correctly targeted please ping me or a committer to
>> help target the issue.
>>
>>


Re: [VOTE] Decommissioning SPIP

2020-07-06 Thread Terry Kim
+1 (non-binding)

Thanks,
Terry

On Wed, Jul 1, 2020 at 6:05 PM Holden Karau  wrote:

> Hi Spark Devs,
>
> I think discussion has settled on the SPIP doc at
> https://docs.google.com/document/d/1EOei24ZpVvR7_w0BwBjOnrWRy4k-qTdIlx60FsHZSHA/edit?usp=sharing,
> the design doc at
> https://docs.google.com/document/d/1xVO1b6KAwdUhjEJBolVPl9C6sLj7oOveErwDSYdT-pE/edit,
> and the JIRA https://issues.apache.org/jira/browse/SPARK-20624, and I've
> received a request to put the SPIP up for a VOTE quickly. The discussion
> thread on the mailing list is at
> http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-SPIP-Graceful-Decommissioning-td29650.html
> .
>
> Normally this vote would be open for 72 hours, however since it's a long
> weekend in the US where many of the PMC members are, this vote will not
> close before July 6th at noon pacific time.
>
> The SPIP procedures are documented at:
> https://spark.apache.org/improvement-proposals.html. The ASF's voting
> guide is at https://www.apache.org/foundation/voting.html.
>
> Please vote before July 6th at noon:
>
> [ ] +1: Accept the proposal as an official SPIP
> [ ] +0
> [ ] -1: I don't think this is a good idea because ...
>
> I will start the voting off with a +1 from myself.
>
> Cheers,
>
> Holden
>


[DISCUSS] Consistent relation resolution behavior in SparkSQL

2019-12-01 Thread Terry Kim
Hi all,

As discussed in SPARK-29900, Spark currently has two different relation
resolution behaviors:

   1. Look up temp view first, then table/persistent view
   2. Look up table/persistent view

The first behavior is used in SELECT, INSERT, and a few commands that
support temp views, such as DESCRIBE TABLE. The second behavior is used in
most other commands. Thus, it is hard to predict which relation resolution
rule will be applied for a given command.

I want to propose a consistent relation resolution behavior in which temp
views are always looked up first, before tables/persistent views, as
described in more detail in this doc: consistent relation resolution
proposal.
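
The proposed rule can be sketched as a tiny lookup function (illustrative only; in Spark the actual resolution happens inside the analyzer):

```python
# Illustrative sketch of the proposed consistent resolution order:
# every command consults the temp-view namespace first, then the catalog.
def resolve_relation(name, temp_views, catalog):
    if name in temp_views:
        return temp_views[name]
    return catalog.get(name)
```

Under this rule a temp view named t shadows a table named t for every command, which is what makes the behavior predictable.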

Note that this proposal is a breaking change, but the impact should be
minimal since this applies only when there are temp views and tables with
the same name.

Any feedback will be appreciated.

I also want to thank Wenchen Fan, Ryan Blue, Burak Yavuz, and Dongjoon Hyun
for guidance and suggestion.

Regards,
Terry





Re: Release Apache Spark 2.4.4

2019-08-13 Thread Terry Kim
Can the following be included?

[SPARK-27234][SS][PYTHON] Use InheritableThreadLocal for current epoch in
EpochTracker (to support Python UDFs)


Thanks,
Terry

On Tue, Aug 13, 2019 at 10:24 PM Wenchen Fan  wrote:

> +1
>
> On Wed, Aug 14, 2019 at 12:52 PM Holden Karau 
> wrote:
>
>> +1
>> Does anyone have any critical fixes they’d like to see in 2.4.4?
>>
>> On Tue, Aug 13, 2019 at 5:22 PM Sean Owen  wrote:
>>
>>> Seems fine to me if there are enough valuable fixes to justify another
>>> release. If there are any other important fixes imminent, it's fine to
>>> wait for those.
>>>
>>>
>>> On Tue, Aug 13, 2019 at 6:16 PM Dongjoon Hyun 
>>> wrote:
>>> >
>>> > Hi, All.
>>> >
>>> > Spark 2.4.3 was released three months ago (8th May).
>>> > As of today (13th August), there are 112 commits (75 JIRAs) in
>>> `branch-2.4` since 2.4.3.
>>> >
>>> > It would be great if we can have Spark 2.4.4.
>>> > Shall we start `2.4.4 RC1` next Monday (19th August)?
>>> >
>>> > Last time, there was a request for a K8s issue, and now I'm waiting for
>>> SPARK-27900.
>>> > Please let me know if there is another issue.
>>> >
>>> > Thanks,
>>> > Dongjoon.
>>>
>>>
>>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>


FlatMapGroupsInPandasExec with multiple record batches

2019-06-11 Thread Terry Kim
Hi,

I see the following comment in FlatMapGroupsInPandasExec.scala:
"It's possible to further split one group into multiple record batches to
reduce the memory footprint on the Java side, this is left as future work."

I checked the JIRA but could not find anything related to this. Is there a
plan to support this scenario?

Thanks,
Terry
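
For context, the splitting that comment describes amounts to size-bounded chunking of one group's rows. A generic sketch (in Spark the chunks would be the Arrow record batches handed to the Python worker; the batch-size parameter is an assumption):

```python
# Sketch: split one group's rows into fixed-size batches so no single
# batch has to hold the whole group in memory at once. In Spark this
# would apply to the Arrow record batches sent to the Python side; here
# it is just generic chunking of an iterable.
def split_into_batches(rows, max_rows_per_batch):
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == max_rows_per_batch:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch
```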


Re: [VOTE] Release Apache Spark 2.4.2

2019-04-26 Thread Terry Kim
Very much interested in hearing what you folks decide. We currently have a
couple of people asking us questions at https://github.com/dotnet/spark/issues.

Thanks,
Terry






[DISCUSS] SPIP: .NET bindings for Apache Spark

2019-02-27 Thread Terry Kim
Hi,

I have posted a SPIP to JIRA:
https://issues.apache.org/jira/browse/SPARK-27006.

I look forward to your feedback.

Thanks,
Terry