RE: [SPARK-23207] Repro

2019-08-09 Thread tcondie
Hi Sean, To finish the job, I did need to set spark.stage.maxConsecutiveAttempts to a large number (e.g., 100), a suggestion from Jiang Xingbo. I haven't seen any recent movement/PRs on this issue, but I'll see if we can reproduce it with a more recent version of Spark. Best regards, Tyson
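
A minimal sketch of how that setting might be applied; the config key and the value 100 come from the message above, while the app name is a placeholder. The same setting could also be passed on spark-submit via --conf spark.stage.maxConsecutiveAttempts=100, and it must be set before the SparkContext starts:

    import org.apache.spark.sql.SparkSession

    // Raise the per-stage retry limit (default is 4) so repeated fetch-failure
    // retries do not abort the job. Must be set before the SparkContext is created.
    val spark = SparkSession.builder()
      .appName("spark-23207-repro")                        // placeholder name
      .config("spark.stage.maxConsecutiveAttempts", "100") // value suggested in the thread
      .getOrCreate()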

[SPARK-23207] Repro

2019-08-09 Thread tcondie
Hi, We are able to reproduce this bug in Spark 2.4 using the following program:

    import scala.sys.process._
    import org.apache.spark.TaskContext
    val res = spark.range(0, 1 * 1, 1).map{ x => (x % 1000, x)}.repartition(20)
    res.distinct.count // kill an executor in the stage
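
The preview above is truncated and the range bounds look garbled in the archive ("1 * 1"). Below is a self-contained sketch of the same shape of job, assuming a 1000 * 1000 row range so the shuffle is non-trivial; the range size is an assumption, everything else follows the message:

    // The two imports below appear in the original message and are presumably
    // used in the truncated part that kills an executor from within a task.
    import scala.sys.process._
    import org.apache.spark.TaskContext
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("spark-23207-repro").getOrCreate()
    import spark.implicits._

    // Shuffle-heavy job: map to (key, value) pairs, then repartition.
    // The range size (1000L * 1000L) is an assumed value.
    val res = spark.range(0, 1000L * 1000L, 1)
      .map { x => (x % 1000, x) }
      .repartition(20)

    // Trigger the shuffle; while this stage runs, kill an executor to force a
    // fetch failure and a stage retry (the scenario described in SPARK-23207).
    res.distinct.count()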

RE: [VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support

2019-05-12 Thread tcondie
+1 (non-binding) Tyson Condie From: Kazuaki Ishizaki Sent: Thursday, May 9, 2019 9:17 AM To: Bryan Cutler Cc: Bobby Evans; Spark dev list; Thomas Graves Subject: Re: [VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support +1 (non-binding) Kazuaki Ishizaki

RE: [VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support

2019-04-19 Thread tcondie
+1 (non-binding) for better columnar data processing support. From: Jules Damji Sent: Friday, April 19, 2019 12:21 PM To: Bryan Cutler Cc: Dev Subject: Re: [VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support + (non-binding) Sent from my iPhone Pardon the

RE: Hive Hash in Spark

2019-03-07 Thread tcondie
Thanks Ryan and Reynold for the information! Cheers, Tyson From: Ryan Blue Sent: Wednesday, March 6, 2019 3:47 PM To: Reynold Xin Cc: tcon...@gmail.com; Spark Dev List Subject: Re: Hive Hash in Spark I think this was needed to add support for bucketed Hive tables. Like Tyson

Hive Hash in Spark

2019-03-06 Thread tcondie
Hi, I noticed the existence of a Hive Hash partitioning implementation in Spark, but also noticed that it's not being used, and that the Spark hash partitioning function is presently hardcoded to Murmur3. My question is whether Hive Hash is dead code or whether there are future plans to support
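
For context, a small sketch of where Murmur3 surfaces in the public API: functions.hash evaluates a Murmur3-based hash over its columns, and repartitioning by a column goes through the hash partitioning the message above describes as hardcoded to Murmur3. The app name and sample data are illustrative only:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, hash}

    val spark = SparkSession.builder().appName("hash-demo").getOrCreate() // placeholder name
    import spark.implicits._

    // functions.hash computes a Murmur3-based hash of the given columns;
    // there is no equivalent public entry point for the HiveHash expression.
    val df = Seq(("a", 1), ("b", 2)).toDF("k", "v")
    df.select(col("k"), hash(col("k")).as("murmur3_of_k")).show()

    // Repartitioning by a column uses Spark's hash partitioning, which the
    // message above notes is currently Murmur3-only.
    val byKey = df.repartition(8, col("k"))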

[DISCUSS] SPIP SPARK-26257

2019-01-14 Thread tcondie
Dear Spark Community, I have posted a SPIP to JIRA: https://issues.apache.org/jira/browse/SPARK-26257 I look forward to your feedback on the JIRA ticket. Best regards, Tyson

[Discuss] Language Interop for Apache Spark

2018-09-25 Thread tcondie
There seems to be some desire for third-party language extensions for Apache Spark. Some notable examples include:
* C#/F# from project Mobius https://github.com/Microsoft/Mobius
* Haskell from project sparkle https://github.com/tweag/sparkle
* Julia from project Spark.jl