Re: Welcome Zhenhua Wang as a Spark committer

2018-04-02 Thread wangzhenhua (G)
Thanks everyone! It’s my great pleasure to be part of such a professional and innovative community! best regards, -Zhenhua(Xander)

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2 read path

2017-09-07 Thread wangzhenhua (G)
+1 (non-binding) Great to see that the data source API is going to be improved! Best regards, -Zhenhua(Xander) From: Dongjoon Hyun [mailto:dongjoon.h...@gmail.com] Sent: September 8, 2017 4:07 To: 蒋星博 CC: Michael Armbrust; Reynold Xin; Andrew Ash; Herman van Hövell tot Westerflier; Ryan Blue; Spark dev list; Su

Re: Limit Query Performance Suggestion

2017-01-18 Thread wangzhenhua (G)
How about this: 1. We can make LocalLimit shuffle to multiple partitions, i.e. create a new partitioner to uniformly dispatch the data: class LimitUniformPartitioner(partitions: Int) extends Partitioner { def numPartitions: Int = partitions var num = 0 def getPartition(key: Any): Int = {
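
The snippet above is cut off in the archive, so the body of getPartition is not shown. As a rough illustration only, and not the author's actual code, a partitioner that "uniformly dispatches" rows could cycle through partition ids in round-robin order, ignoring the key. The sketch below assumes that behaviour. The local Partitioner trait is a stand-in that mirrors the two abstract methods of org.apache.spark.Partitioner so the example compiles without Spark on the classpath, and LimitUniformPartitionerDemo is a hypothetical name used only for the demonstration.

```scala
// Stand-in for org.apache.spark.Partitioner (assumption: used only so this
// sketch compiles on its own; in a Spark job, extend the real class instead).
trait Partitioner extends Serializable {
  def numPartitions: Int
  def getPartition(key: Any): Int
}

// Assumed round-robin behaviour: the key is ignored and partition ids are
// handed out in a cycle, so rows are spread evenly across partitions.
class LimitUniformPartitioner(partitions: Int) extends Partitioner {
  require(partitions > 0, "partitions must be positive")

  def numPartitions: Int = partitions

  // Per-instance counter; it is not shared across partitioner instances.
  private var num = 0

  def getPartition(key: Any): Int = {
    val target = num
    num = (num + 1) % partitions // wrap around to stay in [0, partitions)
    target
  }
}

// Hypothetical demo object, not part of the original message.
object LimitUniformPartitionerDemo {
  def main(args: Array[String]): Unit = {
    val p = new LimitUniformPartitioner(3)
    val assigned = (1 to 7).map(k => p.getPartition(k))
    println(assigned.mkString(",")) // prints 0,1,2,0,1,2,0
  }
}
```

Because the result depends on call order rather than on the key, the assignment is not stable if a task is re-executed, which is worth keeping in mind before using such a partitioner in a real shuffle.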

Re: Re: timeout in shuffle problem

2016-01-27 Thread wangzhenhua (G)
The external shuffle service is not enabled. Best regards, -zhenhua From: Hamel Kothari<mailto:hamelkoth...@gmail.com> Date: 2016-01-27 22:21 To: Ted Yu<mailto:yuzhih...@gmail.com>; wangzhenhua (G)<mailto:wangzhen...@huawei.com> CC: dev<mailto

timeout in shuffle problem

2016-01-24 Thread wangzhenhua (G)
Hi, I have a timeout problem in shuffle. It happened after shuffle write and at the start of shuffle read; the logs from the driver and executors are shown below. The Spark version is 1.5. Looking forward to your replies. Thanks! Logs on the driver only have warnings: WARN TaskSetManager: Lost task 38.0 in