[jira] [Created] (SPARK-42280) add spark.yarn.archive/jars similar option for spark on K8S

2023-02-01 Thread Xianjin YE (Jira)
Xianjin YE created SPARK-42280: -- Summary: add spark.yarn.archive/jars similar option for spark on K8S Key: SPARK-42280 URL: https://issues.apache.org/jira/browse/SPARK-42280 Project: Spark

[jira] [Created] (SPARK-37176) JsonSource's infer should have the same exception handle logic as JacksonParser's parse logic

2021-11-01 Thread Xianjin YE (Jira)
Xianjin YE created SPARK-37176: -- Summary: JsonSource's infer should have the same exception handle logic as JacksonParser's parse logic Key: SPARK-37176 URL: https://issues.apache.org/jira/browse/SPARK-37176

[jira] [Commented] (SPARK-34705) Add code-gen for all join types of sort merge join

2021-04-28 Thread Xianjin YE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334830#comment-17334830 ] Xianjin YE commented on SPARK-34705: [~chengsu] could you share some number of the CPU performance

[jira] [Commented] (SPARK-32165) SessionState leaks SparkListener with multiple SparkSession

2021-01-06 Thread Xianjin YE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17259777#comment-17259777 ] Xianjin YE commented on SPARK-32165: > could we add a bit more detail here as to what problem is or

[jira] [Created] (SPARK-33756) BytesToBytesMap's iterator hasNext method should be idempotent.

2020-12-11 Thread Xianjin YE (Jira)
Xianjin YE created SPARK-33756: -- Summary: BytesToBytesMap's iterator hasNext method should be idempotent. Key: SPARK-33756 URL: https://issues.apache.org/jira/browse/SPARK-33756 Project: Spark

[jira] [Created] (SPARK-32165) SessionState leaks SparkListener with multiple SparkSession

2020-07-02 Thread Xianjin YE (Jira)
Xianjin YE created SPARK-32165: -- Summary: SessionState leaks SparkListener with multiple SparkSession Key: SPARK-32165 URL: https://issues.apache.org/jira/browse/SPARK-32165 Project: Spark

[jira] [Commented] (SPARK-24193) Sort by disk when number of limit is big in TakeOrderedAndProjectExec

2020-05-08 Thread Xianjin YE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102458#comment-17102458 ] Xianjin YE commented on SPARK-24193: I used `df.rdd.collect` intentionally to trigger the problem as

[jira] [Commented] (SPARK-24193) Sort by disk when number of limit is big in TakeOrderedAndProjectExec

2020-05-08 Thread Xianjin YE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102285#comment-17102285 ] Xianjin YE commented on SPARK-24193: Hi, [~jinxing6...@126.com] [~cloud_fan] the fallback config has

[jira] [Commented] (SPARK-28945) Allow concurrent writes to different partitions with dynamic partition overwrite

2019-09-15 Thread Xianjin YE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930201#comment-16930201 ] Xianjin YE commented on SPARK-28945: > Hi, I think the exception shown in the

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-12 Thread Xianjin YE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928650#comment-16928650 ] Xianjin YE commented on SPARK-29037: > About output check, I think it is not appropriate, because

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-12 Thread Xianjin YE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928561#comment-16928561 ] Xianjin YE commented on SPARK-29037: [~hzfeiwang] by rerun the application, do you mean re-submit

[jira] [Commented] (SPARK-28945) Allow concurrent writes to different partitions with dynamic partition overwrite

2019-09-05 Thread Xianjin YE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923226#comment-16923226 ] Xianjin YE commented on SPARK-28945: Yeah, I can work on this. > Allow concurrent writes to

[jira] [Created] (SPARK-28907) Review invalid usage of new Configuration()

2019-08-29 Thread Xianjin YE (Jira)
Xianjin YE created SPARK-28907: -- Summary: Review invalid usage of new Configuration() Key: SPARK-28907 URL: https://issues.apache.org/jira/browse/SPARK-28907 Project: Spark Issue Type:

[jira] [Created] (SPARK-28573) Convert InsertIntoTable(HiveTableRelation) to Datasource inserting for partitioned table

2019-07-30 Thread Xianjin YE (JIRA)
Xianjin YE created SPARK-28573: -- Summary: Convert InsertIntoTable(HiveTableRelation) to Datasource inserting for partitioned table Key: SPARK-28573 URL: https://issues.apache.org/jira/browse/SPARK-28573

[jira] [Created] (SPARK-28203) PythonRDD should respect SparkContext's conf when passing user confMap

2019-06-28 Thread Xianjin YE (JIRA)
Xianjin YE created SPARK-28203: -- Summary: PythonRDD should respect SparkContext's conf when passing user confMap Key: SPARK-28203 URL: https://issues.apache.org/jira/browse/SPARK-28203 Project: Spark

[jira] [Created] (SPARK-27775) Support multiple return values for udf

2019-05-20 Thread Xianjin YE (JIRA)
Xianjin YE created SPARK-27775: -- Summary: Support multiple return values for udf Key: SPARK-27775 URL: https://issues.apache.org/jira/browse/SPARK-27775 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-26713) PipedRDD may holds stdin writer and stdout read threads even if the task is finished

2019-01-23 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750836#comment-16750836 ] Xianjin YE commented on SPARK-26713: I have fixed and tested this issue in our internal cluster,

[jira] [Created] (SPARK-26713) PipedRDD may holds stdin writer and stdout read threads even if the task is finished

2019-01-23 Thread Xianjin YE (JIRA)
Xianjin YE created SPARK-26713: -- Summary: PipedRDD may holds stdin writer and stdout read threads even if the task is finished Key: SPARK-26713 URL: https://issues.apache.org/jira/browse/SPARK-26713

[jira] [Commented] (SPARK-19256) Hive bucketing support

2018-11-27 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16700089#comment-16700089 ] Xianjin YE commented on SPARK-19256: [~chengsu] great news. I can do some review work then. > Hive

[jira] [Commented] (SPARK-26155) Spark SQL performance degradation after apply SPARK-21052 with Q19 of TPC-DS in 3TB scale

2018-11-27 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16700054#comment-16700054 ] Xianjin YE commented on SPARK-26155: cc [~cloud_fan]. do you confirm this problem? Are there any

[jira] [Commented] (SPARK-24388) EventLoop's run method don't handle fatal error, causes driver hang forever

2018-05-25 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490373#comment-16490373 ] Xianjin YE commented on SPARK-24388: I am working on this and will send a pr soon. > EventLoop's run

[jira] [Created] (SPARK-24388) EventLoop's run method don't handle fatal error, causes driver hang forever

2018-05-25 Thread Xianjin YE (JIRA)
Xianjin YE created SPARK-24388: -- Summary: EventLoop's run method don't handle fatal error, causes driver hang forever Key: SPARK-24388 URL: https://issues.apache.org/jira/browse/SPARK-24388 Project:

[jira] [Commented] (SPARK-24293) Serialized shuffle supports mapSideCombine

2018-05-16 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16477701#comment-16477701 ] Xianjin YE commented on SPARK-24293: Ah, I was under the impression that unsafe shuffle was part of

[jira] [Created] (SPARK-24293) Serialized shuffle supports mapSideCombine

2018-05-16 Thread Xianjin YE (JIRA)
Xianjin YE created SPARK-24293: -- Summary: Serialized shuffle supports mapSideCombine Key: SPARK-24293 URL: https://issues.apache.org/jira/browse/SPARK-24293 Project: Spark Issue Type:

[jira] [Commented] (SPARK-20087) Include accumulators / taskMetrics when sending TaskKilled to onTaskEnd listeners

2018-04-25 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451732#comment-16451732 ] Xianjin YE commented on SPARK-20087: cc [~jiangxb1987] [~irashid], I am going to send a new pr if you

[jira] [Commented] (SPARK-19256) Hive bucketing support

2018-04-24 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451689#comment-16451689 ] Xianjin YE commented on SPARK-19256: Hi [~tejasp] [~cloud_fan], are you still working on this? We

[jira] [Commented] (SPARK-24006) ExecutorAllocationManager.onExecutorAdded is an O(n) operation

2018-04-18 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16442781#comment-16442781 ] Xianjin YE commented on SPARK-24006: Could you leave this issue open for a while? In my company, we

[jira] [Commented] (SPARK-24006) ExecutorAllocationManager.onExecutorAdded is an O(n) operation

2018-04-18 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16442224#comment-16442224 ] Xianjin YE commented on SPARK-24006: Haven't launch a large enough job to confirm my assumption..

[jira] [Created] (SPARK-24006) ExecutorAllocationManager.onExecutorAdded is an O(n) operation

2018-04-17 Thread Xianjin YE (JIRA)
Xianjin YE created SPARK-24006: -- Summary: ExecutorAllocationManager.onExecutorAdded is an O(n) operation Key: SPARK-24006 URL: https://issues.apache.org/jira/browse/SPARK-24006 Project: Spark

[jira] [Updated] (SPARK-24006) ExecutorAllocationManager.onExecutorAdded is an O(n) operation

2018-04-17 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xianjin YE updated SPARK-24006: --- Description: The ExecutorAllocationManager.onExecutorAdded is an O(n) operations, I believe it will

[jira] [Commented] (SPARK-23040) BlockStoreShuffleReader's return Iterator isn't interruptible if aggregator or ordering is specified

2018-01-11 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321957#comment-16321957 ] Xianjin YE commented on SPARK-23040: I will send a pr soon > BlockStoreShuffleReader's return

[jira] [Created] (SPARK-23040) BlockStoreShuffleReader's return Iterator isn't interruptible if aggregator or ordering is specified

2018-01-11 Thread Xianjin YE (JIRA)
Xianjin YE created SPARK-23040: -- Summary: BlockStoreShuffleReader's return Iterator isn't interruptible if aggregator or ordering is specified Key: SPARK-23040 URL: https://issues.apache.org/jira/browse/SPARK-23040

[jira] [Commented] (SPARK-22952) Deprecate stageAttemptId in favour of stageAttemptNumber

2018-01-03 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310672#comment-16310672 ] Xianjin YE commented on SPARK-22952: I will send pr soon > Deprecate stageAttemptId in favour of

[jira] [Created] (SPARK-22952) Deprecate stageAttemptId in favour of stageAttemptNumber

2018-01-03 Thread Xianjin YE (JIRA)
Xianjin YE created SPARK-22952: -- Summary: Deprecate stageAttemptId in favour of stageAttemptNumber Key: SPARK-22952 URL: https://issues.apache.org/jira/browse/SPARK-22952 Project: Spark Issue

[jira] [Created] (SPARK-22897) Expose stageAttemptId in TaskContext

2017-12-24 Thread Xianjin YE (JIRA)
Xianjin YE created SPARK-22897: -- Summary: Expose stageAttemptId in TaskContext Key: SPARK-22897 URL: https://issues.apache.org/jira/browse/SPARK-22897 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-6030) SizeEstimator gives wrong result for Integer object on 64bit JVM with UseCompressedOops on

2015-02-25 Thread Xianjin YE (JIRA)
Xianjin YE created SPARK-6030: - Summary: SizeEstimator gives wrong result for Integer object on 64bit JVM with UseCompressedOops on Key: SPARK-6030 URL: https://issues.apache.org/jira/browse/SPARK-6030