[ https://issues.apache.org/jira/browse/SPARK-32341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
gaokui updated SPARK-32341:
---------------------------
    Affects Version/s: 3.0.0
          Description:
When I use Spark RDDs, I often read data from Kafka, and the Kafka data contains many different kinds of data sets.
I filter the RDD by Kafka key so that I can fill an Array[RDD] with one RDD per topic.
At the moment I do this with rdd.filter, which generates more than one stage. The data is processed by many tasks, which takes too much time and is not necessary.
I would like a multiple-filter function, as an alternative to rdd.filter, that returns an Array[RDD] in one stage by dividing the mixed-data RDD into single-data-set RDDs.
A function like Array[RDD]=rdd.multiplefilter(setcondition).

  was:
When I use Spark RDDs, I often read data from Kafka, but the Kafka data contains many kinds of data sets.
When I use rdd.filter, it generates more stages.
I would like a multiple-filter function, as an alternative to rdd.filter, that returns every single data set in one stage.
Like Array[RDD]=rdd.mutiplefilter(setcondition).

              Summary: add mutiple filter in rdd function  (was: wish to add mutiple filter in rdd function)

> add mutiple filter in rdd function
> ----------------------------------
>
>                 Key: SPARK-32341
>                 URL: https://issues.apache.org/jira/browse/SPARK-32341
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 2.4.6, 3.0.0
>            Reporter: gaokui
>            Priority: Major
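For illustration, the kind of split described in this issue can be approximated with the existing RDD API. The following is only a minimal sketch under stated assumptions: `splitByKey` and `MultipleFilterSketch` are hypothetical names, not part of Spark, and the records are assumed to be (Kafka key, value) pairs of strings. It persists the mixed RDD once and then derives one filtered RDD per key. Each action on a filtered RDD still launches its own job, but with the parent persisted those jobs read the cached partitions rather than recomputing the upstream lineage.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

object MultipleFilterSketch {

  // Hypothetical helper (NOT a Spark API): returns one RDD per requested key,
  // in the same order as `keys`.
  def splitByKey(records: RDD[(String, String)],
                 keys: Seq[String]): Array[RDD[(String, String)]] = {
    // Persist the mixed RDD so that every filtered RDD reuses the same
    // materialized partitions once they have been computed.
    records.persist(StorageLevel.MEMORY_AND_DISK)
    // filter is a lazy, narrow transformation; nothing runs until an action.
    keys.map(k => records.filter { case (key, _) => key == k }).toArray
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("multiple-filter-sketch")
    val sc = new SparkContext(conf)
    try {
      // Stand-in for mixed Kafka data: (key, value) pairs from several data sets.
      val mixed = sc.parallelize(Seq(("orders", "o1"), ("clicks", "c1"), ("orders", "o2")))
      val Array(orders, clicks) = splitByKey(mixed, Seq("orders", "clicks"))
      println(orders.map(_._2).collect().mkString(","))
      println(clicks.count())
    } finally {
      sc.stop()
    }
  }
}
```

The sketch trades memory or disk for repeated computation. Whether that is acceptable depends on the size of the mixed RDD, and the parent can be released with `unpersist()` once all per-key RDDs have been consumed.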