I would like this feature too, and Erik's post makes sense. However, shouldn't
the RDD also repartition itself after the drop, so that cluster resources are
used effectively?
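
To make the question concrete, here is a minimal sketch in plain Python (not
Spark code) that models an RDD as an ordered list of partitions. The names
drop and rebalance are hypothetical and only illustrate the idea. A real
repartition in Spark would involve a shuffle, which this model ignores.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def drop(partitions: Sequence[Sequence[T]], n: int) -> List[List[T]]:
    """Drop the first n elements across ordered partitions.

    The number of partitions is unchanged, so leading partitions may
    end up empty or much smaller than the rest.
    """
    remaining = max(n, 0)
    result: List[List[T]] = []
    for part in partitions:
        # Skip as many elements as are still owed; once remaining hits 0,
        # later partitions pass through unchanged.
        skip = min(remaining, len(part))
        result.append(list(part[skip:]))
        remaining -= skip
    return result


def rebalance(partitions: Sequence[Sequence[T]],
              num_partitions: Optional[int] = None) -> List[List[T]]:
    """Redistribute elements evenly across partitions, preserving order."""
    flat = [x for part in partitions for x in part]
    k = len(partitions) if num_partitions is None else num_partitions
    if k < 1:
        if flat:
            raise ValueError("need at least one partition for non-empty data")
        return []
    base, extra = divmod(len(flat), k)
    out: List[List[T]] = []
    start = 0
    for i in range(k):
        # The first `extra` partitions take one additional element.
        size = base + (1 if i < extra else 0)
        out.append(flat[start:start + size])
        start += size
    return out


parts = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
dropped = drop(parts, 4)       # [[], [5, 6], [7, 8, 9]]
balanced = rebalance(dropped)  # [[5, 6], [7, 8], [9]]
```

With three partitions of three elements each, dropping four elements leaves
the first partition empty. That imbalance is what I am asking about.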
On Jul 21, 2014 8:58 PM, "Andrew Ash [via Apache Spark Developers List]" <
ml-node+s1001551n7434...@n3.nabble.com> wrote:

> Personally I'd find the method useful -- I've often had a .csv file with a
> header row that I want to drop, so I filter it out, which touches all
> partitions anyway.  I don't have any comments on the implementation quite
> yet though.
>
>
> On Mon, Jul 21, 2014 at 8:24 AM, Erik Erlandson <[hidden email]> wrote:
>
> > A few weeks ago I submitted a PR for supporting rdd.drop(n), under
> > SPARK-2315:
> > https://issues.apache.org/jira/browse/SPARK-2315
> >
> > Supporting the drop method would make some operations convenient; however,
> > it forces computation of >= 1 partition of the parent RDD, and so it would
> > behave like a "partial action" that returns an RDD as the result.
> >
> > I wrote up a discussion of these trade-offs here:
> >
> > http://erikerlandson.github.io/blog/2014/07/20/some-implications-of-supporting-the-scala-drop-method-for-spark-rdds/
> >
>
>




--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/RFC-Supporting-the-Scala-drop-Method-for-Spark-RDDs-tp7433p7436.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
