Thanks for the email, Erik. The Scala collection library implementation is a complicated beast ...
On Sat, Sep 6, 2014 at 8:27 AM, Erik Erlandson <e...@redhat.com> wrote: > I tripped over this recently while preparing a solution for SPARK-3250 > (efficient sampling): > > Iterator 'drop' method has a complexity bug causing quadratic behavior > https://issues.scala-lang.org/browse/SI-8835 > > It's something of a corner case, as the impact is serious only if one is > repeatedly invoking 'drop' on Iterator[T], and it doesn't apply to all > iterator subclasses (e.g. Array().iterator). It's actually a bug in > 'slice', and so invocations of 'drop', 'take' and 'slice' are potential > vulnerabilities. > > Not something I'd expect to show up frequently, but it seemed worth > mentioning, as RDD partitions are ubiquitously presented as Iterator[T] in > the compute model, and if it does happen it turns a linear algorithm into > something having quadratic behavior. > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org > For additional commands, e-mail: dev-h...@spark.apache.org > >