GitHub user srowen commented on the issue:

https://github.com/apache/spark/pull/21456

I'd be very surprised if this single line were responsible for 10% of the objects on the heap. Are you sure? Could we optimize this elsewhere in the code instead? I'm also not sure why this line would produce different paths for the same canonical path, since that is what's driving all of this normalization logic.