Github user dakirsa commented on the issue:

https://github.com/apache/spark/pull/19439

@hhbyyh, @thunterdb

> Not sure about the reason to include "origin" info into the image data. Based on my experience, path info
> serves better as a separate column in the DataFrame. (E.g. prediction)

One of the main reasons is MLlib pipelines: transformers/estimators work on a single DataFrame column, so it is much easier when "origin" is part of this column too.
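To illustrate the point about single-column stages, here is a minimal plain-Python sketch. It does not use Spark: `Image`, `SingleColumnTransformer`, `summarize`, and the file path are hypothetical stand-ins, and the only assumption carried over from MLlib is that a stage reads one input column and writes one output column.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Image:
    """Hypothetical image struct that carries its own origin."""
    origin: str
    height: int
    width: int
    data: bytes


class SingleColumnTransformer:
    """Toy stage: sees only `input_col`, writes only `output_col`."""

    def __init__(self, input_col: str, output_col: str,
                 fn: Callable[[Any], Any]) -> None:
        self.input_col = input_col
        self.output_col = output_col
        self.fn = fn

    def transform(self, rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        # The function receives the value of one column, never the whole row.
        return [{**row, self.output_col: self.fn(row[self.input_col])}
                for row in rows]


def summarize(image: Image) -> str:
    # Origin is reachable because it travels inside the image value.
    return f"{image.origin}: {image.width}x{image.height}"


rows = [{"image": Image("file:/tmp/cat.png", 2, 3, b"\x00" * 6)}]
out = SingleColumnTransformer("image", "summary", summarize).transform(rows)
print(out[0]["summary"])
```

If the origin lived in a separate column instead, `summarize` would have no way to reach it without the stage accepting a second input column.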
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org