Hi Peter,
You can use the ImageSchema.readImages API in Spark 2.3 for reading images:
https://databricks.com/blog/2018/12/10/introducing-built-in-image-data-source-in-apache-spark-2-4.html
https://blogs.technet.microsoft.com/machinelearning/2018/03/05/image-data-support-in-apache-spark/
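For context, both sources hand back a DataFrame with one row per file. The same per-file row pattern (path, length, raw bytes) can be sketched in plain Python, independent of Spark; the `FileRow` and `read_binary_files` names below are hypothetical helpers for illustration, not Spark API:

```python
import os
from dataclasses import dataclass

@dataclass
class FileRow:
    path: str      # absolute path of the file
    length: int    # size in bytes
    content: bytes # raw file contents

def read_binary_files(directory):
    """Return one FileRow per regular file in `directory`, mirroring the
    one-row-per-file schema that Spark's image/binary sources produce."""
    rows = []
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.isfile(full):
            with open(full, "rb") as f:
                data = f.read()
            rows.append(FileRow(path=full, length=len(data), content=data))
    return rows
```

In Spark itself the heavy lifting (partitioning the file listing, decoding image bytes into a struct column) is done for you; this sketch only shows the shape of the data you get back.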
Hello experts,
I have a quick question: which API allows me to read image files or binary
files (for SparkSession.readStream) from a local/Hadoop file system in
Spark 2.3?
I have been browsing the following documentation and googling for it, but
didn't find a good example or documentation:
Having three modes is a lot. Why not just use ANSI mode as the default, and legacy
for backward compatibility? Then over time there's only the ANSI mode, which is
standard-compliant and easy to understand. We also don't need to invent a
standard just for Spark.
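The difference between the three policies can be sketched with a toy cast in plain Python; this is a simplification for illustration only, not Spark's implementation:

```python
# Toy model of the three store-assignment policies applied to an
# overflowing long -> int cast (illustration only, not Spark code).
INT_MIN, INT_MAX = -(2**31), 2**31 - 1

def store_as_int(value, policy):
    if policy == "strict":
        # Strict rejects the cast up front because it could lose
        # information, even for values that would fit.
        raise TypeError("strict policy: long -> int cast not allowed")
    if INT_MIN <= value <= INT_MAX:
        return value
    if policy == "ansi":
        # ANSI (SQL-standard) behavior: fail loudly at runtime.
        raise OverflowError(f"{value} does not fit in an int")
    # Legacy behavior: silently produce null on overflow.
    return None
```

The silent `None` in the legacy branch is exactly the pitfall raised later in this thread: the write succeeds, and the data loss only surfaces downstream.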
On Thu, Sep 05, 2019 at 12:27 AM,
Hi,
Thanks much for the answers. Learning Spark every day!
Best regards,
Jacek Laskowski
https://about.me/JacekLaskowski
The Internals of Spark SQL https://bit.ly/spark-sql-internals
The Internals of Spark Structured Streaming
https://bit.ly/spark-structured-streaming
The Internals of Apache
+1
To be honest, I don't like the legacy policy. It's too loose and makes it easy
for users to make mistakes, especially when Spark returns null if a function
hits errors like overflow.
The strict policy is not good either. It's too strict and blocks valid use
cases like writing timestamp values to a date