It won't be very efficient, but you could write a Python UDF using
PythonMagick: https://wiki.python.org/moin/ImageMagick
If you have PyArrow > 0.10, you might get a speedup by storing the image
bytes in a BinaryType column and writing a pandas UDF instead of a plain
row-at-a-time UDF.
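
As a rough sketch of the plain UDF route, here is one way it could look.
Note that this swaps PythonMagick for a pure-Python nearest-neighbour
resize so that it needs no extra packages on the cluster; it will be slow
and the quality is basic. The names resize_nearest, with_resized and the
"resized" column are made up for illustration. It assumes the DataFrame
came from spark.read.format("image"), whose "image" struct carries the
height, width, nChannels and data fields.

```python
def resize_nearest(data, height, width, n_channels, new_h, new_w):
    """Nearest-neighbour resize of raw, row-major, interleaved pixel bytes."""
    out = bytearray(new_h * new_w * n_channels)
    for y in range(new_h):
        src_y = y * height // new_h          # source row for this output row
        for x in range(new_w):
            src_x = x * width // new_w       # source column for this output column
            src = (src_y * width + src_x) * n_channels
            dst = (y * new_w + x) * n_channels
            out[dst:dst + n_channels] = data[src:src + n_channels]
    return bytes(out)


def with_resized(df, new_h, new_w):
    """Hypothetical helper: add a 'resized' BinaryType column to an image DataFrame."""
    # Imported lazily so resize_nearest can be tried without Spark installed.
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import BinaryType

    @udf(returnType=BinaryType())
    def _resize(data, height, width, n_channels):
        # Spark hands the binary field over as a bytearray.
        return resize_nearest(bytes(data), height, width, n_channels, new_h, new_w)

    return df.withColumn(
        "resized",
        _resize(
            col("image.data"),
            col("image.height"),
            col("image.width"),
            col("image.nChannels"),
        ),
    )
```

Usage would be something like with_resized(df, 224, 224). The new column
holds only the raw pixel bytes, so you would have to carry the new height
and width alongside it yourself.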
On Wed, Jul 31, 2019 at 6:22 AM Nick
Is there any other way of resizing the images before creating the DataFrame
in Spark? I know OpenCV does it, but I don't have OpenCV on my cluster. I
only have the Anaconda Python packages installed there.
Any ideas will be appreciated. Thank you!
On Tue, Jul 30, 2019, 4:17 PM Nick Dawes wrote:
Hi
I'm new to Spark's image data source.
After creating a dataframe using Spark's image data source, I would like to
resize the images in PySpark.
df = spark.read.format("image").load(imageDir)
Can you please help me with this?
Nick