Hi,

There are too many blogs out there with absolutely no value. Before writing
another one that compares run times between RDDs and DataFrames, which does
not make much sense (as stated earlier), it may be useful to first
understand what you are trying to achieve by writing it.

Then, based on that, you may want to look at different options.


Regards,
Gourav Sengupta



On Fri, Dec 24, 2021 at 10:42 AM <bit...@bitfox.top> wrote:

> As you see below:
>
> $ pip install sparkmeasure
> Collecting sparkmeasure
>    Using cached https://files.pythonhosted.org/packages/9f/bf/c9810ff2d88513ffc185e65a3ab9df6121ad5b4c78aa8d134a06177f9021/sparkmeasure-0.14.0-py2.py3-none-any.whl
> Installing collected packages: sparkmeasure
> Successfully installed sparkmeasure-0.14.0
>
>
> $ pyspark --packages ch.cern.sparkmeasure:spark-measure_2.12:0.17
> Python 3.6.9 (default, Jan 26 2021, 15:33:00)
> [GCC 8.4.0] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> ..........
> >>> from sparkmeasure import StageMetrics
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> ModuleNotFoundError: No module named 'sparkmeasure'
>
>
> That still doesn't work.
> I run Spark 3.2.0 on an Ubuntu system.
>
> Regards.
