Re: Topics for Spark online classes & webinars

2023-03-15 Thread Denny Lee
What we can do is get into the habit of compiling the list on LinkedIn while making sure the list is also shared and broadcast here, eh?! Likewise, when we broadcast the videos, we can use Zoom/Jitsi/riverside.fm and simulcast on LinkedIn. This way you can view directly on the

Re: Spark StructuredStreaming - watermark not working as expected

2023-03-15 Thread karan alang
Hi Mich, this doesn't seem to be working for me; the watermark seems to be getting ignored! Here is the data put into Kafka: ``` +---++ |value |key |

Re: Topics for Spark online classes & webinars

2023-03-15 Thread Mich Talebzadeh
Understood, Nitin. It would be wrong to act against one's conviction. I am sure we can find a way around providing the contents. Regards, Mich Talebzadeh, Lead Solutions Architect/Engineering Lead, Palantir Technologies Limited

Re: Topics for Spark online classes & webinars

2023-03-15 Thread Mich Talebzadeh
Hi Nitin, LinkedIn is more of a professional medium. FYI, I am only a member of LinkedIn, no Facebook, etc. There is no reason for you NOT to create a profile for yourself on LinkedIn :) https://www.linkedin.com/help/linkedin/answer/a1338223/sign-up-to-join-linkedin?lang=en see you there as

Re: Topics for Spark online classes & webinars

2023-03-15 Thread Bjørn Jørgensen
Great. One case that I hope can be better documented, especially now that we have the Pandas API on Spark and many potential new users coming from Pandas, is how to start Spark with all available memory and CPU. I use this function to do this in a notebook. import multiprocessing import os import sys
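The function from this message is cut off in the archive preview, so the following is not the author's code. It is only a minimal sketch of the idea described (use every CPU core and most of the machine's memory for a local Spark session). The helper name `local_spark_settings` and the `memory_fraction` default of 0.8 are assumptions made for illustration, and the memory lookup assumes a Linux-like system where `os.sysconf` exposes `SC_PAGE_SIZE` and `SC_PHYS_PAGES`.

```python
import multiprocessing
import os


def local_spark_settings(memory_fraction=0.8):
    """Return Spark settings that use every CPU core and most of the RAM.

    Hypothetical helper: the name and the memory_fraction default are
    assumptions for illustration, not taken from the original post.
    """
    cores = multiprocessing.cpu_count()
    # Total physical memory in bytes (these sysconf names exist on Linux).
    total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    # Leave some headroom for the OS; never go below 1 GB.
    driver_gb = max(1, int(total_bytes * memory_fraction / 1024**3))
    return {
        "spark.master": f"local[{cores}]",
        "spark.driver.memory": f"{driver_gb}g",
    }


if __name__ == "__main__":
    for key, value in local_spark_settings().items():
        print(key, "=", value)
```

In a notebook, each key/value pair could then be passed to `SparkSession.builder.config(key, value)` before calling `getOrCreate()`. The driver memory setting likely only takes effect if no Spark JVM is already running in that Python process.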

Re: Topics for Spark online classes & webinars

2023-03-15 Thread Denny Lee
Thanks Mich for tackling this! I encourage everyone to add to the list so we can have a comprehensive list of topics, eh?! On Wed, Mar 15, 2023 at 10:27 Mich Talebzadeh wrote: > Hi all, > > Thanks to @Denny Lee to give access to > > https://www.linkedin.com/company/apachespark/ > > and

Re: Topics for Spark online classes & webinars

2023-03-15 Thread Mich Talebzadeh
Hi all, Thanks to @Denny Lee for giving access to https://www.linkedin.com/company/apachespark/ and for the contribution from @asma zgolli. You will see my post at the bottom. Please add anything else on topics to the list as a comment. We will then put them together in an article perhaps. Comments

Re: logging pickle files on local run of spark.ml Pipeline model

2023-03-15 Thread Sean Owen
Pickle won't work, but the others should. I think you are specifying an invalid path in both cases, but it is hard to say without more detail. On Wed, Mar 15, 2023, 9:13 AM Mnisi, Caleb wrote: > Good Day > > > > I am having trouble saving a spark.ml Pipeline model to a pickle file, > when running

logging pickle files on local run of spark.ml Pipeline model

2023-03-15 Thread Mnisi, Caleb
Good day, I am having trouble saving a spark.ml Pipeline model to a pickle file when running locally on my PC. I've tried a few ways to save the model: 1. mlflow.spark.log_model(artifact_path=experiment.artifact_location, spark_model=model, registered_model_name="myModel") * with