Hello Owen,
Thank you for your prompt reply!
We will check it out.

best,
Atheer Alabdullatif
________________________________
From: Sean Owen <sro...@gmail.com>
Sent: Wednesday, November 24, 2021 5:06 PM
To: Atheer Alabdullatif <a.alabdulla...@lean.sa>
Cc: user@spark.apache.org <user@spark.apache.org>; Data Engineering <dataengineer...@lean.sa>
Subject: Re: [issue] not able to add external libs to pyspark job while using spark-submit

That's not how you add a library. From the docs: 
https://spark.apache.org/docs/latest/api/python/user_guide/python_packaging.html
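
For a pure-Python dependency like this, the guide boils down to shipping the package with the job instead of expecting it on the cluster nodes. A minimal sketch of that, assuming the module has been zipped so it sits at the root of the archive (deps.zip and the file names below are placeholders):

    # Option A: ship the archive at submit time (placeholder names)
    #   spark-submit --py-files deps.zip sparkjob.py
    # Option B: ship it from the job itself, before any executor-side import
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("example").getOrCreate()
    spark.sparkContext.addPyFile("deps.zip")  # copied to every executor and added to sys.path

    import configparser  # resolves on the driver and the executors once the file is shipped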

On Wed, Nov 24, 2021 at 8:02 AM Atheer Alabdullatif <a.alabdulla...@lean.sa> wrote:
Dear Spark team,
I hope this email finds you well.



I am using PySpark 3.0 and facing an issue adding the external library [configparser] while running a job with [spark-submit] & [yarn].

issue:


import configparser
ImportError: No module named configparser
21/11/24 08:54:38 INFO util.ShutdownHookManager: Shutdown hook called

solutions I tried:

1- installing the library's src files and adding them to the session using [addPyFile]:

  *   file structure:

-- main dir
   -- subdir
      -- libs
         -- configparser-5.1.0
            -- src
               -- configparser.py
         -- configparser.zip
      -- sparkjob.py

1.a zip file:

    spark = SparkSession.builder \
        .appName(jobname + '_' + table) \
        .config("spark.mongodb.input.uri", uri + "." + table) \
        .config("spark.mongodb.input.sampleSize", 9900000) \
        .getOrCreate()

    spark.sparkContext.addPyFile('/maindir/subdir/libs/configparser.zip')
    df = spark.read.format("mongo").load()
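
Note that a zip shipped with [addPyFile] is only importable if the module sits at the root of the archive; if configparser.zip was built from the libs directory, configparser.py ends up nested under configparser-5.1.0/src/ inside it and [import configparser] cannot see it. A small sketch of rebuilding the archive with the module at its root, using the paths from the layout above:

    import zipfile

    # put configparser.py at the root of the zip so that `import configparser`
    # can resolve it once the archive is on sys.path
    src = '/maindir/subdir/libs/configparser-5.1.0/src/configparser.py'
    with zipfile.ZipFile('/maindir/subdir/libs/configparser.zip', 'w') as zf:
        zf.write(src, arcname='configparser.py')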

1.b python file:

    spark = SparkSession.builder \
        .appName(jobname + '_' + table) \
        .config("spark.mongodb.input.uri", uri + "." + table) \
        .config("spark.mongodb.input.sampleSize", 9900000) \
        .getOrCreate()

    spark.sparkContext.addPyFile('maindir/subdir/libs/configparser-5.1.0/src/configparser.py')
    df = spark.read.format("mongo").load()
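
A quick way to check whether [addPyFile] actually made the module visible to the executors (and not only to the driver) is to import it inside a tiny task; a sketch:

    # run a trivial job that imports the module on the executors
    def probe(_):
        import configparser
        return configparser.__file__

    print(spark.sparkContext.parallelize(range(2), 2).map(probe).collect())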


2- using os library

import os


def install_libs():
    '''
    this function is used to install external python libs in yarn
    '''
    os.system("pip3 install configparser")


if __name__ == "__main__":

    # install libs
    install_libs()
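
Note that install_libs() runs only in the driver's Python process, so a pip3 install there does not make the package available to the YARN executors. The packaging guide linked above instead ships a packed environment together with the job; roughly (a sketch only, all names are placeholders, and it assumes conda and conda-pack are available):

    # pack an environment that already contains configparser, then submit with it:
    #
    #   conda create -y -n pyspark_env -c conda-forge python=3.8 conda-pack
    #   conda activate pyspark_env
    #   pip install configparser          # plus any other job dependencies
    #   conda pack -f -o pyspark_env.tar.gz
    #
    #   export PYSPARK_DRIVER_PYTHON=python
    #   export PYSPARK_PYTHON=./environment/bin/python
    #   spark-submit --archives pyspark_env.tar.gz#environment sparkjob.py
    #
    # YARN unpacks the archive into each container's working directory, so every
    # executor runs from the same environment with the library installed.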


We value your support.

best,

Atheer Alabdullatif









