Re: Add python library

2020-06-08 Thread Patrick McCarthy
I've found Anaconda encapsulates modules and dependencies and such nicely, and you can deploy all the needed .so files and such by deploying a whole conda environment. I've used this method with success:

Add python library

2020-06-06 Thread Anwar AliKhan
"> Have you looked into this article? https://medium.com/@SSKahani/pyspark-applications-dependencies-99415e0df987" This is weird! I was hanging out at https://machinelearningmastery.com/start-here/ when I came across this post. The weird part is I was just wondering how I can take one

Re: Add python library with native code

2020-06-06 Thread Stone Zhong
Great, thank you Masood, will look into it.

Regards,
Stone

On Fri, Jun 5, 2020 at 7:47 PM Masood Krohy wrote:
> Not totally sure it's gonna help your use case, but I'd recommend that you
> consider these too:
>
> - pex A library and tool for

Re: Add python library with native code

2020-06-05 Thread Masood Krohy
Not totally sure it's gonna help your use case, but I'd recommend that you consider these too:

* pex: A library and tool for generating .pex (Python EXecutable) files
* cluster-pack: cluster-pack is a library on

Re: Add python library with native code

2020-06-05 Thread Stone Zhong
Thanks Dark. I looked at that article. I think the article describes approach B. Let me summarize both approaches:

A) Put the libraries on a network share, mount it on each node, and manually set PYTHONPATH in your code
B) In your code, manually install the necessary package using "pip install
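
For readers comparing the two approaches, here is a minimal sketch of what each might look like in Python. The exact pip command in the message is cut off in this listing, so the `--target` form below is an assumption, as are the helper names `use_shared_libs`, `pip_install_command`, and `install_at_runtime` and the example paths. Approach A is shown as an in-process `sys.path` change, which has the same effect as setting PYTHONPATH for the running interpreter.

```python
import subprocess
import sys


def use_shared_libs(mount_point):
    """Approach A (sketch): make a mounted share importable in this process.

    mount_point is a hypothetical directory that every node has mounted.
    """
    if mount_point not in sys.path:
        sys.path.insert(0, mount_point)


def pip_install_command(requirement, target_dir):
    """Approach B (sketch): build a pip command that installs into target_dir.

    Using --target is an assumption; the original message is truncated.
    """
    return [sys.executable, "-m", "pip", "install",
            "--target", target_dir, requirement]


def install_at_runtime(requirement, target_dir):
    """Run the install, then make the target directory importable."""
    subprocess.check_call(pip_install_command(requirement, target_dir))
    use_shared_libs(target_dir)
```

Either helper would need to run on each executor before the dependent import happens, not only on the driver.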

Re: Add python library with native code

2020-06-05 Thread Dark Crusader
Hi Stone, have you looked into this article? https://medium.com/@SSKahani/pyspark-applications-dependencies-99415e0df987 I haven't tried it with .so files; however, I did use the approach he recommends to install my other dependencies. I hope it helps. On Fri, Jun 5, 2020 at 1:12 PM Stone Zhong

Add python library with native code

2020-06-05 Thread Stone Zhong
Hi, my PySpark app depends on some Python libraries. That is not a problem: I pack all the dependencies into a file, libs.zip, and then call *sc.addPyFile("libs.zip")*, and it worked pretty well for a while. Then I encountered a problem: if any of my libraries has a binary file dependency (like
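
The workflow described above can be sketched as follows. This is an illustration, not the poster's actual script: the helper names `build_libs_zip`, `native_members`, and `ship_to_executors` are hypothetical, and `sc` is assumed to be an existing `pyspark.SparkContext`. The message is cut off before it names the binary dependency, so the sketch assumes compiled extension modules such as the .so files mentioned elsewhere in the thread. CPython's zipimport does not load extension modules from a zip archive, which is why checking for them before shipping can be useful.

```python
import zipfile
from pathlib import Path


def build_libs_zip(src_dir, zip_path):
    """Zip every file under src_dir, keeping package-relative paths.

    zip_path should live outside src_dir so the archive does not
    include itself.
    """
    src_dir = Path(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src_dir.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(src_dir).as_posix())
    return Path(zip_path)


def native_members(zip_path):
    """List archive members whose names look like native libraries."""
    with zipfile.ZipFile(zip_path) as zf:
        return [name for name in zf.namelist()
                if name.endswith((".so", ".pyd", ".dylib"))]


def ship_to_executors(sc, zip_path):
    """Distribute the archive; sc is an existing pyspark.SparkContext."""
    sc.addPyFile(str(zip_path))
```

If `native_members` returns a non-empty list, those files will travel inside the archive but are unlikely to be importable from it, which matches the problem the message starts to describe.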