[Spark 3.0.0] Job fails with NPE - worked in Spark 2.4.4

2020-07-23 Thread Neelesh Salian
Hi folks,

I've been trying to debug this issue: https://gist.github.com/nssalian/203e20432c2ed237717be28642b1871a

*Context:*

*The application (PySpark):*
1. Read a Hive table from the Metastore (running Hive 1.2.2).
2. Print the schema of the DataFrame read.
3. Do a show() on the df captured.

The above
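For readers following along, the three steps listed in this message could be sketched roughly as below. This is only an illustration, not the code from the gist: the function names `reproduce` and `build_session`, the app name, and the table name in the usage note are hypothetical, and it assumes a SparkSession that can reach the Hive Metastore.

```python
def reproduce(spark, table_name):
    """Run the three reported steps against an existing SparkSession.

    `table_name` is a placeholder; the original message does not name the table.
    """
    df = spark.table(table_name)  # 1. read the Hive table through the Metastore
    df.printSchema()              # 2. print the schema of the DataFrame
    df.show()                     # 3. show(), which triggers the actual read
    return df


def build_session():
    """Create a Hive-enabled session (assumes PySpark is installed and configured)."""
    # Imported lazily so `reproduce` can be exercised without a Spark installation.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName("hive-read-repro")  # hypothetical app name
        .enableHiveSupport()
        .getOrCreate()
    )
```

A call such as `reproduce(build_session(), "some_db.some_table")` would run the steps in order, with the table name swapped for a real one. Because printSchema() only needs table metadata while show() has to read data, a failure that appears only at step 3 points at the read path.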

Re: Spark books

2017-05-03 Thread Neelesh Salian
The Apache Spark documentation is good to begin with. All the programming guides, particularly.

On Wed, May 3, 2017 at 5:07 PM, ayan guha wrote:
> I would suggest do not buy any book, just start with databricks community
> edition
>
> On Thu, May 4, 2017 at 9:30 AM, Tobi

Re: Steps to Run Spark Scala job from Oozie on EC2 Hadoop cluster

2016-03-07 Thread Neelesh Salian
Hi Divya,

This link should have the details that you need to begin using the Spark Action on Oozie: https://oozie.apache.org/docs/4.2.0/DG_SparkActionExtension.html

Thanks.

On Mon, Mar 7, 2016 at 7:52 AM, Benjamin Kim wrote:
> To comment…
>
> At my company, we have not

Re: Spark Performance on Yarn

2015-04-22 Thread Neelesh Salian
Does it still hit the memory limit for the container? An expensive transformation?

On Wed, Apr 22, 2015 at 8:45 AM, Ted Yu yuzhih...@gmail.com wrote:
> In master branch, overhead is now 10%. That would be 500 MB
>
> FYI
>
> On Apr 22, 2015, at 8:26 AM, nsalian neeleshssal...@gmail.com wrote:
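To make the 10% overhead figure quoted above concrete, here is a small arithmetic sketch. It is not Spark's own code. The 5000 MB executor size is an assumption inferred from the "500 MB" in the quoted reply, the 384 MB floor is the documented minimum for the YARN executor memory overhead as far as I recall, and the function name is hypothetical.

```python
MIN_OVERHEAD_MB = 384  # assumed floor for the YARN memory overhead


def yarn_container_mb(executor_memory_mb, overhead_percent=10,
                      min_overhead_mb=MIN_OVERHEAD_MB):
    """Return (overhead_mb, total_container_mb) for one executor.

    Integer arithmetic is used so the result does not depend on float rounding.
    """
    overhead_mb = max(executor_memory_mb * overhead_percent // 100, min_overhead_mb)
    return overhead_mb, executor_memory_mb + overhead_mb


# Assumed 5000 MB executor: 10% overhead gives the 500 MB mentioned in the thread.
overhead, total = yarn_container_mb(5000)
print(overhead, total)
```

The container YARN has to grant is the executor memory plus this overhead, so a job that fits its heap setting can still be killed if off-heap usage pushes it past the total.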