How to display column names in spark-sql output

2015-12-11 Thread Ashwin Shankar
Hi, when we run spark-sql, is there a way to get column names/headers with the result? -- Thanks, Ashwin
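A hedged sketch of the usual fix: the spark-sql CLI reuses the Hive CLI's header setting, so headers can typically be enabled with `hive.cli.print.header` (whether `--hiveconf` is honored can vary by Spark version; treat this as a sketch, not a guaranteed behavior):

```shell
# Enable column headers for a one-off query via the Hive CLI property:
spark-sql --hiveconf hive.cli.print.header=true -e "SELECT 1 AS answer"

# Or toggle it inside an already-running spark-sql session:
#   spark-sql> SET hive.cli.print.header=true;
```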

Problem with pyspark on Docker talking to YARN cluster

2015-06-10 Thread Ashwin Shankar
All, I was wondering if any of you have solved this problem: I have pyspark (IPython mode) running on Docker, talking to a YARN cluster (AM/executors are NOT running on Docker). When I start pyspark in the Docker container, it binds to port 49460. Once the app is submitted to YARN, the app (AM)
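A common workaround for this class of problem, sketched under assumptions: the YARN-side executors/AM must be able to connect back to the driver inside the container, so people often use host networking and pin the driver ports. The config property names (`spark.driver.port`, `spark.driver.host`, `spark.blockManager.port`) are standard Spark settings; the image name and port values below are illustrative assumptions:

```shell
# Run the pyspark driver with host networking so YARN executors can
# reach it, and fix the driver ports so they can be opened in firewalls:
docker run --net=host my-pyspark-image \
  pyspark \
    --conf spark.driver.host=$(hostname -f) \
    --conf spark.driver.port=51000 \
    --conf spark.blockManager.port=51001
```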

How to pass system properties in spark ?

2015-06-03 Thread Ashwin Shankar
Hi, I'm trying to use property substitution in my log4j.properties, so that I can choose where to write Spark logs at runtime. The problem is that a system property passed to spark-shell doesn't seem to be getting propagated to log4j. Here is log4j.properties (partial) with a parameter
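For context, log4j 1.x expands `${...}` placeholders in a properties file from JVM system properties, so the property has to reach the driver JVM itself. A sketch, assuming a hypothetical property name `spark.log.dir` (not a built-in Spark setting):

```shell
# log4j.properties fragment (log4j 1.x substitutes ${...} from system props):
#   log4j.appender.file=org.apache.log4j.FileAppender
#   log4j.appender.file.File=${spark.log.dir}/spark.log
#
# Pass the system property to the driver JVM when launching spark-shell:
spark-shell --driver-java-options "-Dspark.log.dir=/var/log/myapp"
```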

Spark on Yarn : Map outputs lifetime ?

2015-05-12 Thread Ashwin Shankar
Hi, in Spark on YARN, when running spark_shuffle as an auxiliary service on the NodeManager, do the map spills of a stage get cleaned up once the next stage completes, or are they preserved until the app completes (i.e., until all stages complete)? -- Thanks, Ashwin
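For reference, this is the external shuffle service setup the question refers to; with it enabled, shuffle files are served by the NodeManager rather than the executor. This is a sketch of the standard configuration, not an answer to the lifetime question itself:

```shell
# yarn-site.xml on each NodeManager:
#   yarn.nodemanager.aux-services = spark_shuffle
#   yarn.nodemanager.aux-services.spark_shuffle.class =
#       org.apache.spark.network.yarn.YarnShuffleService
#
# Spark side (spark-defaults.conf or --conf):
#   spark.shuffle.service.enabled = true
```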

Building spark targz

2014-11-12 Thread Ashwin Shankar
Hi, I just cloned Spark from GitHub and I'm trying to build it to generate a tarball. I'm doing: mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -DskipTests clean package Although the build is successful, I don't see the tar.gz generated. Am I running the wrong command? -- Thanks,
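A hedged note on the usual answer: `mvn package` by itself does not produce a distribution tarball. In the Spark source tree of that era, the distribution script at the repo root did, accepting the same Maven profiles:

```shell
# Build a binary distribution tarball (spark-*.tgz in the top-level dir):
./make-distribution.sh --tgz -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive
```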

Re: Building spark targz

2014-11-12 Thread Ashwin Shankar
Just making sure, but are you looking for the tar in the assembly/target dir? On Wed, Nov 12, 2014 at 3:14 PM, Ashwin Shankar ashwinshanka...@gmail.com wrote: Hi, I just cloned Spark from GitHub and I'm trying to build it to generate a tarball. I'm doing: mvn -Pyarn -Phadoop-2.4 -Dhadoop.version

Multitenancy in Spark - within/across spark context

2014-10-22 Thread Ashwin Shankar
Hi Spark devs/users, One of the things we are investigating here at Netflix is whether Spark would suit our ETL needs, and one of our requirements is multi-tenancy. I did read the official doc http://spark.apache.org/docs/latest/job-scheduling.html and the book, but I'm still not clear on certain
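For within-application multi-tenancy, the job-scheduling doc linked above describes fair scheduler pools. A sketch of that setup (the pool name "etl" and the weight/minShare values are illustrative assumptions):

```shell
# spark-defaults.conf (or pass via --conf):
#   spark.scheduler.mode=FAIR
#   spark.scheduler.allocation.file=/path/to/fairscheduler.xml
#
# fairscheduler.xml:
#   <allocations>
#     <pool name="etl">
#       <schedulingMode>FAIR</schedulingMode>
#       <weight>2</weight>
#       <minShare>4</minShare>
#     </pool>
#   </allocations>
#
# In application code, assign the current thread's jobs to a pool:
#   sc.setLocalProperty("spark.scheduler.pool", "etl")
```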

Re: Multitenancy in Spark - within/across spark context

2014-10-22 Thread Ashwin Shankar
On Oct 22, 2014 at 11:47 AM, Ashwin Shankar ashwinshanka...@gmail.com wrote: Here are my questions: 1. Sharing a Spark context: how exactly can multiple users share the cluster using the same Spark context? That's not something you might want to do usually. In general, a SparkContext