Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-27 Thread Henry Robinson
On Mon, 27 Aug 2018 at 13:04, Ankur Gupta wrote: > Thanks all for your responses. > So I believe a solution that accomplishes the following will be a good solution: > 1. Writes logs to HDFS asynchronously. In the limit, this could perform just as slowly at shutdown time as synchronous

Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-27 Thread Ankur Gupta
Thanks all for your responses. So I believe a solution that accomplishes the following will be a good solution: 1. Writes logs to HDFS asynchronously; 2. Writes logs at INFO level while ensuring that console logs are written at WARN level by default (in shell mode); 3. Optionally, moves this file
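The actual driver-side writer would live in the JVM logging layer (log4j), but the pattern behind points 1 and 2 can be sketched with Python's standard-library QueueHandler/QueueListener: logging calls only enqueue records, a background thread drains the queue, and the file and console handlers carry different levels. The file name below is a stand-in for the HDFS-backed driver log; everything here is illustrative, not the proposed implementation.

    import logging
    import logging.handlers
    import queue
    import sys

    log_queue = queue.Queue(-1)  # unbounded; a bounded queue trades memory for possible drops

    # Stand-in for the HDFS-backed driver log file, written at INFO (point 2).
    file_handler = logging.FileHandler("driver.log")
    file_handler.setLevel(logging.INFO)

    # Console stays at WARN by default, as proposed for shell mode.
    console_handler = logging.StreamHandler(sys.stderr)
    console_handler.setLevel(logging.WARNING)

    # The listener drains the queue on a background thread, so logging calls
    # on the caller's thread only enqueue (point 1: asynchronous writes).
    listener = logging.handlers.QueueListener(
        log_queue, file_handler, console_handler, respect_handler_level=True)

    root = logging.getLogger()
    root.setLevel(logging.INFO)
    root.addHandler(logging.handlers.QueueHandler(log_queue))

    listener.start()
    root.info("written to the file only")
    root.warning("written to the file and the console")
    # stop() drains whatever is still queued, which is where the shutdown-cost
    # concern raised above comes in: a large backlog still has to be flushed.
    listener.stop()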

Re: SparkContext singleton get w/o create?

2018-08-27 Thread Andrew Melo
Hi, I'm a long-time listener, first-time committer to Spark, so this is good to get my feet wet. I'm particularly interested in SPARK-23836, which is an itch I may want to dive into and scratch myself in the next month or so since it's pretty painful for our use-case. Thanks! Andrew On Mon, Aug

Re: SparkContext singleton get w/o create?

2018-08-27 Thread Holden Karau
Sure, I don't think you should wait on that being merged in. If you want to take the JIRA go ahead (although if you're already familiar with the Spark code base it might make sense to leave it as a starter issue for someone who is just getting started). On Mon, Aug 27, 2018 at 12:18 PM Andrew

Re: SparkContext singleton get w/o create?

2018-08-27 Thread Andrew Melo
Hi Holden, I'm agnostic to the approach (though it seems cleaner to have an explicit API for it). If you would like, I can take that JIRA and implement it (should be a 3-line function). Cheers Andrew On Mon, Aug 27, 2018 at 2:14 PM, Holden Karau wrote: > Seems reasonable. We should probably

Re: SparkContext singleton get w/o create?

2018-08-27 Thread Holden Karau
Seems reasonable. We should probably add `getActiveSession` to the PySpark API (filed a starter JIRA https://issues.apache.org/jira/browse/SPARK-25255) On Mon, Aug 27, 2018 at 12:09 PM Andrew Melo wrote: > Hello Sean, others - > Just to confirm, is it OK for client applications to access
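A minimal sketch of what such a PySpark `getActiveSession` could look like, assuming it simply delegates to the JVM-side SparkSession.getActiveSession through the active SparkContext's Py4J gateway; the function below is illustrative, not the eventual implementation tracked in SPARK-25255.

    from pyspark import SparkContext
    from pyspark.sql import SparkSession

    def get_active_session():
        """Return the active SparkSession if one exists, without creating one."""
        sc = SparkContext._active_spark_context
        if sc is None:
            return None
        # Ask the JVM side whether a session is active; it returns a Scala Option.
        jvm_session = sc._jvm.SparkSession.getActiveSession()
        if jvm_session.isDefined():
            return SparkSession(sc, jvm_session.get())
        return None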

Re: SparkContext singleton get w/o create?

2018-08-27 Thread Andrew Melo
Hello Sean, others - Just to confirm, is it OK for client applications to access SparkContext._active_spark_context, if it wraps the accesses in `with SparkContext._lock:`? If that's acceptable to Spark, I'll implement the modifications in the Jupyter extensions. Thanks! Andrew On Tue, Aug 7,
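For reference, the access pattern being asked about is short; a minimal sketch of what the Jupyter-extension side could look like, assuming it is acceptable to touch these private PySpark attributes (which is exactly the question above):

    from pyspark import SparkContext

    def existing_spark_context():
        """Return the already-created SparkContext, or None, without creating a new one."""
        with SparkContext._lock:
            return SparkContext._active_spark_context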

Re: no logging in pyspark code?

2018-08-27 Thread Imran Rashid
Ah, great, thanks! Sorry I missed that; I'll watch that JIRA. On Mon, Aug 27, 2018 at 12:41 PM Ilan Filonenko wrote: > A JIRA has been opened up on this exact topic: SPARK-25236, a few days ago, after seeing another case of print(_,

Re: no logging in pyspark code?

2018-08-27 Thread Ilan Filonenko
A JIRA has been opened up on this exact topic: SPARK-25236, a few days ago, after seeing another case of print(_, file=sys.stderr) in a recent review. I agree that we should include logging for PySpark workers. On Mon, Aug 27, 2018 at 1:29

no logging in pyspark code?

2018-08-27 Thread Imran Rashid
Another question on pyspark code: how come there is no logging at all? Does Python logging have an unreasonable overhead, or is it impossible to configure, or something? I'm really surprised nobody has ever wanted to be able to turn on some debug or trace logging in pyspark by just configuring a
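For illustration, a minimal sketch of what module-level logging could look like in PySpark worker code in place of print(_, file=sys.stderr); the logger name and setup function are hypothetical, and the handler keeps writing to stderr so the executor still captures worker output as it does today.

    import logging
    import sys

    # Hypothetical module-level logger for a worker-side module such as worker.py.
    logger = logging.getLogger("pyspark.worker")

    def configure_worker_logging(level=logging.WARNING):
        """Attach a stderr handler so worker output still lands in the executor logs."""
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(level)

    # Instead of: print("could not open socket", file=sys.stderr)
    # logger.error("could not open socket")
    # Debug or trace output then becomes opt-in via the configured level:
    # logger.debug("processing partition %d", partition_id)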

Why is View logical operator not a UnaryNode explicitly?

2018-08-27 Thread Jacek Laskowski
Hi, I've just come across the View logical operator, which does not explicitly extend UnaryNode, i.e. declare "extends UnaryNode". Why? https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala?utf8=%E2%9C%93#L460-L463