Thanks Josh. Looks promising. I will give it a try.

Thanks,
Aniket

On Mon, Dec 29, 2014, 9:55 PM Josh Rosen <rosenvi...@gmail.com> wrote:

> It's accessed through the `statusTracker` field on SparkContext.
>
> *Scala*:
>
>
> https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.SparkStatusTracker
>
> *Java*:
>
>
> https://spark.apache.org/docs/latest/api/java/org/apache/spark/api/java/JavaSparkStatusTracker.html
>
> Don't create new instances of this yourself; instead, use sc.statusTracker
> to obtain the current instance.
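>
> For example, a minimal sketch of reading stage-level progress from the
> tracker in the Scala shell, using the getActiveStageIds / getStageInfo
> methods from the docs linked above:
>
>   val tracker = sc.statusTracker
>   for (stageId <- tracker.getActiveStageIds();
>        stage   <- tracker.getStageInfo(stageId)) {
>     // SparkStageInfo exposes per-stage task counts
>     println(s"stage $stageId (${stage.name}): " +
>       s"${stage.numCompletedTasks} / ${stage.numTasks} tasks complete")
>   }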
>
> This API is missing a bunch of stuff that's available in the web UI, but
> it was designed so that we can add new methods without breaking binary
> compatibility. Although it would technically be a new feature, I'd hope
> that we can backport some additions to 1.2.1 since it's just adding a
> facade / stable interface in front of JobProgressListener and thus carries
> little to no risk of introducing new bugs elsewhere in Spark.
>
>
>
> On Mon, Dec 29, 2014 at 3:08 AM, Aniket Bhatnagar <
> aniket.bhatna...@gmail.com> wrote:
>
>> Hi Josh
>>
>> Is there documentation available for status API? I would like to use it.
>>
>> Thanks,
>> Aniket
>>
>>
>> On Sun Dec 28 2014 at 02:37:32 Josh Rosen <rosenvi...@gmail.com> wrote:
>>
>>> The console progress bars are implemented on top of a new stable "status
>>> API" that was added in Spark 1.2.  It's possible to query job progress
>>> using this interface (in older versions of Spark, you could implement a
>>> custom SparkListener and maintain counts of completed / running / failed
>>> tasks and stages yourself).
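>>>
>>> For the older approach, a rough sketch (counting finished tasks only; the
>>> class name here is just for illustration):
>>>
>>>   import java.util.concurrent.atomic.AtomicInteger
>>>   import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}
>>>
>>>   class TaskCountListener extends SparkListener {
>>>     val finishedTasks = new AtomicInteger(0)
>>>     // invoked on the driver whenever a task completes or fails
>>>     override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
>>>       finishedTasks.incrementAndGet()
>>>     }
>>>   }
>>>
>>>   val listener = new TaskCountListener
>>>   sc.addSparkListener(listener)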
>>>
>>> There are actually several subtleties involved in implementing
>>> "job-level" progress bars which behave in an intuitive way; there's a
>>> pretty extensive discussion of the challenges at
>>> https://github.com/apache/spark/pull/3009.  Also, check out the pull
>>> request for the console progress bars for an interesting design discussion
>>> around how they handle parallel stages:
>>> https://github.com/apache/spark/pull/3029.
>>>
>>> I'm not sure about the plumbing that would be necessary to display live
>>> progress updates in the IPython notebook UI, though.  The general pattern
>>> would probably involve a mapping to relate notebook cells to Spark jobs
>>> (you can do this with job groups, I think), plus some periodic timer that
>>> polls the driver for the status of the current job in order to update the
>>> progress bar.
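>>>
>>> Roughly, the driver-side half of that might look like this (a notebook
>>> frontend would run the poll on a timer and render a progress bar instead
>>> of printing; the group id "cell-1" is just an illustrative name):
>>>
>>>   // tag the work submitted for a given cell with a job group
>>>   sc.setJobGroup("cell-1", "work for notebook cell 1")
>>>   // ... run the cell's actions ...
>>>
>>>   // on each timer tick, ask the status tracker about that group
>>>   val tracker = sc.statusTracker
>>>   for (jobId   <- tracker.getJobIdsForGroup("cell-1");
>>>        job     <- tracker.getJobInfo(jobId);
>>>        stageId <- job.stageIds();
>>>        stage   <- tracker.getStageInfo(stageId)) {
>>>     println(s"job $jobId, stage $stageId: " +
>>>       s"${stage.numCompletedTasks} / ${stage.numTasks} tasks done")
>>>   }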
>>>
>>> For Spark 1.3, I'm working on designing a REST interface to access this
>>> type of job / stage / task progress information, as well as expanding the
>>> types of information exposed through the stable status API.
>>>
>>> - Josh
>>>
>>> On Thu, Dec 25, 2014 at 10:01 AM, Eric Friedman <
>>> eric.d.fried...@gmail.com> wrote:
>>>
>>>> Spark 1.2.0 is SO much more usable than previous releases -- many
>>>> thanks to the team for this release.
>>>>
>>>> A question about progress of actions.  I can see how things are
>>>> progressing using the Spark UI.  I can also see the nice ASCII art
>>>> animation on the Spark driver console.
>>>>
>>>> Has anyone come up with a way to accomplish something similar in an
>>>> IPython notebook using pyspark?
>>>>
>>>> Thanks
>>>> Eric
>>>>
>>>
>>>
>
