[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2020-04-21 Thread Marcelo Masiero Vanzin (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089190#comment-17089190
 ] 

Marcelo Masiero Vanzin commented on SPARK-1537:
---

Well, the only thing to start with is the existing SHS code. 
EventLoggingListener + FsHistoryProvider.

> Add integration with Yarn's Application Timeline Server
> ---
>
> Key: SPARK-1537
> URL: https://issues.apache.org/jira/browse/SPARK-1537
> Project: Spark
>  Issue Type: New Feature
>  Components: YARN
>Reporter: Marcelo Masiero Vanzin
>Priority: Major
> Attachments: SPARK-1537.txt, spark-1573.patch
>
>
> It would be nice to have Spark integrate with Yarn's Application Timeline 
> Server (see YARN-321, YARN-1530). This would allow users running Spark on 
> Yarn to have a single place to go for all their history needs, and avoid 
> having to manage a separate service (Spark's built-in server).
> At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
> although there is still some ongoing work. But the basics are there, and I 
> wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2020-04-21 Thread Daniel Templeton (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089159#comment-17089159
 ] 

Daniel Templeton commented on SPARK-1537:
-

Thanks for the response, [~vanzin].  Yeah, I think we would be interested in 
exploring the work involved to do the integration.  We're in the process of 
introducing Spark into Hadoop clusters that primarily run Scalding today.  
We're using ATSv2 as the store for all of the Scalding metrics, so it would 
make sense for us to do the same with Spark.

Any required reading that we should do as we decide how best to tackle this?  
Pointers, tips, tricks, potholes, or any other info would be welcome.  Thanks!

> Add integration with Yarn's Application Timeline Server
> ---
>
> Key: SPARK-1537
> URL: https://issues.apache.org/jira/browse/SPARK-1537
> Project: Spark
>  Issue Type: New Feature
>  Components: YARN
>Reporter: Marcelo Masiero Vanzin
>Priority: Major
> Attachments: SPARK-1537.txt, spark-1573.patch
>
>
> It would be nice to have Spark integrate with Yarn's Application Timeline 
> Server (see YARN-321, YARN-1530). This would allow users running Spark on 
> Yarn to have a single place to go for all their history needs, and avoid 
> having to manage a separate service (Spark's built-in server).
> At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
> although there is still some ongoing work. But the basics are there, and I 
> wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2020-04-21 Thread Marcelo Masiero Vanzin (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089137#comment-17089137
 ] 

Marcelo Masiero Vanzin commented on SPARK-1537:
---

[~templedf] sorry forgot to reply.

ATSv1 wasn't a good match for this, and by the time ATSv2 was developed, 
interest in this feature had long lost traction in the Spark community. So this 
was closed.

Also you probably can do this without requiring the code to live in Spark.

But if you actually want to contribute the integration, there's nothing 
preventing you from opening a new bug and posting a PR.

> Add integration with Yarn's Application Timeline Server
> ---
>
> Key: SPARK-1537
> URL: https://issues.apache.org/jira/browse/SPARK-1537
> Project: Spark
>  Issue Type: New Feature
>  Components: YARN
>Reporter: Marcelo Masiero Vanzin
>Priority: Major
> Attachments: SPARK-1537.txt, spark-1573.patch
>
>
> It would be nice to have Spark integrate with Yarn's Application Timeline 
> Server (see YARN-321, YARN-1530). This would allow users running Spark on 
> Yarn to have a single place to go for all their history needs, and avoid 
> having to manage a separate service (Spark's built-in server).
> At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
> although there is still some ongoing work. But the basics are there, and I 
> wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2020-04-17 Thread Daniel Templeton (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085922#comment-17085922
 ] 

Daniel Templeton commented on SPARK-1537:
-

[~vanzin], is that because the community has invested instead in making SHS 
that central metrics store and UI?  What about clusters with mixed workloads?

> Add integration with Yarn's Application Timeline Server
> ---
>
> Key: SPARK-1537
> URL: https://issues.apache.org/jira/browse/SPARK-1537
> Project: Spark
>  Issue Type: New Feature
>  Components: YARN
>Reporter: Marcelo Masiero Vanzin
>Priority: Major
> Attachments: SPARK-1537.txt, spark-1573.patch
>
>
> It would be nice to have Spark integrate with Yarn's Application Timeline 
> Server (see YARN-321, YARN-1530). This would allow users running Spark on 
> Yarn to have a single place to go for all their history needs, and avoid 
> having to manage a separate service (Spark's built-in server).
> At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
> although there is still some ongoing work. But the basics are there, and I 
> wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2016-01-01 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15076344#comment-15076344
 ] 

Apache Spark commented on SPARK-1537:
-

User 'steveloughran' has created a pull request for this issue:
https://github.com/apache/spark/pull/10545

> Add integration with Yarn's Application Timeline Server
> ---
>
> Key: SPARK-1537
> URL: https://issues.apache.org/jira/browse/SPARK-1537
> Project: Spark
>  Issue Type: New Feature
>  Components: YARN
>Reporter: Marcelo Vanzin
> Attachments: SPARK-1537.txt, spark-1573.patch
>
>
> It would be nice to have Spark integrate with Yarn's Application Timeline 
> Server (see YARN-321, YARN-1530). This would allow users running Spark on 
> Yarn to have a single place to go for all their history needs, and avoid 
> having to manage a separate service (Spark's built-in server).
> At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
> although there is still some ongoing work. But the basics are there, and I 
> wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-10-20 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14965230#comment-14965230
 ] 

Apache Spark commented on SPARK-1537:
-

User 'steveloughran' has created a pull request for this issue:
https://github.com/apache/spark/pull/9182

> Add integration with Yarn's Application Timeline Server
> ---
>
> Key: SPARK-1537
> URL: https://issues.apache.org/jira/browse/SPARK-1537
> Project: Spark
>  Issue Type: New Feature
>  Components: YARN
>Reporter: Marcelo Vanzin
> Attachments: SPARK-1537.txt, spark-1573.patch
>
>
> It would be nice to have Spark integrate with Yarn's Application Timeline 
> Server (see YARN-321, YARN-1530). This would allow users running Spark on 
> Yarn to have a single place to go for all their history needs, and avoid 
> having to manage a separate service (Spark's built-in server).
> At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
> although there is still some ongoing work. But the basics are there, and I 
> wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-09-14 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14743421#comment-14743421
 ] 

Apache Spark commented on SPARK-1537:
-

User 'steveloughran' has created a pull request for this issue:
https://github.com/apache/spark/pull/8744

> Add integration with Yarn's Application Timeline Server
> ---
>
> Key: SPARK-1537
> URL: https://issues.apache.org/jira/browse/SPARK-1537
> Project: Spark
>  Issue Type: New Feature
>  Components: YARN
>Reporter: Marcelo Vanzin
> Attachments: SPARK-1537.txt, spark-1573.patch
>
>
> It would be nice to have Spark integrate with Yarn's Application Timeline 
> Server (see YARN-321, YARN-1530). This would allow users running Spark on 
> Yarn to have a single place to go for all their history needs, and avoid 
> having to manage a separate service (Spark's built-in server).
> At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
> although there is still some ongoing work. But the basics are there, and I 
> wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-06-09 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579152#comment-14579152
 ] 

Steve Loughran commented on SPARK-1537:
---

Full application log. Application hasn't actually stopped, which is interesting.
{code}
$ dist/bin/spark-submit  \
--class 
org.apache.spark.examples.SparkPi \
--properties-file 
../clusterconfigs/clusters/devix/spark/spark-defaults.conf \
--master yarn-client \
--executor-memory 128m \
--num-executors 1 \
--executor-cores 1 \
--driver-memory 128m \

dist/lib/spark-examples-1.5.0-SNAPSHOT-hadoop2.6.0.jar 12
2015-06-09 17:01:59,596 [main] INFO  spark.SparkContext 
(Logging.scala:logInfo(59)) - Running Spark version 1.5.0-SNAPSHOT
2015-06-09 17:02:01,309 [sparkDriver-akka.actor.default-dispatcher-2] INFO  
slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started
2015-06-09 17:02:01,359 [sparkDriver-akka.actor.default-dispatcher-2] INFO  
Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting
2015-06-09 17:02:01,542 [sparkDriver-akka.actor.default-dispatcher-2] INFO  
Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on 
addresses :[akka.tcp://sparkDriver@192.168.1.86:51476]
2015-06-09 17:02:01,549 [main] INFO  util.Utils (Logging.scala:logInfo(59)) - 
Successfully started service 'sparkDriver' on port 51476.
2015-06-09 17:02:01,568 [main] INFO  spark.SparkEnv (Logging.scala:logInfo(59)) 
- Registering MapOutputTracker
2015-06-09 17:02:01,587 [main] INFO  spark.SparkEnv (Logging.scala:logInfo(59)) 
- Registering BlockManagerMaster
2015-06-09 17:02:01,831 [main] INFO  spark.HttpServer 
(Logging.scala:logInfo(59)) - Starting HTTP Server
2015-06-09 17:02:01,891 [main] INFO  util.Utils (Logging.scala:logInfo(59)) - 
Successfully started service 'HTTP file server' on port 51477.
2015-06-09 17:02:01,905 [main] INFO  spark.SparkEnv (Logging.scala:logInfo(59)) 
- Registering OutputCommitCoordinator
2015-06-09 17:02:02,038 [main] INFO  util.Utils (Logging.scala:logInfo(59)) - 
Successfully started service 'SparkUI' on port 4040.
2015-06-09 17:02:02,039 [main] INFO  ui.SparkUI (Logging.scala:logInfo(59)) - 
Started SparkUI at http://192.168.1.86:4040
2015-06-09 17:02:03,071 [main] INFO  spark.SparkContext 
(Logging.scala:logInfo(59)) - Added JAR 
file:/Users/stevel/Projects/Hortonworks/Projects/sparkwork/spark/dist/lib/spark-examples-1.5.0-SNAPSHOT-hadoop2.6.0.jar
 at 
http://192.168.1.86:51477/jars/spark-examples-1.5.0-SNAPSHOT-hadoop2.6.0.jar 
with timestamp 1433865723062
2015-06-09 17:02:03,691 [main] INFO  impl.TimelineClientImpl 
(TimelineClientImpl.java:serviceInit(285)) - Timeline service address: 
http://devix.cotham.uk:8188/ws/v1/timeline/
2015-06-09 17:02:03,808 [main] INFO  client.RMProxy 
(RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at 
devix.cotham.uk/192.168.1.134:8050
2015-06-09 17:02:04,577 [main] INFO  yarn.Client (Logging.scala:logInfo(59)) - 
Requesting a new application from cluster with 1 NodeManagers
2015-06-09 17:02:04,637 [main] INFO  yarn.Client (Logging.scala:logInfo(59)) - 
Verifying our application has not requested more than the maximum memory 
capability of the cluster (2048 MB per container)
2015-06-09 17:02:04,637 [main] INFO  yarn.Client (Logging.scala:logInfo(59)) - 
Will allocate AM container, with 896 MB memory including 384 MB overhead
2015-06-09 17:02:04,638 [main] INFO  yarn.Client (Logging.scala:logInfo(59)) - 
Setting up container launch context for our AM
2015-06-09 17:02:04,643 [main] INFO  yarn.Client (Logging.scala:logInfo(59)) - 
Preparing resources for our AM container
2015-06-09 17:02:05,096 [main] WARN  shortcircuit.DomainSocketFactory 
(DomainSocketFactory.java:init(116)) - The short-circuit local reads feature 
cannot be used because libhadoop cannot be loaded.
2015-06-09 17:02:05,106 [main] DEBUG yarn.YarnSparkHadoopUtil 
(Logging.scala:logDebug(63)) - delegation token renewer is: 
rm/devix.cotham.uk@COTHAM
2015-06-09 17:02:05,107 [main] INFO  yarn.YarnSparkHadoopUtil 
(Logging.scala:logInfo(59)) - getting token for namenode: 
hdfs://devix.cotham.uk:8020/user/stevel/.sparkStaging/application_1433777033372_0005
2015-06-09 17:02:06,129 [main] DEBUG yarn.Client (Logging.scala:logDebug(63)) - 
HiveMetaStore configured in localmode
2015-06-09 17:02:06,130 [main] DEBUG yarn.Client (Logging.scala:logDebug(63)) - 
HBase Class not found: java.lang.ClassNotFoundException: 
org.apache.hadoop.hbase.HBaseConfiguration
2015-06-09 17:02:06,225 

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-05-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541589#comment-14541589
 ] 

Steve Loughran commented on SPARK-1537:
---

+ YARN-3539 is resolved; the [v1 timeline 
|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md#Timeline_Server_REST_API_v1]
 is now defined and declared one of the supported REST APIs.

I'm also removing YARN-2423 as a dependency; the latest patch does this itself

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-04-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492303#comment-14492303
 ] 

Steve Loughran commented on SPARK-1537:
---

HADOOP-11826 patches the hadoop compatibility document to add timeline server 
to the list of stable APIs.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-04-08 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485258#comment-14485258
 ] 

Apache Spark commented on SPARK-1537:
-

User 'steveloughran' has created a pull request for this issue:
https://github.com/apache/spark/pull/5423

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-03-29 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14385890#comment-14385890
 ] 

Steve Loughran commented on SPARK-1537:
---

# I've just tried to see where YARN-2444 stands; I can't replicate it in trunk 
but I've submitted the tests to verify that it isn't there.
# for YARN-2423 Spark seems kind of trapped. It needs an api tagged as 
public/stable; Robert's patch has the API, except it's being rejected on the 
basis that ATSv2 will break it. So it can't be tagged as stable. So there's 
no API for GET operations until some undefined time {{t1   now()}} —and then, 
only for Hadoop versions with it. Which implies it won't get picked up by Spark 
for a long time.

I think we need to talk to the YARN dev team and see what can be done here. 
Even if there's no API client bundled into YARN, unless the v1 API and its 
paths beginning {{/ws/v1/timeline/}} are going to go away, then a REST client 
is possible; it may just have to be done spark-side, where at least it can be 
made resilient to hadoop versions. 


 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329664#comment-14329664
 ] 

Zhan Zhang commented on SPARK-1537:
---

[~vanzin] If you don't have bandwidth, or don't know how to move forward with 
this JIRA after a long time. I don't mind to take it over.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329649#comment-14329649
 ] 

Zhan Zhang commented on SPARK-1537:
---

[~vanzin] Thanks for the comments.  I  don't understand you keep saying my 
code does not have many differences form your code.   We are working for 
apache project, and we all follow apache policy. Here is the link for apache 
license details:
http://www.apache.org/licenses/LICENSE-2.0.

As I request several times, why not post your workable patch and design. I will 
explain to you clearly what's the major difference of  the core design of my 
code from yours . The patch size is small, and the design is not so 
complicated, but I am sure to show you where those core design come from.

After you post your design and code, we can start from there.

Thanks.

Zhan Zhang


 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329660#comment-14329660
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

Hi [~zzhan],

I already posted the link to my code in this bug several times. The reason why 
I haven't sent a PR is the exact reason I raised about your spec and your 
patch: it uses private Yarn APIs. I've said this several times, and I really 
don't understand what part of it you don't understand. Pardon me if I haven't 
been clear about it.

Also note how there's Yarn bug in the list of blocker bugs for this one. That's 
because my p.o.c. code depends on that bug to be fixed before it can move 
forward. If you have a design that is not blocked by that code, and does not 
use internal APIs, feel free to remove the link and post it.

Here's the link to the comment with the link to my code, dated August '14:
https://issues.apache.org/jira/browse/SPARK-1537?focusedCommentId=14088438page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14088438

A link you have already seen, since you used parts of that code in your patch.

So please, can you reply to my actual comments instead of keep going back to 
this issue? My comments have nothing to do with the fact that I've written a 
p.o.c. for this feature. They're issues that exist in your spec and your code 
independent of anything I've done.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329678#comment-14329678
 ] 

Zhan Zhang commented on SPARK-1537:
---

[~vanzin] I declare integrate your code from the first submission of PR. Do 
you want to count how many times you keeping saying this? 

 Here's the link to the comment with the link to my code, dated August '14.  
Now spark is under the vote for 1.3, and today is 2/20/2015.  Is it so 
difficult submit a workable patch and design doc? 

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329739#comment-14329739
 ] 

Sean Owen commented on SPARK-1537:
--

[~zzhan] You have provided a patch as a PR right? anyone can try it. Request 
granted.

Given the YARN JIRAs already referenced here, some of which have patches ready 
to go too, I think it has been discussed in YARN too? What isn't happening with 
YARN that should be, and, can you help with it? I'm not sure if that's where 
you are saying the waiting is. That is: hasn't this been blocked on YARN 
changes for a long time?

I get it, one person's 'outstanding bug' is another's 'will not fix' but that's 
the give and take of OSS. If you want this feature in Spark, and people are 
asking that it should depend on some YARN changes -- then what do you think 
about lobbying for those YARN changes? or do you disagree that they're 
necessary, and can you argue that here please?

I don't understand your second reply. Yes, it sounds like two people have a 
similar solution with a similar problem with YARN APIs. You say you're not 
waiting on code now, but have repeatedly asked Marcelo to share some (other?) 
code. It's odd since, yes, it's very clear you acknowledge you've already seen 
his code and reused a bit, which is entirely fine. I hope we're done with that 
exchange.

I sense some insinuation that code is being 'hidden' in bad faith, but I can't 
figure out the conspiracy. I see every willingness to make *your* change alone 
here, if you propose something that addresses the YARN issues raised here. You 
are *not* blocked on anyone else's patch. However all of us are 'blocked' on 
the consensus of community / committers that care about this issue, and it 
looks like the response is clear so far: not until YARN API stuff is sorted out 
one way or the other.

Are you suggesting this patch should be committed without the YARN changes? or 
that you're working on the YARN changes? what do you want to take over and do 
next?

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329691#comment-14329691
 ] 

Sean Owen commented on SPARK-1537:
--

[~zzhan] I also can't figure out what you are suggesting here. You have 
proposed a patch, and you've been given feedback with specific reasons it 
shouldn't be committed to Spark. I agree with those, FWIW, thought I think they 
can be overcome soon. I assume others agree, given the silence (?). You haven't 
responded to these specific points. As it stands I think that's your answer: 
these YARN issues need to be addressed -- either fixed or agreed to be not an 
issue.

Nobody needs to 'take over'. I'm not clear why you think you have been waiting 
on something or someone to give you code. Right now the only thing this is 
waiting on is for you or [~zjshen] or anyone to address the YARN API issues. 
Rather than keep the broken record going, why not address the YARN API issues 
highlighted here? sorry, the answer may be that you can't commit this patch you 
want to by yourself but that's just how OSS works.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329700#comment-14329700
 ] 

Zhan Zhang commented on SPARK-1537:
---

[~sowen] From the whole context, I believe you understand what happened here. 
Let's be professional. 

My request is if someone want to try this alpha feature, we can provide a 
patch at least so that people can give it a try. Even if it cannot go upstream 
due to various reasons. 

Due to Yarn block, we should discuss with the yarn community, instead of filing 
a bug and wait forever. 

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329704#comment-14329704
 ] 

Zhan Zhang commented on SPARK-1537:
---

[~sowen] By the way, I am not waiting for someone to give me the patch. It is 
because someone declare the patch is almost ready half year ago. After I submit 
mine, then some one keep saying my patch is not much different from his.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329828#comment-14329828
 ] 

Zhan Zhang commented on SPARK-1537:
---

[~sowen] In JIRA, we share the code so that other people can comment and 
review. I am not waiting for patch. But It is hard to comment or review patch 
given a hyper-link. 

I never think to make my change alone. Actually from the beginning I 
acknowledge his contribution, and don't mind closing my PR and help to review 
his at all if you follow the PR record.  Do you agree?

You mention you sense some insinuation and conspiracy. I didn't sense it. Can 
you please educate me if you figure it out?

Let's go back to technical: Overall, it is early adoption for timeline service. 
It is alpha feature, but most functionality is working although with some 
walkaround.

REST client: Currently Timeline client does not provide retrieve API. So we 
walk around with the similar approach to the timeclient its own implementation. 
 This needs to be changed after timeline component provide more mature API.

Read overhead and scalability: The effort is in the roadmap in yarn timeline 
service.  This is a critical  feature to use timeline service. Current HDFS 
approach in spark may not scalable due to similar reason (point me out if I am 
wrong), and timeline service may be more promising, although it is not there 
yet.

Security: The security is handled transparently in timeline client.   

ACL: Timeline has ACL control as in hadoop-2.6, and client can create and set 
domain with R/W so that control the permission. 





 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329852#comment-14329852
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

Hi [~zhzhan],

bq. But It is hard to comment or review patch given a hyper-link. 

Perhaps you're not familiar with all of Github's features, but you can click on 
each individual commit and comment on the code right there, just like you can 
on a PR created from those commits. Even if that doesn't sound very appealing, 
it's not hard to copy  paste the code and comment here if you really want to. 
Or generate a downloadable diff from the commits (just add .diff at the end 
of the commit URL, e.g. 
https://github.com/vanzin/spark/commit/c1365e0de264daa015c61a2248c80dfdea705786.diff).

bq. REST client: Currently Timeline client does not provide retrieve API.

That's the main reason why this feature hasn't moved forward. Using internal 
APIs to achieve that is something we're not willing to do in Spark, because it 
exposes us to future breakages and makes compatibility harder to maintain (just 
look at what has been done for Hive). So we either need the new API in Yarn, or 
we need to invest time to create a client API that does not use Yarn's classes.

bq. ACL: Timeline has ACL control as in hadoop-2.6

I'll believe you here since I haven't looked at that code yet. But it seems 
like it requires work on the client side, which is not currently covered in 
your spec.
bq. Read overhead and scalability: The effort is in the roadmap in yarn 
timeline service. This is a critical feature to use timeline service. Current 
HDFS approach in spark may not scalable due to similar reason

I think we're talking about different things. What I'm referring to is that the 
current code that reads from the ATS reads all events of a particular entity at 
the same time. If that entity has a large number of events, that will require a 
lot of memory on the ATS side to serialize the data, and a lot of memory on the 
Spark History Server side to deserialize it. It's orthogonal to whether the 
backing store is scalable or not.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329460#comment-14329460
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

Hi [~zzhan], thanks for uploading the document.

Reading through it, I don't see anything that is really that much different 
from my initial proof-of-concept. The points I'd like to highlight are:

- It still depends on YARN-2423, or at least on some effort to write a REST 
client that does not depend on internal Yarn classes.
- What about overhead of the read code? Large jobs with lots of tasks, or 
really long jobs such as Spark Streaming jobs, will have a really large amount 
of events. Fetching them all in one batch would require a lot of memory for 
serializing the data on both sides (ATS and History Server).
- Any security considerations? I haven't really kept up-to-date with the 
security changes in the ATS after I ran into issues with my p.o.c.; but mainly, 
does the Spark job need any special tokens to talk to the ATS when security is 
enabled? Does the ATS guarantee that only the job itself (or someone with the 
right credentials) can add events to its timeline? Or is that all handled 
transparently, somehow, by the client library?
- Does YARN-2928 affect the design in any way? I took a quick look at the data 
model, so hopefully they'll keep things backwards compatible. But it would 
kinda suck to add support for an API with a limited shelf life if that's not 
the case.


 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: SPARK-1537.txt, spark-1573.patch


 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-18 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14327038#comment-14327038
 ] 

Apache Spark commented on SPARK-1537:
-

User 'zhzhan' has created a pull request for this issue:
https://github.com/apache/spark/pull/4683

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-18 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14326778#comment-14326778
 ] 

Zhan Zhang commented on SPARK-1537:
---

I have sent a PR with WIP for people who are interested. 
https://github.com/apache/spark/pull/4683/files

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-11-05 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14198852#comment-14198852
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

I believe with YARN-2033 and YARN-2423 I can work around YARN-2444 even if it's 
still an issue, so I'll add the dependency accordingly.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-10-31 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192491#comment-14192491
 ] 

Zhijie Shen commented on SPARK-1537:


bq. BTW, if you want a list of things I think are important for Spark, here are 
some quick ones:

Thanks for sharing the details, which are more helpful to clean up the puzzles 
than some big but vague statement. Let me go through the aforementioned Jiras:

* YARN-2521: I'd like to keep it open for some further client improvement, such 
as local timeline data caching, while YARN-2673 already made the client retry 
when the server temporally doesn't respond. Please note that I think it's 
pretty critical when you can't upload your data because the server is down is 
*no longer true* after YARN-2673. On the other side, At the point of view of 
the API, it should keep stable.

* YARN-2423: This is proposed to improve the Java libs by adding GET APIs. They 
are used to query data, NOT to put data. We do this to help the use case that 
the developers write Java code to implement the UI to analyze the timeline 
data. Framework integration mainly deals with PUT APIs, and the Java client 
libs are already there. Take one step back, apart from the client libs, the 
RESTful APIs are always there, which is programming language neutral, and 
useful to non-Java developers.

* YARN-2444: It's may be a bug or an improper use case. According to the 
exception, the user doesn't pass the authorization for some reason. It is 
reported for 2.5, and is probably no longer valid after we fixed a bunch of 
security issues for 2.6. We need to do more validation for this issue before a 
conclusion. Anyway, it's obviously an internal issue happening in secure mode 
only, which should not the API CHANGES.

bq. I understand it doesn't affect the client API and we can still have the 
code in,

It seems that we have the agreement that the current timeline service offering 
is not blocking the Spark integration work.


 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-10-31 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192502#comment-14192502
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

bq. This is proposed to improve the Java libs by adding GET APIs. They are used 
to query data, NOT to put data.

Spark needs both to put and read data, otherwise the ATS is useless for Spark. 
The current goal of Spark is to use the ATS as a store for its history data, 
since the data itself is not considered public and stable itself.

So there is no point in integration if you can only write data. (I know you can 
read data through other means, but I don't want to write a custom REST client 
just to get ATS support in.)

bq.  It is reported for 2.5, and is probably no longer valid after we fixed a 
bunch of security issues for 2.6.

I'm not sure why you say it's security-related since there nothing 
security-related in the example code I posted. And if something doesn't work in 
2.5 but works in 2.6, it means we (and by that I mean Spark) have to restrict 
our support to the versions where things work - even if the underlying API is 
exactly the same.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-10-31 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192538#comment-14192538
 ] 

Zhijie Shen commented on SPARK-1537:


bq. Spark needs both to put and read data

It's again a vague statement. Can you share your design detail, such that we 
can evaluate it is really necessary? 
And what is the actual way of visualizing data? And integration work is not 
just single bug fix patch, we can divide work into a sequent of sub tasks, and 
the first step is to enable Spark job to be able to putting the data into the 
timeline server. By doing this, not only Spark's only web front can visualize 
job history, it also enable the third-party tools to do Spark job analysis too.

bq. I'm not sure why you say it's security-related since there nothing 
security-related in the example code I posted. 

I said According to the exception, the user doesn't pass the authorization for 
some reason. If you don't agree on it, please post your investigation on 
YARN-2444, YARN folks will help you on this issue.

bq. if something doesn't work in 2.5 but works in 2.6,

No matter the integration with timeline service, Spark on YARN is picking 
Hadoop versions now. It doesn't make sense to ask for a feature by using an 
early version that hasn't it.


 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-10-31 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192561#comment-14192561
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

bq. It's again a vague statement. 

I don't know what is vague about wanting to read the data you write.

bq. Can you share your design detail

I already did way better than that, way earlier in this bug: I shared the 
actual code. For this particular question, here it is:
https://github.com/vanzin/spark/blob/yarn-timeline/yarn/timeline/src/main/scala/org/apache/spark/deploy/yarn/timeline/YarnTimelineProvider.scala

See how it reads data from the ATS? It feeds it into the Spark history server, 
where the data can be visualized. It's using Yarn internal APIs, which is 
generally bad practice.

bq.  If you don't agree on it, please post your investigation on YARN-2444, 
YARN folks will help you on this issue.

I posted the error and the code to reproduce it. I don't know what else do you 
expect from me. If you think it's an authorization issue, test it with 2.6 and 
close the bug if you believe it's fixed.

bq. No matter the integration with timeline service, Spark on YARN is picking 
Hadoop versions now. It doesn't make sense to ask for a feature by using an 
early version that hasn't it.

I'm not sure I really understood what you're trying to say here. Yes, we have 
to pick versions. We need a version that supports the features we need. Even if 
the API in 2.5 didn't change in 2.6, it seems to have bugs that prevent my 
current code from working, so there is no point in trying to integrate with 2.5 
as far as I'm concerned. And as far as I know, 2.6 hasn't been released yet. 
(BTW, my code used to work with 2.4.)

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-10-30 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190849#comment-14190849
 ] 

Zhijie Shen commented on SPARK-1537:


[~vanzin], thanks for introducing YARN timeline server to Spark. Let me briefly 
summarize the current status of the timeline server and answer some concerns 
here. Spark folks who are interested in this monitoring service offered by YARN 
can go ahead to YARN-1530 to read the design doc and watch the latest progress.

1. The essential functions or the timeline service have been available since 
Hadoop 2.4. Basically, the user can organize the app's history or metrics 
according to timeline data model and post it the the timeline server. Later on, 
user or admin can come back to query this information to analyze how the app 
was going. The essential APIs keep unchanged from 2.4 to the coming 2.6. There 
should *NOT* be any incompatible API changes that will block this work. 
Moreover, Keeping compatible is always in our consideration when coming up with 
new features in the following Hadoop releases.

2. It's *NOT* exactly that the timeline server is not production-ready. In 
fact, Apache Tez has already integrated the timeline server for logging the 
history information. In the coming Hadoop 2.6, MapReduce is also enabled to 
publish the history information to the timeline server, too. Moreover, within 
the scope of YARN, a built-in generic history service on top of the timeline 
service is available to YARN users to watch all kinds of apps. Hence, with 
several successful pioneer, Spark should be confident enough to take the new 
merit of YARN.

3. While YARN community is progressing quickly to improve the timeline server 
in terms of security (coming 2.6), high availability, scalability, better 
client libs and so on, it should not disturb the initial attempt for Spark to 
embrace the timeline server, but will offer better experience if Spark is 
riding on it.

If you have other issue of high priority to work on, I think [~zhazhan] will be 
able to help this integration. Thanks!

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-10-30 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190863#comment-14190863
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

bq. ...security (coming 2.6), high availability, scalability, better client 
libs and so on...

That's exactly my point about the ATS not being production-level quality yet. 
The current plans I'm aware of would require changes in the ATS API. Since 
Spark does not support the ATS at the moment, I'd rather have it support the 
new-and-secure-and-scalable-and-available API than the current one. Otherwise 
you'll get into the mess of having to conditionally compile code for both APIs, 
or implement part of those features into your own client code (something I've 
done in my proof-of-concept but I'd really like to avoid, because it's really 
just trying to work around limitations in the current ATS design).

So, short version of what I'm trying to say: yes, you can build something that 
talks to the current ATS. But given that it currently has shortcomings, and the 
fix for those will, as far as I know, affect the client API, I don't see the 
point in trying to push that integration at this moment when Spark already has 
a working solution for job history, just so that you'll ship code that will be 
immediately deprecated by the new ATS...

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-10-30 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191007#comment-14191007
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

BTW, if you want a list of things I think are important for Spark, here are 
some quick ones:

* YARN-2521 (I've sort of implemented this in my code, but would really like to 
not have to care about it)
* YARN-2423 (note how this is a new API)
* YARN-2444

YARN-2521 might be the same as YARN-2673, no? YARN-2513 is sort of interesting 
but not necessary.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-10-30 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191052#comment-14191052
 ] 

Zhan Zhang commented on SPARK-1537:
---

Yarn-2521 can make client easier to use, but not critical. Some application 
logic make the client cache difficult to be generic.
Yarn-2444 may be already obsolete. 

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-10-30 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191060#comment-14191060
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

I think it's pretty critical when you can't upload your data because the server 
is down; it means we can't really recommend using the current ATS because it's 
not reliable. I understand it doesn't affect the client API and we can still 
have the code in, but it's an important feature that seems to be missing.

YARN-2423, though, is really something that can't be done today without poking 
into private Yarn classes or writing a bunch of extra code. I really wouldn't 
want to have to support any of those two options in Spark.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-10-29 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189651#comment-14189651
 ] 

Zhan Zhang commented on SPARK-1537:
---

Hi Marcelo,

Do you have update on this? If you don't mind, I can work on your branch to get 
this done asap. Please let me know how do you think?


 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-09-18 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14139396#comment-14139396
 ] 

Zhan Zhang commented on SPARK-1537:
---

Do you have any update on this, or any schedule in your mind yet?

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-08-21 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14106164#comment-14106164
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

No concrete timeline at the moment. I'm just starting to look at the 2.5.0 
version of ATS so I can incorporate things into my patch.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-08-20 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14104881#comment-14104881
 ] 

Zhan Zhang commented on SPARK-1537:
---

Thanks for sharing this. Do you have concrete plan or timeline for this Jira?

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-08-06 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088438#comment-14088438
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

Current code is here:
https://github.com/vanzin/spark/tree/yarn-timeline

Very much WIP at this point.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-08-05 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14086657#comment-14086657
 ] 

Zhan Zhang commented on SPARK-1537:
---

I am also interested in it and trying to integrate spark to yarn timeline 
server. Do you have any concrete plan in mind? I can start prototype it and 
then we can work together on this topic.  How do you think?

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-08-05 Thread Zhan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14086808#comment-14086808
 ] 

Zhan Zhang commented on SPARK-1537:
---

Do you mind sharing your thoughts, design document or prototype code?

Thanks.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-08-05 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14086817#comment-14086817
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

Currently busy with other more urgent tasks, but I'll push to my repo and post 
a link when I get some time.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2014-07-31 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14081465#comment-14081465
 ] 

Marcelo Vanzin commented on SPARK-1537:
---

I'm working on this but this all sort of depends on progress being made on the 
Yarn side, so at this moment I'm not yet ready to send any PRs.

 Add integration with Yarn's Application Timeline Server
 ---

 Key: SPARK-1537
 URL: https://issues.apache.org/jira/browse/SPARK-1537
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin

 It would be nice to have Spark integrate with Yarn's Application Timeline 
 Server (see YARN-321, YARN-1530). This would allow users running Spark on 
 Yarn to have a single place to go for all their history needs, and avoid 
 having to manage a separate service (Spark's built-in server).
 At the moment, there's a working version of the ATS in the Hadoop 2.4 branch, 
 although there is still some ongoing work. But the basics are there, and I 
 wouldn't expect them to change (much) at this point.



--
This message was sent by Atlassian JIRA
(v6.2#6252)