[GitHub] spark issue #22575: [SPARK-24630][SS] Support SQLStreaming in Spark
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/22575 Nice! I am looking forward to it. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22575: [SPARK-24630][SS] Support SQLStreaming in Spark
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/22575 What should we do if we want to join two Kafka streams and sink the result to another stream?
[GitHub] spark issue #22575: [SPARK-24630][SS][WIP] Support SQLStreaming in Spark
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/22575 Is this still a WIP? Using an isStreaming tag in DDL to mark whether a table is streaming is brilliant: it keeps compatibility with batch SQL queries. If possible, I think it is better not to introduce a STREAM keyword in DML. Maybe we can use properties (like `isStreaming`) of the tables participating in the query to generate a StreamingRelation or a batch relation. What do you think? SQLStreaming is an important part of Structured Streaming from my perspective, as it makes it more complete and usable. Thanks for your work!
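A minimal sketch of the property-based resolution suggested above (hypothetical, not actual Spark internals — the catalog contents and relation names are illustrative): the planner looks up the table's `isStreaming` property and picks a streaming or batch relation accordingly, so the same SQL text works for both cases.

```python
# Hypothetical sketch: resolve a relation from a table property,
# instead of a STREAM keyword in the query text.
catalog = {
    "kafka_events": {"isStreaming": "true"},   # a streaming source table
    "daily_report": {},                        # an ordinary batch table
}

def resolve_relation(table: str) -> str:
    props = catalog[table]
    if props.get("isStreaming", "false") == "true":
        return f"StreamingRelation({table})"
    return f"LogicalRelation({table})"

print(resolve_relation("kafka_events"))  # → StreamingRelation(kafka_events)
print(resolve_relation("daily_report"))  # → LogicalRelation(daily_report)
```

The appeal of this design is that the same `SELECT` statement stays valid for batch and streaming; only the table definition decides which plan is built.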
[GitHub] spark issue #22575: [SPARK-24630][SS][WIP] Support SQLStreaming in Spark
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/22575 ok to test
[GitHub] spark pull request #16331: [SPARK-18920][HISTORYSERVER]Update outdated date ...
Github user WangTaoTheTonic closed the pull request at: https://github.com/apache/spark/pull/16331 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
[GitHub] spark pull request #16331: [SPARK-18920][HISTORYSERVER]Update outdated date ...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/16331 [SPARK-18920][HISTORYSERVER]Update outdated date formatting ## What changes were proposed in this pull request? Previously we showed "-" when the timestamp was less than 0; we should update that, as the date string is now presented in the format "-MM-dd ." ## How was this patch tested? **Before:** ![historyserver-before](https://cloud.githubusercontent.com/assets/5276001/21299300/1bd15ac6-c5d5-11e6-97c3-d51b19da9cd5.JPG) **After:** ![history-after](https://cloud.githubusercontent.com/assets/5276001/21299304/24a34c68-c5d5-11e6-8282-974cf14f0089.JPG) You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark patch-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16331.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16331 commit 1618d1d7855a6c2bdb01387330854b6e6159dd61 Author: Tao Wang <wangtao...@huawei.com> Date: 2016-12-19T02:18:49Z Update outdated date formatting
[GitHub] spark issue #16031: [SPARK-18606][HISTORYSERVER]remove useless elements whil...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/16031 I agree with you @ajbozarth. Since only one column uses the `replace` function, we can keep it the same as now. If more data uses this function in the future, we can extract it into a separate file or something :)
[GitHub] spark issue #16031: [SPARK-18606][HISTORYSERVER]remove useless elements whil...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/16031 @srowen @ajbozarth Sorry for the delay. I've tried the solution, but found it didn't work, as we already defined the type (`appid-numeric`), which overrides the value of `sType`. I checked the implementation of the `html` sType; it is the same as what I proposed in this patch. Here are the snapshots: ![226c5a24-d87f-480d-ad00-6c4a4b30560f](https://cloud.githubusercontent.com/assets/5276001/21044121/471b8c12-be35-11e6-96ba-621a31ed3ec9.png) ![924f3273-ae3e-403d-ad9b-4cf00a9d50fc](https://cloud.githubusercontent.com/assets/5276001/21044130/4e4a5590-be35-11e6-85ee-a02b4b042aa0.png)
[GitHub] spark pull request #16031: [SPARK-18606][HISTORYSERVER]remove useless elemen...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/spark/pull/16031#discussion_r89758734 --- Diff: core/src/main/resources/org/apache/spark/ui/static/historypage.js --- @@ -78,6 +78,12 @@ jQuery.extend( jQuery.fn.dataTableExt.oSort, { } } ); +jQuery.extend( jQuery.fn.dataTableExt.ofnSearch, { +"appid-numeric": function ( a ) { +return a.replace(/[\r\n]/g, " ").replace(/<.*?>/g, ""); --- End diff -- This refers to `jquery.dataTables.1.10.4.min.js`. I'd be happy to change it to a better style if there is one :)
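The filter in the diff collapses newlines and strips HTML tags so only the visible text takes part in the table search. A Python rendering of those two regex replaces (the sample cell markup is made up; the behavior mirrors the JS, it is not the JS itself):

```python
import re

def appid_search_text(cell_html: str) -> str:
    # Mirror of the proposed "appid-numeric" ofnSearch filter:
    # collapse newlines to spaces, then strip HTML tags so only
    # the visible text is searchable.
    without_newlines = re.sub(r"[\r\n]", " ", cell_html)
    return re.sub(r"<.*?>", "", without_newlines)

# A hypothetical app-id cell: the href should not match searches.
print(appid_search_text('<a href="/history/app-1/">app-1</a>'))  # → app-1
print(appid_search_text("<b>line1</b>\nline2"))                  # → line1 line2
```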
[GitHub] spark pull request #16031: [SPARK-18606][HISTORYSERVER]remove useless elemen...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/16031 [SPARK-18606][HISTORYSERVER]remove useless elements while searching ## What changes were proposed in this pull request? When we search applications in the HistoryServer, the search includes all contents between the tags, which includes useless elements. **Before:** ![before](https://cloud.githubusercontent.com/assets/5276001/20662840/28bcc874-b590-11e6-9115-12fb64e49898.jpg) **After:** ![after](https://cloud.githubusercontent.com/assets/5276001/20662844/2f717af2-b590-11e6-97dc-a48b08a54247.jpg) You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark span Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16031.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16031 commit 37aa3a2d2fddfa46fb4c5427cebed5683530153d Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2016-11-28T08:37:13Z remove useless elements while searching
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15803 It always shows UTC on the main page, but the server timezone on other pages, like the last page I've pasted. Not sure if that's true. Anyway, we'll wait for the results from @windpiger, thanks.
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15803 It's always better to show a timezone in the table header, I think, no matter which timezone it really uses. But changing to always show GMT/UTC? I have to say that's a bold move, even though there are cases where many drivers run in different timezones. If we're sure about doing this, we need to make it known to users, for example by writing it into the release notes, as the UI is the most frequently used interface. The current fix makes some sense, but easily confuses people: when they check the applications list, the page says "The time zone is GMT, and the time is 2016-01-01 15:00", but when they check the details of an application they get "Job 1 of this application starts at 2016-01-01 14:00 or 2016-01-01 12:30". This kind of inconsistency is unacceptable from my point of view.
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15803 I would prefer solution 1, consistent with how other time strings in the Spark UI are shown, like on the JobPage: ![default](https://cloud.githubusercontent.com/assets/5276001/20390037/c8b4c7a0-ad07-11e6-8c80-6f7f62ce0acf.PNG)
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15803 Do we have someone who's good at JAX-RS? Maybe they can explain the theory and help us understand it better :)
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15803 @srowen Before the code changes, the browser got a date string from the server side; now it gets a Date instead (this conclusion comes from debugging the code (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala#L46), I'm not sure personally, please correct me if I'm wrong) and parses its string format (`hacks a date string to drop seconds and timezone`).
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15803 @ajbozarth On the browser side, the timezone used to build a Date from the epoch time is the one **on the browser side**, not the one on the **History Server side**. These two differ in many cases. So yes, it is harder. :(
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15803 I think the problem is that Dates transferred over REST carry no timezone; one possible reason is: http://stackoverflow.com/questions/23730062/use-iso-8601-dates-in-jax-rs-responses. I'm not an expert in REST servers or the JAX-RS framework, so I have another solution: transfer the Date using its string format rather than the Date object itself. :)
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15803 ![historyserver](https://cloud.githubusercontent.com/assets/5276001/20304529/af3d0c30-ab6b-11e6-887d-fbf8fb09ebab.jpg) As the image shows, users can get application info in two ways: through a browser, or through the REST API. When a browser sends an HTTP request, the History Server returns the history page (an HTML page) containing application info in which the Date has been translated to a string. Because the translation happens on the server side, the date string is rendered in the server's timezone. If a user calls the REST API instead, the History Server returns JSON objects containing the Date without a timezone, so the REST response always shows dates like `"startTime" : "2016-11-11T06:27:25.802GMT"`, i.e. in GMT (as no timezone is available). After 2.0, the browser path (the red highlighted part) changed: the browser now fetches application info via the REST API rather than the whole history page (in order to generate the table on the front end and implement sort and search). The result it gets contains only Dates without a timezone, instead of date strings translated by the server in the server timezone. The way the browser gets application data changed; that's why the browser behavior changed.
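To illustrate why the same instant renders differently depending only on who formats it, here is a small Python sketch. The epoch value matches the `startTime` example above; the +08:00 server offset is a made-up example of a non-UTC backend.

```python
from datetime import datetime, timezone, timedelta

start_ms = 1478845645802  # epoch millis for 2016-11-11T06:27:25.802 GMT

utc = datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc)
# A hypothetical History Server running at UTC+08:00 would format
# the very same instant eight hours "later":
server = utc.astimezone(timezone(timedelta(hours=8)))

print(utc.strftime("%Y-%m-%d %H:%M:%S"))     # → 2016-11-11 06:27:25
print(server.strftime("%Y-%m-%d %H:%M:%S"))  # → 2016-11-11 14:27:25
```

This is why server-side string rendering and client-side Date rendering disagree whenever the two machines are in different timezones.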
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15803 I'll post later how the UI works and what changed to make it behave differently from before :)
[GitHub] spark issue #15838: [SPARK-18396][HISTORYSERVER]"Duration" column makes sear...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15838 Is it good to go? @srowen @vanzin
[GitHub] spark pull request #15838: [SPARK-18396]"Duration" column makes search resul...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/15838 [SPARK-18396]"Duration" column makes search result confused, maybe we should make it unsearchable ## What changes were proposed in this pull request? When we search data in the History Server, it checks whether any column contains the search string. Duration is represented as a long value in the table, so if we search a simple string like "003" or "111", durations containing "003" or "111" will be shown, which makes little sense to users. We cannot simply convert the long value to a meaningful format like "1 h" or "3.2 min" because the values are also used for sorting. A better way to handle it is to exclude the "Duration" column from searching. ## How was this patch tested? Manual tests. I will paste the UI later. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark duration Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15838.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15838 commit 8c9ef2141819aa7ab6fc0d243f61f280d4533d4c Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2016-11-10T08:34:06Z make duration un-searchable
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15803 Nah, I think we have a misunderstanding here. @tgravescs If I understand right, what you mean is that most companies run their servers in the UTC timezone. That's OK. Under that condition, the History Server UI should show dates in UTC. But for servers running in other timezones, the History Server UI should show dates using the same timezone as the backend server. Currently the History Server UI shows dates in GMT whatever timezone the server runs in (after the big change: https://github.com/apache/spark/commit/e4c1162b6b3dbc8fc95cfe75c6e0bc2915575fb2). I think that's the point.
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15803 I agree with showing the timezone alongside the date string. But always using GMT/UTC is not a good choice: application logs (via log4j) are usually printed in the local timezone (like CST). That means if I want to cross-check the application logs against the event log, I must translate between the two timezones manually.
[GitHub] spark issue #15803: [SPARK-18298][Web UI]change gmt time to local zone time ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15803 Thanks for the fix. This patch parses the timestamp instead of the returned date string. The REST API still returns GMT time, which is consistent with what the UI shows. I've googled and found one possible reason: http://stackoverflow.com/questions/23730062/use-iso-8601-dates-in-jax-rs-responses, which is caused by (de)serialization. Currently, I think there are two solutions: 1. Use this patch and let the UI show GMT time; if users want to translate it to another timezone, they can use the epoch fields like "startTimeEpoch". 2. Transfer the Date with its timezone, and get local time whether we use the UI or the REST API. What do you think, guys?
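Under solution 1, a client that wants local time can either parse the GMT-suffixed string or use the epoch field directly. A Python sketch of the equivalence (the field format follows the `"2016-11-11T06:27:25.802GMT"` example above; treating the `GMT` suffix as UTC is an assumption about that format):

```python
from datetime import datetime, timezone

raw = "2016-11-11T06:27:25.802GMT"  # as shown in the REST response above

# Parse the GMT-suffixed string into an aware UTC datetime.
parsed = datetime.strptime(raw.replace("GMT", ""), "%Y-%m-%dT%H:%M:%S.%f")
parsed = parsed.replace(tzinfo=timezone.utc)

# The epoch field carries the same instant with no timezone ambiguity,
# so clients can render it in any timezone they like.
start_time_epoch = round(parsed.timestamp() * 1000)
print(start_time_epoch)  # → 1478845645802
```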
[GitHub] spark issue #15176: [WIP][SPARK-17610][CORE][SCHEDULER]The failed stage caus...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15176 I will close this.
[GitHub] spark pull request #15176: [WIP][SPARK-17610][CORE][SCHEDULER]The failed sta...
Github user WangTaoTheTonic closed the pull request at: https://github.com/apache/spark/pull/15176
[GitHub] spark issue #15176: [WIP][SPARK-17610][CORE][SCHEDULER]The failed stage caus...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/15176 After a second look, I found the current behavior will not cause problems, because `dagScheduler.handleTaskCompletion` is triggered by `CompletionEvent`. The DAGScheduler handles events one by one, so the `ResubmitFailedStages` posted in `dagScheduler.handleTaskCompletion` will not be handled before `handleTaskCompletion` finishes.
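The single-threaded ordering argued above can be sketched as a toy event loop (names are illustrative; this is not DAGScheduler code): an event posted from inside a handler is only processed after the current handler returns.

```python
from collections import deque

events = deque()
log = []

def handle_completion():
    # While handling a CompletionEvent, the handler posts
    # ResubmitFailedStages back onto the same queue.
    log.append("handleTaskCompletion:start")
    events.append("ResubmitFailedStages")
    log.append("handleTaskCompletion:end")

handlers = {
    "CompletionEvent": handle_completion,
    "ResubmitFailedStages": lambda: log.append("resubmit"),
}

events.append("CompletionEvent")
while events:  # events are handled strictly one at a time
    handlers[events.popleft()]()

# resubmit only runs after handleTaskCompletion has fully finished:
print(log)  # → ['handleTaskCompletion:start', 'handleTaskCompletion:end', 'resubmit']
```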
[GitHub] spark pull request #15176: [SPARK-17610][CORE][SCHEDULER]The failed stage ca...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/15176 [SPARK-17610][CORE][SCHEDULER]The failed stage caused by FetchFailed may never be resubmitted ## What changes were proposed in this pull request? The improper time ordering in handling a failed task caused by FetchFailed may cause the corresponding stage to never be resubmitted. For details, see: https://issues.apache.org/jira/browse/SPARK-17610 ## How was this patch tested? Manual tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark SPARK-17610 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15176.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15176 commit 910fd71bb9503c442817c04324f0843574c44fd5 Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2016-09-21T07:29:28Z The failed stage caused by FetchFailed may never be resubmitted
[GitHub] spark pull request #14605: [SPARK-17022][YARN]Handle potential deadlock in d...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/14605 [SPARK-17022][YARN]Handle potential deadlock in driver handling messages ## What changes were proposed in this pull request? We send RequestExecutors directly to the AM instead of transferring it to YarnSchedulerBackend first. ## How was this patch tested? Manual tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark lock Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14605.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14605 commit 80c2d1127318f1315c7ba59bc42162c757f08c73 Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2016-08-11T15:49:03Z [SPARK-17022]Handle potential deadlock in driver handling messages
[GitHub] spark issue #14591: [SPARK-17010][MINOR][DOC]Wrong description in memory man...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/spark/pull/14591 @andrewor14
[GitHub] spark pull request #14591: [SPARK-17010][MINOR][DOC]Wrong description in mem...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/14591 [SPARK-17010][MINOR][DOC]Wrong description in memory management document ## What changes were proposed in this pull request? Change the remaining percentage to the right one. ## How was this patch tested? Manual review. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14591.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14591 commit 9d9bc2ae1420d91cea7779f38d329579e1ec126a Author: Tao Wang <wangtao...@huawei.com> Date: 2016-08-11T02:44:53Z Update tuning.md
[GitHub] spark pull request: [SPARK-15667][SQL]Throw exception if columns number of o...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/13409 cc @liancheng @andrewor14 Could you please review this? Thanks!
[GitHub] spark pull request: [SPARK-15667][SQL]Throw exception if columns number of o...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/13409 retest this please.
[GitHub] spark pull request: [SPARK-15667][SQL]Throw exception if columns number of o...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/13409 [SPARK-15667][SQL]Throw exception if columns number of outputs mismatch the inputs ## What changes were proposed in this pull request? We throw an exception on the driver side if the column counts of the inputs and outputs mismatch, instead of an ArrayIndexOutOfBoundsException on the executor side. ## How was this patch tested? Run the SQL statements below: create table test(school string,province string,a int); create table p3(school string,province string); insert into table test select * from p3; The driver will throw an exception whose message is "Cannot insert into target table because column number/types are different". You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark parition Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/13409.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #13409 commit ca269a4ef12496975e00c2a6f4bdbd04fd41568e Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2016-05-31T11:41:20Z Throw exception if columns number of outputs mismatch the inputs
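The driver-side check this PR proposes can be sketched in isolation. This is a minimal, hypothetical illustration (the `Col` case class and `checkInsert` helper are made up for this example, not Spark's actual API): compare the column counts up front and fail fast on the driver instead of hitting an ArrayIndexOutOfBoundsException on an executor.

```scala
// Illustrative stand-in for a table column; not a Spark type.
case class Col(name: String, dataType: String)

// Hypothetical driver-side validation: fail fast when the target table's
// column count does not match what the query produces.
def checkInsert(target: Seq[Col], query: Seq[Col]): Unit = {
  if (target.length != query.length) {
    throw new IllegalArgumentException(
      "Cannot insert into target table because column number/types are different: " +
        s"target has ${target.length} columns, query produces ${query.length}")
  }
}

// Mirrors the SQL reproduction above: test has 3 columns, p3 has 2.
val targetCols = Seq(Col("school", "string"), Col("province", "string"), Col("a", "int"))
val queryCols  = Seq(Col("school", "string"), Col("province", "string"))

val failedFast =
  try { checkInsert(targetCols, queryCols); false }
  catch { case _: IllegalArgumentException => true }

println(failedFast) // the mismatch is caught on the driver: true
```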
[GitHub] spark pull request: [SPARK-14258]change scope of functions in Kafk...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/12053 [SPARK-14258]change scope of functions in KafkaCluster ## What changes were proposed in this pull request? Narrowing the scope (visibility) of some functions. ## How was this patch tested? Unit tests in Jenkins. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark kc Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12053.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12053 commit d91f36f49112847095d56ed11022bf7fd0b0d5f8 Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2016-03-30T01:18:01Z change scope of functions in KafkaCluster
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/11692#issuecomment-200374692 It will not impact the actual result, but an error stacktrace and a log message showing failure will confuse users and lead them to believe the application failed and needs to be submitted again. Like I said above, we return a 2-tuple in which the second element (`FinalApplicationStatus`) is not used; the first one says the application finished, so it is OK.
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/11692#issuecomment-198427974 So, how about it, guys?
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/11692#issuecomment-198742134 Have you tried to reproduce the scenario and see what happened? The `UndeclaredThrowableException` will be caught by `e.getCause.isInstanceOf[InterruptedException]`, I think.
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/11692#issuecomment-197188289 `monitorApplication` returns a 2-tuple, in which `FinalApplicationStatus` is not used when sc stops normally. That's why it doesn't matter what status we set it to.
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/11692#issuecomment-197135484 The application finished successfully (the RM UI also shows the success state) but the log says it failed; that's the problem, I think. Yeah, you're right, the sleep method can throw InterruptedException. This PR is trying to fix a problem we found during RM HA switching. What I am trying to say is that interrupting the monitor thread should not print a failure message in the log.
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/11692#issuecomment-197120333 @jerryshao Yes, the problem is that the client side's log will show an exception and report the app as failed. More details are in the log I pasted.
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/11692#issuecomment-197118357 @tgravescs I reproduced it and the error message looks like:
>> 16/03/16 10:29:33 INFO YarnClientSchedulerBackend: Shutting down all executors
16/03/16 10:29:33 ERROR Client: Failed to contact YARN for application application_1457924833801_0003.
java.lang.reflect.UndeclaredThrowableException
    at com.sun.proxy.$Proxy10.getApplicationReport(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:431)
    at org.apache.spark.deploy.yarn.Client.getApplicationReport(Client.scala:221)
    at org.apache.spark.deploy.yarn.Client.monitorApplication(Client.scala:882)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:144)
Caused by: java.lang.InterruptedException: sleep interrupted
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:155)
    ... 5 more
16/03/16 10:29:33 INFO YarnClientSchedulerBackend: Asking each executor to shut down
16/03/16 10:29:33 ERROR YarnClientSchedulerBackend: Yarn application has already exited with state FAILED!
16/03/16 10:29:33 INFO SparkContext: SparkContext already stopped.
16/03/16 10:29:33 INFO YarnClientSchedulerBackend: Stopped
There's no "Interrupting monitor thread", and the exception is an UndeclaredThrowableException caused by an InterruptedException.
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/11692#issuecomment-196908886 I've only observed an exception caused by InterruptedException, not the exception itself directly; I thought it should be wrapped internally. The status in the RM is OK, as it is decided by the ApplicationMaster, not the Spark client. As I recall, I didn't see the message "Interrupting monitor thread", but I'm not 100% sure. I will try to reproduce it and confirm.
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/11692#issuecomment-196898154 As for the other concern about the final application status returned: we don't need to worry too much, as it is barely used by the code that invokes this.
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/11692#issuecomment-196889060 Hi @tgravescs, it happened when sc stopped normally in client mode. Since sc.stop will stop the DAGScheduler -> stop the TaskScheduler -> stop the scheduler backend -> interrupt the monitor thread, the monitor thread will enter retry logic containing a sleep interval (which is not the sleep [here](https://github.com/WangTaoTheTonic/spark/blob/SPARK-13852/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L970)) while waiting for the RM to switch. The sleep method throws an InterruptedException when interrupted, so we need to catch it; otherwise the application is logged as failed because the exception is treated as NonFatal(e).
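The catch condition being debated in this thread can be sketched in isolation. `isInterruption` below is a hypothetical helper, not the actual code in `Client.scala`; it shows how both a bare `InterruptedException` and one wrapped as a cause (e.g. inside the `UndeclaredThrowableException` thrown by a dynamic proxy, as in the pasted log) can be recognized as a normal interruption rather than a failure:

```scala
// Hypothetical sketch: treat an InterruptedException, or any exception
// whose cause is an InterruptedException, as a benign monitor-thread
// interruption instead of an application failure.
def isInterruption(e: Throwable): Boolean =
  e.isInstanceOf[InterruptedException] ||
    (e.getCause != null && e.getCause.isInstanceOf[InterruptedException])

// A proxy wraps the InterruptedException, as seen in the reproduction log.
val wrapped = new RuntimeException(new InterruptedException("sleep interrupted"))
val direct  = new InterruptedException()

println(isInterruption(wrapped)) // true: caught via getCause
println(isInterruption(direct))  // true: caught directly
println(isInterruption(new RuntimeException("other"))) // false
```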
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/11692#issuecomment-196325232 @srowen thanks for your comments. I've changed it, please check.
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/spark/pull/11692#discussion_r55998595 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -976,6 +976,11 @@ private[spark] class Client( logError(s"Application $appId not found.") return (YarnApplicationState.KILLED, FinalApplicationStatus.KILLED) case NonFatal(e) => +if (e.isInstanceOf[InterruptedException] --- End diff -- Then the code segments will be separated into two parts; I am not sure that's better.
[GitHub] spark pull request: [SPARK-13852][YARN]handle the InterruptedExcep...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/spark/pull/11692#discussion_r55973772 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -976,6 +976,11 @@ private[spark] class Client( logError(s"Application $appId not found.") return (YarnApplicationState.KILLED, FinalApplicationStatus.KILLED) case NonFatal(e) => +if (e.isInstanceOf[InterruptedException] --- End diff -- We can only move InterruptedException above, but not exceptions caused by it. `just move InterruptedException` or `leave these two here`: which option do you think is better?
[GitHub] spark pull request: [SPARK-13852]handle the InterruptedException c...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/11692 [SPARK-13852]handle the InterruptedException caused by YARN HA switch When sc stops, it interrupts the thread used to monitor the app status. The thread will throw an InterruptedException during a YARN switch, as there is a sleep method in the retry logic. If YARN switches between active and standby, sc.stop will return YarnApplicationState.FAILED because the InterruptedException is not caught. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark SPARK-13852 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11692.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11692 commit 9d6de234b2c54c94b9fa261d78e5471facb3b6cd Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2016-03-14T08:29:05Z handle the InterruptedException caused by YARN HA switch
[GitHub] spark pull request: [SPARK-13478] [yarn] Use real user when fetchi...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/11358#issuecomment-193776099 Hi @vanzin, what about Spark SQL in this issue, in your view? Spark SQL will invoke SessionState.start, which will finally connect to the Hive metastore.
[GitHub] spark pull request: [SPARK-2750][WEB UI] Add https support to the ...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/10238#issuecomment-169207523 It seems to have conflicts :(
[GitHub] spark pull request: [SPARK-2750][WEB UI] Add https support to the ...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/10238#issuecomment-167507929 Sorry for the late reply. I think you need to update the code to the newest and see if the test cases pass. Then we'll do some review and functional tests.
[GitHub] spark pull request: [SPARK-8676][SQL]Lazy start event logger in sq...
Github user WangTaoTheTonic closed the pull request at: https://github.com/apache/spark/pull/8048
[GitHub] spark pull request: [SPARK-8676][SQL]Lazy start event logger in sq...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/8048#issuecomment-167198843 No problem, will close this.
[GitHub] spark pull request: [SPARK-2750][WEB UI]Add Https support for Web ...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5664#issuecomment-156339123 OK, once Jacky raises the PR, I will close this one.
[GitHub] spark pull request: [SPARK-11482][SQL] Make maven repo for Hive me...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/9543#issuecomment-155067126 I think it can be used by those who have a custom Hive hosted in their own Maven repository. Though using Maven to download Hive metastore jars is meant for testing, I still think this is a nice feature.
[GitHub] spark pull request: [SPARK-2750][WEB UI]Add Https support for Web ...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5664#issuecomment-152442824 @jacek-lewandowski Okay I will rebase this in a week or so.
[GitHub] spark pull request: [SPARK-2750][WEB UI]Add Https support for Web ...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5664#issuecomment-152686993 @jacek-lewandowski Sure. I'm glad for this.
[GitHub] spark pull request: SPARK-11100 HiveThriftServer not registering w...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/9113#issuecomment-147977645 With this patch, can we launch multiple Thrift Server instances and enable clients to submit applications without knowing particular Thrift Server addresses, only the ZooKeeper port?
[GitHub] spark pull request: SPARK-11100 HiveThriftServer not registering w...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/9113#issuecomment-148255083 Yeah, I think it is a nice feature which can bring HA to the Spark Thrift Server. Have you tested it yet?
[GitHub] spark pull request: [SPARK-10810] [SPARK-10902] [SQL] Improve sess...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/8909#issuecomment-146812382 Nice work! As it is a blocker, could we merge this into branch-1.5 too?
[GitHub] spark pull request: [SPARK-8552] [THRIFTSERVER] Using incorrect da...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/7118#issuecomment-144035049 @marmbrus What do you think of this fix? As the issue priority is very high, I think we'd better fix it ASAP.
[GitHub] spark pull request: [SPARK-4449][Core]Specify port range in spark
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5722#issuecomment-143799075 I think we'd better keep backwards compatibility and treat [min, max] as fine-grained control over the exact port, while "only 8080" with max retries is a coarse one. That means if the user configures a range, we ignore the max-retries function; and if the user specifies an exact value for a port, we retry from the configured value until we reach the max retries. Maybe weakening the max-retries config as a first step and removing it in the next few releases is a better idea.
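The backward-compatible behavior proposed above can be sketched as follows. This is a hypothetical illustration, not Spark's code; `candidatePorts` is a made-up helper showing the two modes: an explicit range ignores max retries, while a single port falls back to the classic sequential-retry behavior.

```scala
// Hypothetical sketch: which ports to try binding, given either a single
// configured port (Left) or an explicit [min, max] range (Right).
def candidatePorts(conf: Either[Int, (Int, Int)], maxRetries: Int): Seq[Int] =
  conf match {
    case Right((min, max)) => min to max                 // range wins; retries ignored
    case Left(port)        => port to (port + maxRetries) // classic retry behavior
  }

println(candidatePorts(Right((8080, 8082)), 16)) // only the range: 8080..8082
println(candidatePorts(Left(8080), 3))           // 8080..8083 via retries
```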
[GitHub] spark pull request: [SPARK-4122][STREAMING] Add a library that can...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/2994#issuecomment-143799798 Hi @harishreedharan, I think this is a nice feature which is very helpful for users who try to write a DStream back into Kafka. The implementation is very neat too. Could you please rebase the code so that we have the opportunity to push it into Spark?
[GitHub] spark pull request: [SPARK-4449][Core]Specify port range in spark
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5722#issuecomment-143107363 Hi guys, sorry for seeing your comments so late. I think we have already come to an agreement on the two cases this patch wants to fix: one is that some ports should bind to a specified value so that clients can connect to them without retries; the other is that for random ports (starting at 0) we want to control their ranges too. And the main problem is that the current code propagates the port range as a string everywhere; the better way is to convert the port argument at the parsing stage and then pass it as (min, max) in the argument list. Did I understand it correctly? If yes, I will rebase/rewrite the code.
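The "convert at the parsing stage" idea above can be sketched like this. `parsePortRange` is a hypothetical helper (the `"min-max"` string syntax is an assumption for illustration): parse the configured value once into a (min, max) pair, so the rest of the code never handles the raw string.

```scala
// Hypothetical sketch: turn a port setting string into a (min, max) pair
// at parse time, instead of propagating the string everywhere.
def parsePortRange(s: String): (Int, Int) = s.split("-") match {
  case Array(p)        => (p.trim.toInt, p.trim.toInt)     // single port
  case Array(min, max) => (min.trim.toInt, max.trim.toInt) // explicit range
  case _ => throw new IllegalArgumentException(s"Bad port setting: $s")
}

println(parsePortRange("8080"))        // (8080,8080)
println(parsePortRange("50000-51000")) // (50000,51000)
```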
[GitHub] spark pull request: [SPARK-5719]allow daemons to bind to specified...
Github user WangTaoTheTonic closed the pull request at: https://github.com/apache/spark/pull/4505
[GitHub] spark pull request: [SPARK-5719]allow daemons to bind to specified...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/4505#issuecomment-142156332 @JoshRosen @andrewor14 @srowen @pfxuan As the network-related configurations need to be considered globally, I will close this for now.
[GitHub] spark pull request: [SPARK-8552] [THRIFTSERVER] Using incorrect da...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/7118#issuecomment-139505601 @navis Thank you for the fix. I have tested "use $database" on a local Thrift Server and it works. There might still be a failed test that needs fixing.
[GitHub] spark pull request: [SPARK-8552] [THRIFTSERVER] Using incorrect da...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/spark/pull/7118#discussion_r3981 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/ClientWrapper.scala --- @@ -57,10 +58,9 @@ import org.apache.spark.util.{CircularBuffer, Utils} * @param initClassLoader the classloader used when creating the `state` field of --- End diff -- The comment needs to be updated to match the constructor args.
[GitHub] spark pull request: [SPARK-8552] [THRIFTSERVER] Using incorrect da...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/7118#issuecomment-139136144 I tested this patch, but got this error when executing "show databases;" using beeline:

15/09/10 15:11:02 INFO SessionState: Created HDFS directory: /tmp/hive/root/fc5c8bbe-0e63-49f1-8286-ec51c4432b94/_tmp_space.db
15/09/10 15:11:02 INFO HiveSessionImpl: Operation log session directory is created: /tmp/root/operation_logs/fc5c8bbe-0e63-49f1-8286-ec51c4432b94
15/09/10 15:11:26 ERROR SparkExecuteStatementOperation: Error running hive query as user : root
java.lang.NullPointerException
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:182)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
[GitHub] spark pull request: [SPARK-8676][SQL]Lazy start event logger in sq...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/8048 [SPARK-8676][SQL]Lazy start event logger in sql application to avoid TGT expiring in long connection Now in Thrift Server/Spark SQL, it logs in first in `Client.scala`, then opens the file stream in the event logger. After that the login action is executed twice more, once in `SparkSQLCLIService` and once in `ThriftHttpCLIService/ThriftBinaryCLIService`. If we open a long-lived file stream to HDFS between two logins, the UserGroupInformation that RPC took will be refreshed by the second login. After the TGT expires, this causes a "No valid credentials" exception. So in SQL applications, we start the event logger only after HiveServer2 has been launched. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark SPARK-8676 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8048.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8048 commit 8ca90467cb31da40da27a30e4dae6c73ada862d9 Author: WangTaoTheTonic wangtao...@huawei.com Date: 2015-08-08T06:54:25Z Lazy start event logger in sql application to avoid TGT expiring in long connection
[GitHub] spark pull request: [SPARK-9596][SQL]treat hadoop classes as share...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/7931#issuecomment-127896125 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-9596][SQL]treat hadoop classes as share...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/7931#issuecomment-128045456 @marmbrus I am not sure why it leads to a test failure. Could you help check this patch and the reason for the failure? Thanks.
[GitHub] spark pull request: [SPARK-9596][SQL]treat hadoop classes as share...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/7931#issuecomment-128232430 @marmbrus Looks like the patch is OK when excluding Hive classes. Thanks for your guidance :)
[GitHub] spark pull request: [SPARK-9596][SQL]treat hadoop classes as share...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/7931 [SPARK-9596][SQL]treat hadoop classes as shared one in IsolatedClientLoader https://issues.apache.org/jira/browse/SPARK-9596 You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark SPARK-9596 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/7931.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #7931 commit 3abe66c8c3e23e619e8b8414f66c3b5a3f4d729a Author: WangTaoTheTonic wangtao...@huawei.com Date: 2015-08-04T12:52:43Z treat hadoop classes as shared one
[GitHub] spark pull request: [SPARK-9596][SQL]treat hadoop classes as share...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/7931#issuecomment-127658671 Jenkins, test this please.
[GitHub] spark pull request: [SPARK-9596][SQL]treat hadoop classes as share...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/7931#issuecomment-127808147 The error is:

stderr FAILED: SemanticException [Error 10072]: Database does not exist: hive_test_db
[info] stderr Exception in thread main org.apache.spark.sql.execution.QueryExecutionException: FAILED: SemanticException [Error 10072]: Database does not exist: hive_test_db

It should not be produced by this patch. Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-9596][SQL]treat hadoop classes as share...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/7931#issuecomment-127811490 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-9496][SQL]do not print the password in ...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/7815#issuecomment-126584002 @rxin Like javax.jdo.option.ConnectionPassword; so far I have only found this one.
[GitHub] spark pull request: [SPARK-9496][SQL]do not print the password in ...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/7815 [SPARK-9496][SQL]do not print the password in config https://issues.apache.org/jira/browse/SPARK-9496 We had better not print the password in the log. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/7815.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #7815 commit c7a51457ca94b29b917eec10e44935db21c540d6 Author: WangTaoTheTonic wangtao...@huawei.com Date: 2015-07-31T03:30:18Z do not print the password in config
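The idea behind this PR can be sketched as redacting sensitive keys before logging. This is a hypothetical Java sketch, not the actual patch; the `ConfLogger` helper name is invented, and the only sensitive key mentioned in the discussion is javax.jdo.option.ConnectionPassword:

```java
// Hypothetical sketch: redact values of sensitive-looking config keys (such as
// javax.jdo.option.ConnectionPassword) before they reach the log.
public class ConfLogger {
    public static String render(String key, String value) {
        if (key.toLowerCase().contains("password")) {
            return key + "=***(redacted)***";  // never echo the secret itself
        }
        return key + "=" + value;
    }
}
```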
[GitHub] spark pull request: [SPARK-4449][Core]Specify port range in spark
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5722#issuecomment-118211126 @andrewor14 There are two motivations for this. The first is mentioned in https://github.com/apache/spark/pull/3314: we would like to control the retry range when a user sets some port to 0, and @srowen came up with the idea of setting each port to a range. The other motivation is mentioned in https://github.com/apache/spark/pull/5657: when we start a public port at a specified setting, it had better not retry, because once a retry happens the client will fail to connect to it.
[GitHub] spark pull request: [Spark-5111][SQL]HiveContext and Thriftserver ...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/4064#issuecomment-112679214 @zhzhan Hey, could you please describe the error and your configuration in detail? We use Hive 13 + Hadoop 2.7 in our product and have never run into this. And Spark can now work with Hive 14, per https://github.com/apache/spark/commit/4eb48ed1dadee80d78ada5d15884dd348c46ad27. Does this issue still exist?
[GitHub] spark pull request: [SPARK-8392] Improve the efficiency
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/spark/pull/6839#discussion_r32518242 --- Diff: core/src/main/scala/org/apache/spark/ui/scope/RDDOperationGraph.scala --- @@ -70,6 +70,13 @@ private[ui] class RDDOperationCluster(val id: String, private var _name: String) def getAllNodes: Seq[RDDOperationNode] = { _childNodes ++ _childClusters.flatMap(_.childNodes) } + + /** Return all the node which are cached. */ --- End diff -- Nit: Return all the node`s`
[GitHub] spark pull request: [SPARK-8290]spark class command builder need r...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/6741 [SPARK-8290]spark class command builder need read SPARK_JAVA_OPTS SPARK_JAVA_OPTS was missed when the launcher part was reconstructed; we should add it back so spark-class can read it. The missing part is [here](https://github.com/apache/spark/blob/1c30afdf94b27e1ad65df0735575306e65d148a1/bin/spark-class#L97). You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark SPARK-8290 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6741.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6741 commit e3135204e386c7b46f37b0ace848de611b5ade86 Author: WangTaoTheTonic wangtao...@huawei.com Date: 2015-06-10T07:08:25Z spark class command builder need read SPARK_JAVA_OPTS
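The behavior being restored can be sketched as follows. This is a hypothetical Java sketch, not the actual launcher code: the `CommandSketch` class and its naive whitespace tokenizing are illustrative, standing in for how the old bin/spark-class script folded SPARK_JAVA_OPTS into the java command line:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: fold the legacy SPARK_JAVA_OPTS environment variable
// back into the java command line built for a class launched via spark-class
// (e.g. beeline), as the old shell script did before the launcher rewrite.
public class CommandSketch {
    public static List<String> buildCommand(String sparkJavaOpts) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        if (sparkJavaOpts != null && !sparkJavaOpts.trim().isEmpty()) {
            // Naive whitespace split; quoted options would need real tokenizing.
            cmd.addAll(Arrays.asList(sparkJavaOpts.trim().split("\\s+")));
        }
        cmd.add("org.apache.hive.beeline.BeeLine");
        return cmd;
    }
}
```

In real use the opts string would come from `System.getenv("SPARK_JAVA_OPTS")`; it is passed as a parameter here so the sketch stays testable.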
[GitHub] spark pull request: [SPARK-8290]spark class command builder need r...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/6741#issuecomment-110823733 From the old code we can see that daemons like Master or Worker use `SPARK_DAEMON_JAVA_OPTS` and the rest use `SPARK_JAVA_OPTS`. Maybe its use is not recommended or it is deprecated (I didn't do deep research); if so we should remove it cleanly in the future instead of silently dropping it in some branch now.
[GitHub] spark pull request: [SPARK-8290]spark class command builder need r...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/6741#issuecomment-110803555 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-8290]spark class command builder need r...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/6741#issuecomment-110803206 I'm not sure I totally understood what you mean. Before, SPARK_JAVA_OPTS was injected into any class launched by `spark-class`, but afterwards it was omitted. If we run `beeline` (which goes through `spark-class`), it is very convenient to set extra Java options this way; the same goes for other processes launched by `spark-class`.
[GitHub] spark pull request: [SPARK-8290]spark class command builder need r...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/6741#issuecomment-110829628 @vanzin Thanks for the comments. I have already added the missing SPARK_DRIVER_MEMORY (shame on me for not finding that) and modified the description.
[GitHub] spark pull request: [SPARK-8290]spark class command builder need r...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/6741#issuecomment-110816364 Looks like something is wrong with RAT. Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-8273]Driver hangs up when yarn shutdown...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/6717#issuecomment-110992692 Oh, thanks Andrew.
[GitHub] spark pull request: [SPARK-8273]Driver hangs up when yarn shutdown...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/6717 [SPARK-8273]Driver hangs up when yarn shutdown in client mode In client mode, if YARN is shut down while a Spark application is running, the application will hang after several retries (default: 30) because the exception thrown by YarnClientImpl cannot be caught at the upper level. We should exit in that case so the user is aware of it. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark SPARK-8273 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6717.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6717 commit 28752d6a3914e29abb741e8d1750bb2f0ceee7b4 Author: WangTaoTheTonic wangtao...@huawei.com Date: 2015-06-09T09:24:15Z catch the throwed exception
[GitHub] spark pull request: [SPARK-8273]Driver hangs up when yarn shutdown...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/6717#issuecomment-110358988 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-8065] [hive] Add support for Hive 0.14 ...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/6627#issuecomment-109951833 A high-level question: what about Hive 1.0/1.1/1.2? It might be hard to support so many versions if there is no compatibility between them.
[GitHub] spark pull request: [Minor]make the launcher project name consiste...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/6603 [Minor]make the launcher project name consistent with others I found this by chance while building Spark and think it is better to keep its name consistent with the other sub-projects (Spark Project *). I am not going to file a JIRA as it is a pretty small issue. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark projName Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6603.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6603 commit 994b3ba60b810862342a05276acfa59869cb8171 Author: WangTaoTheTonic wangtao...@huawei.com Date: 2015-06-03T02:42:00Z make the project name consistent
[GitHub] spark pull request: [SPARK-7889] make sure click the App ID on H...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/6545#issuecomment-107430955 Hey Sean, what we want to fix is the "load once, show it forever" issue. That is to say, when a user clicks a link on the history page, the provider loads the application information from the persistence layer into memory and never updates it. So if the click happened while the application was incomplete, then no matter what operations follow, the history page will show the incomplete information forever. Obviously that is a problem. We do not have much experience with Jetty, so this may not be a perfect solution yet. I hope someone will give feedback.
[GitHub] spark pull request: [SPARK-7524][SPARK-7846]add configs for keytab...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/6051#issuecomment-107007797 @harishreedharan Any word on the confirmation? Should we make some changes in branch 1.4?
[GitHub] spark pull request: [SPARK-7945][CORE]Do trim to values in propert...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/6496#issuecomment-106998580 I've done a test with the current code; it worked fine.
[GitHub] spark pull request: [SPARK-7945][CORE]Do trim to values in propert...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/6496#issuecomment-106787780 What further confusion would this cause? I can't think of another solution that addresses this perfectly, so I did the same as `Utils.getPropertiesFromFile`.
[GitHub] spark pull request: [SPARK-7945][CORE]Do trim to values in propert...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/6496 [SPARK-7945][CORE]Do trim to values in properties file https://issues.apache.org/jira/browse/SPARK-7945 Currently, applications submitted via org.apache.spark.launcher.Main read the properties file without trimming the values in it. If a user leaves a space after a value (say spark.driver.extraClassPath), it can affect global functionality (e.g. some jar cannot be included in the classpath), so we should trim as Utils.getPropertiesFromFile does. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark SPARK-7945 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6496.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6496 commit 2c053a1d8976f9505c8d0247e520303bcd1d7e25 Author: WangTaoTheTonic wangtao...@huawei.com Date: 2015-05-29T09:45:02Z Do trim to values in properties file
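The fix's idea can be sketched in a few lines of Java. This is a hypothetical illustration, not the actual patch; the `TrimmedProps` helper is an invented name, mirroring what the Scala `Utils.getPropertiesFromFile` does with `.trim`:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch: after java.util.Properties.load(), trim each value so a
// trailing space after e.g. spark.driver.extraClassPath does not corrupt the
// classpath. Properties.load() strips leading whitespace in values but keeps
// trailing whitespace, which is exactly the bug described above.
public class TrimmedProps {
    public static Map<String, String> load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);  // StringReader will not actually throw
        }
        Map<String, String> out = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            out.put(key, props.getProperty(key).trim());  // drop stray whitespace
        }
        return out;
    }
}
```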
[GitHub] spark pull request: [SPARK-4449][Core]Specify port range in spark
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5722#issuecomment-106792667 Looks like @vanzin is okay with this; what other concerns do you have, @srowen?