[GitHub] [zeppelin] liuxunorg commented on a change in pull request #3342: [ZEPPELIN-4031] Fixed Unable to detect that the interpreter process was killed manually
liuxunorg commented on a change in pull request #3342: [ZEPPELIN-4031] Fixed Unable to detect that the interpreter process was killed manually
URL: https://github.com/apache/zeppelin/pull/3342#discussion_r268963065

## File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreter.java ##

@@ -103,7 +103,7 @@ public String getSessionId() {
   }

   public synchronized RemoteInterpreterProcess getOrCreateInterpreterProcess() throws IOException {
-    if (this.interpreterProcess != null) {
+    if (this.interpreterProcess != null && interpreterProcess.isRunning()) {
       return this.interpreterProcess;
     }
     ManagedInterpreterGroup intpGroup = getInterpreterGroup();

Review comment:
The original code is not perfect. When `RemoteInterpreter.java::getOrCreateInterpreterProcess()` is called, it must return an available RemoteInterpreter process. `interpreterProcess.isRunning()` calls `RemoteInterpreterUtils.java::checkIfRemoteEndpointAccessible()`, which checks through the socket whether the remote interpreter is available.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
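The socket-based liveness check described in this comment can be sketched roughly as follows. This is a minimal illustration, not Zeppelin's actual `RemoteInterpreterUtils` code: the class name `EndpointProbe`, the method name `isEndpointAccessible`, and its parameters are assumptions made for the example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration; not the real RemoteInterpreterUtils.
final class EndpointProbe {
    private EndpointProbe() {}

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isEndpointAccessible(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unreachable: treat the process as not running.
            return false;
        }
    }
}
```

Because a check like this performs a real connection attempt on each call, a cached non-null process reference alone says nothing about whether the process is still alive, which is the point of adding `isRunning()` to the condition.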
[GitHub] [zeppelin] liuxunorg commented on a change in pull request #3342: [ZEPPELIN-4031] Fixed Unable to detect that the interpreter process was killed manually
liuxunorg commented on a change in pull request #3342: [ZEPPELIN-4031] Fixed Unable to detect that the interpreter process was killed manually
URL: https://github.com/apache/zeppelin/pull/3342#discussion_r268965044

## File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/ManagedInterpreterGroup.java ##

@@ -58,10 +58,19 @@ public InterpreterSetting getInterpreterSetting() {
   public synchronized RemoteInterpreterProcess getOrCreateInterpreterProcess(String userName, Properties properties) throws IOException {
+    if (remoteIntpProcessIsShutdown()) {
+      LOGGER.info("Check whether the InterpreterProcess has been shutdown.");
+      // clean invalid session and dirty data of interpreterSetting
+      close();

Review comment:
When `remoteIntpProcessIsShutdown() == true`, the remote interpreter process has been found to be unavailable, so the invalid session is cleaned up by calling `close()`.
[GitHub] [zeppelin] liuxunorg commented on a change in pull request #3342: [ZEPPELIN-4031] Fixed Unable to detect that the interpreter process was killed manually
liuxunorg commented on a change in pull request #3342: [ZEPPELIN-4031] Fixed Unable to detect that the interpreter process was killed manually
URL: https://github.com/apache/zeppelin/pull/3342#discussion_r268967854

## File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/ManagedInterpreterGroup.java ##

@@ -141,17 +165,27 @@ private void close(Collection interpreters) {
   private void closeInterpreter(Interpreter interpreter) {
     Scheduler scheduler = interpreter.getScheduler();
-    for (final Job job : scheduler.getAllJobs()) {
-      job.abort();
-      job.setStatus(Job.Status.ABORT);
-      LOGGER.info("Job " + job.getJobName() + " aborted ");
-    }
+    // Need to abort the task being executed
+    // when actively shutting down the remote interpreter
+    if (false == remoteIntpProcessIsShutdown()) {

Review comment:

### 1. If the interpreter process is healthy, isRunning() must return true

`isRunning()` is not a flag; it is a function call. `interpreterProcess.isRunning()` calls `RemoteInterpreterUtils.java::checkIfRemoteEndpointAccessible()`, which checks through the socket whether the remote interpreter is available.

### 2. Avoid invalid calls

`Job.abort()` makes a remote call to the interpreter's `Interpreter.cancel(getInterpreterContext(null));` function. If the interpreter is no longer connected, `interpreter.cancel()` cannot be called correctly. Moreover, if the remote interpreter is found to be unavailable inside `job.abort()`, execution goes through `RemoteInterpreter.java::getOrCreateInterpreterProcess()` again and tries to create an interpreter process, which fails after a 30-second timeout.
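The guard argued for in point 2 can be sketched in isolation as follows. This is a simplified illustration under stated assumptions: `AbortableJob`, `InterpreterCloser`, and `abortJobsIfProcessAlive` are hypothetical names, not Zeppelin's `Job` or `ManagedInterpreterGroup` API, and the shutdown state is passed in as a plain boolean.

```java
import java.util.List;

// Hypothetical, simplified stand-in for a job whose abort() needs a remote call.
interface AbortableJob {
    void abort();
}

// Hypothetical helper for illustration; not the real ManagedInterpreterGroup.
final class InterpreterCloser {
    private InterpreterCloser() {}

    /**
     * Aborts jobs only when the remote process is still reachable.
     * Returns the number of jobs that were aborted.
     */
    static int abortJobsIfProcessAlive(boolean processIsShutdown,
                                       List<? extends AbortableJob> jobs) {
        if (processIsShutdown) {
            // abort() would need a remote call; skipping it avoids waiting
            // for a connection attempt that can only time out.
            return 0;
        }
        int aborted = 0;
        for (AbortableJob job : jobs) {
            job.abort();
            aborted++;
        }
        return aborted;
    }
}
```

The design choice is to decide once, up front, whether remote calls are possible, rather than letting each `abort()` discover the dead process on its own.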
[GitHub] [zeppelin] liuxunorg commented on issue #3342: [ZEPPELIN-4031] Fixed Unable to detect that the interpreter process was killed manually
liuxunorg commented on issue #3342: [ZEPPELIN-4031] Fixed Unable to detect that the interpreter process was killed manually URL: https://github.com/apache/zeppelin/pull/3342#issuecomment-476505669 This bug is not easy to understand through code review alone. A better way is to verify it by testing.
[GitHub] [zeppelin] zjffdu commented on issue #3339: [ZEPPELIN-4078] handle ipython kernel crash
zjffdu commented on issue #3339: [ZEPPELIN-4078] handle ipython kernel crash URL: https://github.com/apache/zeppelin/pull/3339#issuecomment-476545011 @AyWa Before I merge it, could you create a dedicated ticket for this PR? I found you have 2 PRs for the same ticket, ZEPPELIN-4078.
[GitHub] [zeppelin] zjffdu edited a comment on issue #3339: [ZEPPELIN-4078] handle ipython kernel crash
zjffdu edited a comment on issue #3339: [ZEPPELIN-4078] handle ipython kernel crash URL: https://github.com/apache/zeppelin/pull/3339#issuecomment-476545011 @AyWa Before I merge it, could you create a dedicated ticket for this PR? I found you have 2 PRs for the same ticket, ZEPPELIN-4078.
[GitHub] [zeppelin] AyWa commented on issue #3339: [ZEPPELIN-4078] handle ipython kernel crash
AyWa commented on issue #3339: [ZEPPELIN-4078] handle ipython kernel crash URL: https://github.com/apache/zeppelin/pull/3339#issuecomment-476549536 Yeah, sure. I will create a subtask for each of my 3 PRs (for now they all share the same JIRA issue).
[jira] [Created] (ZEPPELIN-4089) Execution hang after ipython kernel die
marc hurabielle created ZEPPELIN-4089:

Summary: Execution hang after ipython kernel die
Key: ZEPPELIN-4089
URL: https://issues.apache.org/jira/browse/ZEPPELIN-4089
Project: Zeppelin
Issue Type: Sub-task
Reporter: marc hurabielle
Assignee: marc hurabielle

When an IPython paragraph is running and the IPython kernel dies (for example, from running out of memory), the interpreter hangs and never completes the execution.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [zeppelin] AyWa commented on issue #3339: [ZEPPELIN-4089] handle ipython kernel crash
AyWa commented on issue #3339: [ZEPPELIN-4089] handle ipython kernel crash URL: https://github.com/apache/zeppelin/pull/3339#issuecomment-476569603 @zjffdu I updated the JIRA for this PR.
[jira] [Created] (ZEPPELIN-4090) Ipython CPU / queue improvement
marc hurabielle created ZEPPELIN-4090:

Summary: Ipython CPU / queue improvement
Key: ZEPPELIN-4090
URL: https://issues.apache.org/jira/browse/ZEPPELIN-4090
Project: Zeppelin
Issue Type: Sub-task
Reporter: marc hurabielle
Assignee: marc hurabielle

The IPython interpreter / IPython server currently has a problem with high CPU usage. The loop that reads from the pub/sub should not try to read from it continuously; it needs to be debounced. These are the action items:

* Sleep from time to time when there is no message in the pub/sub
* Use only one queue instead of 3
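The two action items above can be sketched as a single-queue reader that waits instead of spinning. The IPython server itself is not written in Java, so this is only an illustration of the idea; `SingleQueueReader`, `drainUntilIdle`, and the use of a `BlockingQueue` in place of the pub/sub channel are all assumptions of the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical reader for illustration; stands in for the pub/sub read loop.
final class SingleQueueReader {
    private SingleQueueReader() {}

    /**
     * Reads messages from one shared queue until it stays empty for
     * idleTimeoutMillis, then returns everything that was received.
     */
    static List<String> drainUntilIdle(BlockingQueue<String> queue, long idleTimeoutMillis)
            throws InterruptedException {
        List<String> received = new ArrayList<>();
        while (true) {
            // poll with a timeout parks the thread while the queue is empty,
            // so an idle loop does not burn CPU by re-checking immediately.
            String message = queue.poll(idleTimeoutMillis, TimeUnit.MILLISECONDS);
            if (message == null) {
                return received;
            }
            received.add(message);
        }
    }
}
```

A blocking read with a timeout plays the role of the "sleep when there is no message" item, and funnelling all messages through one queue keeps their relative order without coordinating several readers.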
[jira] [Created] (ZEPPELIN-4091) Ipython hang when concurrent auto complete / run
marc hurabielle created ZEPPELIN-4091:

Summary: Ipython hang when concurrent auto complete / run
Key: ZEPPELIN-4091
URL: https://issues.apache.org/jira/browse/ZEPPELIN-4091
Project: Zeppelin
Issue Type: Sub-task
Reporter: marc hurabielle
Assignee: marc hurabielle

The IPython interpreter / IPython server has a problem when paragraph execution and auto complete run at the same time, in parallel. It can make a paragraph hang forever (until the IPython server is restarted). This may be related to [https://github.com/jupyter/jupyter_client/issues/429].

Overall, most of these bugs might also be related to jupyter_client bugs or to wrong usage of it. However, this is the action item:

* Synchronize auto complete / paragraph execution for now
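One way to read the action item above is to route both kinds of request through one lock, so they can never overlap. The sketch below illustrates that idea only; `KernelGate`, `execute`, and `complete` are hypothetical names, and the requests are modelled as plain `Supplier` callbacks rather than real kernel calls.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical gate for illustration; not the actual interpreter code.
final class KernelGate {
    private final ReentrantLock lock = new ReentrantLock();

    /** Runs a paragraph execution request while holding the shared lock. */
    <T> T execute(Supplier<T> request) {
        return runExclusively(request);
    }

    /** Runs an auto complete request while holding the same shared lock. */
    <T> T complete(Supplier<T> request) {
        return runExclusively(request);
    }

    private <T> T runExclusively(Supplier<T> request) {
        lock.lock();
        try {
            return request.get();
        } finally {
            // Always release, even if the request throws.
            lock.unlock();
        }
    }
}
```

The trade-off is that auto complete waits while a long paragraph runs, which matches the "for now" wording of the action item: it trades responsiveness for not hanging.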
[GitHub] [zeppelin] asfgit closed pull request #3339: [ZEPPELIN-4089] handle ipython kernel crash
asfgit closed pull request #3339: [ZEPPELIN-4089] handle ipython kernel crash URL: https://github.com/apache/zeppelin/pull/3339
[GitHub] [zeppelin] Leemoonsoo commented on issue #3341: ZEPPELIN-4038. Deprecate spark 2.2 and earlier
Leemoonsoo commented on issue #3341: ZEPPELIN-4038. Deprecate spark 2.2 and earlier URL: https://github.com/apache/zeppelin/pull/3341#issuecomment-476744033 I've seen companies that still depend on Spark 1.6 and Spark 2.2, and I guess many of them cannot upgrade Spark easily. How about making a longer-term plan for Spark support and deprecation, and publishing it on the website or in the documentation? E.g. "Support the last N Spark releases. Deprecate after X." I think sharing a long-term plan will help users have the right expectations about using Zeppelin with Spark.
[DISCUSS] Notebook serving
Hi,

There are some challenges in bringing a model from a notebook to a production environment. In many organizations, the most common practice I see today is something like:

1. A data scientist develops a model in a data science notebook.
2. A software engineer rewrites the model to meet the production requirements.

In other words, data scientists do not have self-service capability, and the organization spends a lot of time reimplementing the model for production.

I tried to identify the gaps between the data science notebook and the production environment, and what could address them, so that models created by data scientists in the notebook can go to production with minimal effort.

I made a proposal to solve this problem. Please review and comment. Any ideas and feedback are welcome. You can modify the document if needed.

https://docs.google.com/document/d/1YA6q8W9yO8a88xzLDYs9zv_fKu2_cnB58rmQbakxi1I/edit?usp=sharing

This document is linked from https://issues.apache.org/jira/browse/ZEPPELIN-3994

Thanks,
moon
[GitHub] [zeppelin] zjffdu commented on issue #3341: ZEPPELIN-4038. Deprecate spark 2.2 and earlier
zjffdu commented on issue #3341: ZEPPELIN-4038. Deprecate spark 2.2 and earlier URL: https://github.com/apache/zeppelin/pull/3341#issuecomment-476920721 Thanks @Leemoonsoo, it's a good point. My initial plan is to support them before Zeppelin 1.0. The final not-supported date is not determined yet.
[GitHub] [zeppelin] zjffdu commented on issue #3343: [ZEPPELIN-4063] Don't include noteId for constructing Interpreter GroupId when under isolated per user mode
zjffdu commented on issue #3343: [ZEPPELIN-4063] Don't include noteId for constructing Interpreter GroupId when under isolated per user mode URL: https://github.com/apache/zeppelin/pull/3343#issuecomment-476922318 Will merge if there are no more comments.
[GitHub] [zeppelin] zjffdu commented on issue #3338: ZEPPELIN-4081. when the python process is killed, the task state is still running
zjffdu commented on issue #3338: ZEPPELIN-4081. when the python process is killed,the task state is still running URL: https://github.com/apache/zeppelin/pull/3338#issuecomment-476922250 Will merge if no more comments
[jira] [Created] (ZEPPELIN-4092) Upgrade livy to 0.6
Jeff Zhang created ZEPPELIN-4092:

Summary: Upgrade livy to 0.6
Key: ZEPPELIN-4092
URL: https://issues.apache.org/jira/browse/ZEPPELIN-4092
Project: Zeppelin
Issue Type: Improvement
Reporter: Jeff Zhang

The Livy 0.6 RC vote has passed; it would be nice to upgrade Livy in Zeppelin to 0.6.
[GitHub] [zeppelin] jongyoul commented on issue #3335: [MINOR] Refactor CronJob class
jongyoul commented on issue #3335: [MINOR] Refactor CronJob class URL: https://github.com/apache/zeppelin/pull/3335#issuecomment-476978469 To make this easier to review: this PR prevents the quartz thread from running when we don't set or use the scheduler features. With this PR applied, the quartz scheduler is initialized only if we enable the feature in the configuration.
[jira] [Created] (ZEPPELIN-4093) Replace some listeners and callback to use queue
Jongyoul Lee created ZEPPELIN-4093:

Summary: Replace some listeners and callback to use queue
Key: ZEPPELIN-4093
URL: https://issues.apache.org/jira/browse/ZEPPELIN-4093
Project: Zeppelin
Issue Type: Improvement
Reporter: Jongyoul Lee
Assignee: Jongyoul Lee

Zeppelin has many listeners and callbacks that components use to communicate and to pass behaviors to other components. This makes the code hard to read and its behavior hard to predict. It would be better to use queues to pass messages to other components, and to websockets as well.
[GitHub] [zeppelin] felixcheung commented on issue #3342: [ZEPPELIN-4031] Fixed Unable to detect that the interpreter process was killed manually
felixcheung commented on issue #3342: [ZEPPELIN-4031] Fixed Unable to detect that the interpreter process was killed manually URL: https://github.com/apache/zeppelin/pull/3342#issuecomment-476984843 OK, thanks for explaining. I'm OK with this. IMO it might be worthwhile to refactor the code to make it more straightforward, perhaps?
[GitHub] [zeppelin] felixcheung commented on issue #3341: ZEPPELIN-4038. Deprecate spark 2.2 and earlier
felixcheung commented on issue #3341: ZEPPELIN-4038. Deprecate spark 2.2 and earlier URL: https://github.com/apache/zeppelin/pull/3341#issuecomment-476985444 That's a fair point. Maybe we should keep them for 0.9 and 0.8 and remove them only in 1.0.