Re: Zeppelin Community Meeting Notes 3/4/2019

2019-03-18 Thread Xun Liu
Thanks Mei! :-)

> On Mar 19, 2019, at 1:07 PM, Jagjeet Malhi wrote:
> 
> Mei - Thanks! 
> 
> 
>> On Mar 18, 2019, at 6:48 PM, Jeff Zhang wrote:
>> 
>> Thanks Mei for organizing this; it is very helpful for the Zeppelin community.
>> 
>> On Tue, Mar 19, 2019 at 12:53 AM, Mei Long <ml...@zepl.com> wrote:
>> March 04, 2019
>> 
>> Recording 
>> Moderator: Mei Long
>> Note Taker: Mei Long
>> 
>> - Kicking off community meetings (Mei Long)
>>   - Objectives
>>   - Format (30 minutes/monthly)
>>     - Time zone - push it to a later time slot next time for Asia (Jeff Zhang
>>       and the Korea team)
>>   - Location
>>     - Zoom
>>     - Wiki
>>     - Shared file storage
>>   - Guidelines
>>   - Other community activities
>>   - Moon provides feedback
>> - Apache Zeppelin releases (Moonsoo Lee)
>>   - Stable release 0.9?
>>   - Feedback from the community
>>     - Felix
>>     - Fred
>>     - Paul
>>   - Zeppelin 2.0?
>> - Future project proposal and roadmap - Moon
>> - Additional comments - Fred opened JIRA:
>>   https://issues.apache.org/jira/browse/ZEPPELIN-4029
>> 
>> Action Items
>> 
>> Mei Long
>> 
>> - Set up common public area for the following (subject to change)
>>   - Community meeting guidelines (Apache Zeppelin wiki)
>>   - Community calendar (gcalendar)
>>   - Meeting agenda (gdoc)
>>   - Perm meeting link (zoom: https://zoom.us/my/apachezeppelin)
>>   - Meeting notes (dev@zeppelin.apache.org and us...@zeppelin.apache.org)
>>   - Find a common time slot that works for everybody
>> 
>> Notetaker volunteers always welcome!
>> 
>> - People will take notes throughout the meeting
>> - Feel free to fill your name in Note Taker: and collaborate
>> 
>> 
>> 
>> -- 
>> Best Regards
>> 
>> Jeff Zhang
> 



Re: Zeppelin Community Meeting Notes 3/4/2019

2019-03-18 Thread Mei Long
It's my pleasure. Thank you all. :-)

On Tue, Mar 19, 2019 at 1:07 AM Jagjeet Malhi  wrote:

> Mei - Thanks!
>
>
> On Mar 18, 2019, at 6:48 PM, Jeff Zhang  wrote:
>
> Thanks Mei for organizing this; it is very helpful for the Zeppelin community.
>
> On Tue, Mar 19, 2019 at 12:53 AM, Mei Long wrote:
>
>> March 04, 2019
>> Recording 
>>
>> Moderator: Mei Long
>> Note Taker: Mei Long
>>
>> - Kicking off community meetings (Mei Long)
>>   - Objectives
>>   - Format (30 minutes/monthly)
>>     - Time zone - push it to a later time slot next time for Asia
>>       (Jeff Zhang and the Korea team)
>>   - Location
>>     - Zoom
>>     - Wiki
>>     - Shared file storage
>>   - Guidelines
>>   - Other community activities
>>   - Moon provides feedback
>> - Apache Zeppelin releases (Moonsoo Lee)
>>   - Stable release 0.9?
>>   - Feedback from the community
>>     - Felix
>>     - Fred
>>     - Paul
>>   - Zeppelin 2.0?
>> - Future project proposal and roadmap - Moon
>> - Additional comments - Fred opened JIRA:
>>   https://issues.apache.org/jira/browse/ZEPPELIN-4029
>>
>> Action Items
>>
>> Mei Long
>>
>> - Set up common public area for the following (subject to change)
>>   - Community meeting guidelines (Apache Zeppelin wiki)
>>   - Community calendar (gcalendar)
>>   - Meeting agenda (gdoc)
>>   - Perm meeting link (zoom: https://zoom.us/my/apachezeppelin)
>>   - Meeting notes (dev@zeppelin.apache.org and us...@zeppelin.apache.org)
>>   - Find a common time slot that works for everybody
>>
>> Notetaker volunteers always welcome!
>>
>> - People will take notes throughout the meeting
>> - Feel free to fill your name in Note Taker: and collaborate
>>
>>
>>
>
> --
> Best Regards
>
> Jeff Zhang
>
>
>


[GitHub] [zeppelin] liuxunorg commented on a change in pull request #3331: [Zeppelin-4049] Hadoop Submarine (Machine Learning) interpreter

2019-03-18 Thread GitBox
liuxunorg commented on a change in pull request #3331: [Zeppelin-4049] Hadoop 
Submarine (Machine Learning) interpreter
URL: https://github.com/apache/zeppelin/pull/3331#discussion_r266725683
 
 

 ##
 File path: submarine/pom.xml
 ##
 @@ -0,0 +1,166 @@
+
+
+http://maven.apache.org/POM/4.0.0";
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd";>
+  4.0.0
+
+  
+zeppelin-interpreter-parent
+org.apache.zeppelin
+0.9.0-SNAPSHOT
+../zeppelin-interpreter-parent/pom.xml
+  
+
+  org.apache.zeppelin
+  zeppelin-submarine
+  jar
+  0.9.0-SNAPSHOT
+  Zeppelin: Submarine interpreter
+
+  
+
+submarine
+2.7.3
+2.4.0
+0.3.8
+20.0
+
+1.3
+  
+
+  
+
+  org.apache.zeppelin
+  zeppelin-python
+  0.9.0-SNAPSHOT
+
+
+  org.apache.zeppelin
+  zeppelin-shell
+  0.9.0-SNAPSHOT
+
+
+  org.hamcrest
+  hamcrest-all
+  ${hamcrest.all.version}
+  test
+
+
+  org.apache.hadoop
+  hadoop-common
+  ${hadoop.version}
+  
+
+  org.apache.commons
+  commons-compress
+
+
+  com.google.guava
+  guava
+
+
+  org.codehaus.jackson
+  jackson-mapper-asl
+
+
+  org.codehaus.jackson
+  jackson-xc
+
+
+  org.codehaus.jackson
+  jackson-jaxrs
+
+
+  org.codehaus.jackson
+  jackson-core-asl
+
+  
+
+
+  org.apache.hadoop
+  hadoop-hdfs
+  ${hadoop.version}
+  
+
+  com.google.guava
+  guava
+
+
+  io.netty
+  netty
+
+  
+
+
+  com.hubspot.jinjava
 
 Review comment:
   It's Apache License 2.0.
   https://github.com/HubSpot/jinjava/blob/master/LICENSE


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [zeppelin] liuxunorg commented on a change in pull request #3331: [Zeppelin-4049] Hadoop Submarine (Machine Learning) interpreter

2019-03-18 Thread GitBox
liuxunorg commented on a change in pull request #3331: [Zeppelin-4049] Hadoop 
Submarine (Machine Learning) interpreter
URL: https://github.com/apache/zeppelin/pull/3331#discussion_r266725506
 
 

 ##
 File path: 
submarine/src/main/java/org/apache/zeppelin/submarine/IPySubmarineInterpreter.java
 ##
 @@ -0,0 +1,44 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zeppelin.submarine;
+
+import org.apache.zeppelin.interpreter.InterpreterException;
+import org.apache.zeppelin.python.IPythonInterpreter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Properties;
+
+public class IPySubmarineInterpreter extends IPythonInterpreter {
+
+  private static final Logger LOGGER = 
LoggerFactory.getLogger(IPySubmarineInterpreter.class);
+
+  private PySubmarineInterpreter pySubmarineInterpreter;
+
+  public IPySubmarineInterpreter(Properties properties) {
+super(properties);
+  }
+
+  @Override
+  public void open() throws InterpreterException {
+PySubmarineInterpreter pySparkInterpreter =
+
getInterpreterInTheSameSessionByClassName(PySubmarineInterpreter.class, false);
+
+pySubmarineInterpreter
 
 Review comment:
   Currently we are using the `pySubmarineInterpreter`.
   Later, when we support the `IpySubmarineInterpreter`, I will add it again.
   I based this on the code of the `Zeppelin Python Interpreter`.
   I am not familiar with `ipython` at present, but the current code works 
normally.
   Therefore, to keep the system stable, I would like to retain this part of 
the code.




[GitHub] [zeppelin] liuxunorg commented on a change in pull request #3331: [Zeppelin-4049] Hadoop Submarine (Machine Learning) interpreter

2019-03-18 Thread GitBox
liuxunorg commented on a change in pull request #3331: [Zeppelin-4049] Hadoop 
Submarine (Machine Learning) interpreter
URL: https://github.com/apache/zeppelin/pull/3331#discussion_r266724575
 
 

 ##
 File path: 
scripts/docker/submarine/1.0.0/zeppelin-cpu-submarine-interpreter/Dockerfile
 ##
 @@ -0,0 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+FROM zeppelin-cpu-tensorflow_1.13.0-hadoop_3.1.2:1.0.0
 
 Review comment:
   `zeppelin-cpu-tensorflow_1.13.0-hadoop_3.1.2` is the base image that we 
need to provide and push to Docker Hub for users, so it is not on Docker Hub 
yet.
   The Dockerfile for `zeppelin-cpu-tensorflow_1.13.0-hadoop_3.1.2` is in 
`scripts/docker/submarine/1.0.0/zeppelin-cpu-tensorflow_1.13.1-hadoop_3.1.2/Dockerfile`.
   Regarding the name of the image, felixcheung also made some comments.
   I would like to name the images as follows:
    Base Docker Image:(Used to run the tensorflow algorithm in yarn)
   ```
   zeppelin hadoop submarine cpu base image name: zeppelin-submarine-cpu:1.0.0, 
Comment: tensorflow_1.13.0-cpu & hadoop_3.1.2
   zeppelin hadoop submarine gpu base image name: zeppelin-submarine-gpu:1.0.0, 
Comment: tensorflow_1.13.0-gpu & hadoop_3.1.2
   ```
   
    Interpreter Docker Image:(Used to run the submarine interpreter in yarn)
   ```
   zeppelin hadoop submarine Interpreter cpu image name: 
zeppelin-submarine-Interpreter-cpu:1.0.0, Comment: tensorflow_1.13.0-cpu & 
hadoop_3.1.2
   zeppelin hadoop submarine Interpreter gpu image name: 
zeppelin-submarine-Interpreter-gpu:1.0.0, Comment: tensorflow_1.13.0-gpu & 
hadoop_3.1.2
   ```
   
    Runtime Mount zeppelin lib
   There is no need to pre-install zeppelin in the docker image.
   When zeppelin launches the submarine interpreter via `LaunchOnYarn`,
   it mounts the zeppelin submarine lib jar into the docker container.
   This mode also makes version updates easier and reduces the size of the image.
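
The naming scheme proposed in this comment can be sketched as a small helper. This is only an illustration: `SubmarineImageNames`, `Device`, `baseImage` and `interpreterImage` are hypothetical names that do not appear in the PR. The sketch also lowercases `Interpreter`, on the assumption that the images will be pushed to a Docker registry, where repository names have to be lowercase.

```java
import java.util.Locale;

// Hypothetical helper, not part of the PR: builds image references that follow
// the zeppelin-submarine-* naming scheme proposed in the review comment.
class SubmarineImageNames {

  enum Device { CPU, GPU }

  // Base image, used to run the tensorflow algorithm in YARN.
  static String baseImage(Device device, String version) {
    return "zeppelin-submarine-" + suffix(device) + ":" + version;
  }

  // Interpreter image, used to run the submarine interpreter in YARN.
  // Assumption: "interpreter" is lowercased because Docker repository
  // names must be lowercase.
  static String interpreterImage(Device device, String version) {
    return "zeppelin-submarine-interpreter-" + suffix(device) + ":" + version;
  }

  private static String suffix(Device device) {
    return device.name().toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    System.out.println(baseImage(Device.CPU, "1.0.0"));
    System.out.println(interpreterImage(Device.GPU, "1.0.0"));
  }
}
```

Keeping the tensorflow and hadoop versions out of the image name, as the comment suggests, means only the tag has to change when the bundled versions change.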




[GitHub] [zeppelin] zjffdu commented on a change in pull request #3331: [Zeppelin-4049] Hadoop Submarine (Machine Learning) interpreter

2019-03-18 Thread GitBox
zjffdu commented on a change in pull request #3331: [Zeppelin-4049] Hadoop 
Submarine (Machine Learning) interpreter
URL: https://github.com/apache/zeppelin/pull/3331#discussion_r266724130
 
 

 ##
 File path: submarine/pom.xml
 ##
 @@ -0,0 +1,166 @@
+
+
+http://maven.apache.org/POM/4.0.0";
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd";>
+  4.0.0
+
+  
+zeppelin-interpreter-parent
+org.apache.zeppelin
+0.9.0-SNAPSHOT
+../zeppelin-interpreter-parent/pom.xml
+  
+
+  org.apache.zeppelin
+  zeppelin-submarine
+  jar
+  0.9.0-SNAPSHOT
+  Zeppelin: Submarine interpreter
+
+  
+
+submarine
+2.7.3
+2.4.0
+0.3.8
+20.0
+
+1.3
+  
+
+  
+
+  org.apache.zeppelin
+  zeppelin-python
+  0.9.0-SNAPSHOT
+
+
+  org.apache.zeppelin
+  zeppelin-shell
+  0.9.0-SNAPSHOT
+
+
+  org.hamcrest
+  hamcrest-all
+  ${hamcrest.all.version}
+  test
+
+
+  org.apache.hadoop
+  hadoop-common
+  ${hadoop.version}
+  
+
+  org.apache.commons
+  commons-compress
+
+
+  com.google.guava
+  guava
+
+
+  org.codehaus.jackson
+  jackson-mapper-asl
+
+
+  org.codehaus.jackson
+  jackson-xc
+
+
+  org.codehaus.jackson
+  jackson-jaxrs
+
+
+  org.codehaus.jackson
+  jackson-core-asl
+
+  
+
+
+  org.apache.hadoop
+  hadoop-hdfs
+  ${hadoop.version}
+  
+
+  com.google.guava
+  guava
+
+
+  io.netty
+  netty
+
+  
+
+
+  com.hubspot.jinjava
 
 Review comment:
   Can you check its license?




[GitHub] [zeppelin] zjffdu commented on a change in pull request #3331: [Zeppelin-4049] Hadoop Submarine (Machine Learning) interpreter

2019-03-18 Thread GitBox
zjffdu commented on a change in pull request #3331: [Zeppelin-4049] Hadoop 
Submarine (Machine Learning) interpreter
URL: https://github.com/apache/zeppelin/pull/3331#discussion_r266723502
 
 

 ##
 File path: 
submarine/src/main/java/org/apache/zeppelin/submarine/IPySubmarineInterpreter.java
 ##
 @@ -0,0 +1,44 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zeppelin.submarine;
+
+import org.apache.zeppelin.interpreter.InterpreterException;
+import org.apache.zeppelin.python.IPythonInterpreter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Properties;
+
+public class IPySubmarineInterpreter extends IPythonInterpreter {
+
+  private static final Logger LOGGER = 
LoggerFactory.getLogger(IPySubmarineInterpreter.class);
+
+  private PySubmarineInterpreter pySubmarineInterpreter;
+
+  public IPySubmarineInterpreter(Properties properties) {
+super(properties);
+  }
+
+  @Override
+  public void open() throws InterpreterException {
+PySubmarineInterpreter pySparkInterpreter =
+
getInterpreterInTheSameSessionByClassName(PySubmarineInterpreter.class, false);
+
+pySubmarineInterpreter
 
 Review comment:
   `pySubmarineInterpreter` is never used?
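
The hunk above is cut off at the commented line, so this message does not show what the PR actually does with the field. As a generic illustration of why a reviewer would ask, the sketch below contrasts storing a lookup result in a local variable with storing it in the field. `Helper`, `LocalOnly` and `FieldAssigned` are hypothetical stand-ins, not Zeppelin classes.

```java
// Hypothetical stand-ins for illustration only; these are not Zeppelin classes.
class FieldAssignmentSketch {

  static class Helper { }

  // The result is stored in a local variable, so the field is never assigned.
  static class LocalOnly {
    private Helper helper;

    void open() {
      Helper localHelper = new Helper(); // goes out of scope when open() returns
    }

    boolean isReady() {
      return helper != null;
    }
  }

  // The result is stored in the field, so later methods can use it.
  static class FieldAssigned {
    private Helper helper;

    void open() {
      this.helper = new Helper();
    }

    boolean isReady() {
      return helper != null;
    }
  }

  public static void main(String[] args) {
    LocalOnly localOnly = new LocalOnly();
    localOnly.open();
    FieldAssigned fieldAssigned = new FieldAssigned();
    fieldAssigned.open();
    System.out.println(localOnly.isReady());     // prints false
    System.out.println(fieldAssigned.isReady()); // prints true
  }
}
```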




[GitHub] [zeppelin] zjffdu commented on a change in pull request #3331: [Zeppelin-4049] Hadoop Submarine (Machine Learning) interpreter

2019-03-18 Thread GitBox
zjffdu commented on a change in pull request #3331: [Zeppelin-4049] Hadoop 
Submarine (Machine Learning) interpreter
URL: https://github.com/apache/zeppelin/pull/3331#discussion_r266722847
 
 

 ##
 File path: 
scripts/docker/submarine/1.0.0/zeppelin-cpu-submarine-interpreter/Dockerfile
 ##
 @@ -0,0 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+FROM zeppelin-cpu-tensorflow_1.13.0-hadoop_3.1.2:1.0.0
 
 Review comment:
   Sorry, I see it further down in this PR.




[GitHub] [zeppelin] zjffdu commented on a change in pull request #3331: [Zeppelin-4049] Hadoop Submarine (Machine Learning) interpreter

2019-03-18 Thread GitBox
zjffdu commented on a change in pull request #3331: [Zeppelin-4049] Hadoop 
Submarine (Machine Learning) interpreter
URL: https://github.com/apache/zeppelin/pull/3331#discussion_r266722700
 
 

 ##
 File path: 
scripts/docker/submarine/1.0.0/zeppelin-cpu-submarine-interpreter/Dockerfile
 ##
 @@ -0,0 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+FROM zeppelin-cpu-tensorflow_1.13.0-hadoop_3.1.2:1.0.0
 
 Review comment:
   Where is `zeppelin-cpu-tensorflow_1.13.0-hadoop_3.1.2` ? I could not find it 
in dockerhub




[GitHub] [zeppelin] zjffdu commented on a change in pull request #3331: [Zeppelin-4049] Hadoop Submarine (Machine Learning) interpreter

2019-03-18 Thread GitBox
zjffdu commented on a change in pull request #3331: [Zeppelin-4049] Hadoop 
Submarine (Machine Learning) interpreter
URL: https://github.com/apache/zeppelin/pull/3331#discussion_r266722700
 
 

 ##
 File path: 
scripts/docker/submarine/1.0.0/zeppelin-cpu-submarine-interpreter/Dockerfile
 ##
 @@ -0,0 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+FROM zeppelin-cpu-tensorflow_1.13.0-hadoop_3.1.2:1.0.0
 
 Review comment:
   Where is it ? I could not find it in dockerhub




[GitHub] [zeppelin] liuxunorg commented on a change in pull request #3331: [Zeppelin-4049] Hadoop Submarine (Machine Learning) interpreter

2019-03-18 Thread GitBox
liuxunorg commented on a change in pull request #3331: [Zeppelin-4049] Hadoop 
Submarine (Machine Learning) interpreter
URL: https://github.com/apache/zeppelin/pull/3331#discussion_r266718533
 
 

 ##
 File path: .travis.yml
 ##
 @@ -89,7 +89,7 @@ matrix:
 # Test interpreter modules
 - jdk: "oraclejdk8"
   dist: trusty
-  env: PYTHON="3" SPARKR="true" SCALA_VER="2.10" PROFILE="-Pscala-2.10" 
BUILD_FLAG="install -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" 
MODULES="-pl $(echo 
.,zeppelin-interpreter,zeppelin-interpreter-api,${INTERPRETERS} | sed 
's/!//g')" TEST_PROJECTS=""
+  env: PYTHON="3" SPARKR="true" SCALA_VER="2.10" TENSORFLOW="1.0.0" 
PROFILE="-Pscala-2.10" BUILD_FLAG="install -DskipTests -DskipRat -am" 
TEST_FLAG="test -DskipRat" MODULES="-pl $(echo 
.,zeppelin-interpreter,zeppelin-interpreter-api,${INTERPRETERS} | sed 
's/!//g')" TEST_PROJECTS=""
 
 Review comment:
   The test case `PySubmarineInterpreterTest::testTensorflow()` just imports 
the tensorflow library and then calls the tensorflow function that returns the 
version number.
   If that fails, an exception is thrown, so the test case does not depend on 
the version of tensorflow.
   Production use goes through 
`scripts/docker/submarine/1.0.0/zeppelin-cpu-tensorflow_1.13.1-hadoop_3.1.2/Dockerfile`.




[GitHub] [zeppelin] zjffdu commented on a change in pull request #3331: [Zeppelin-4049] Hadoop Submarine (Machine Learning) interpreter

2019-03-18 Thread GitBox
zjffdu commented on a change in pull request #3331: [Zeppelin-4049] Hadoop 
Submarine (Machine Learning) interpreter
URL: https://github.com/apache/zeppelin/pull/3331#discussion_r266717421
 
 

 ##
 File path: .travis.yml
 ##
 @@ -89,7 +89,7 @@ matrix:
 # Test interpreter modules
 - jdk: "oraclejdk8"
   dist: trusty
-  env: PYTHON="3" SPARKR="true" SCALA_VER="2.10" PROFILE="-Pscala-2.10" 
BUILD_FLAG="install -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" 
MODULES="-pl $(echo 
.,zeppelin-interpreter,zeppelin-interpreter-api,${INTERPRETERS} | sed 
's/!//g')" TEST_PROJECTS=""
+  env: PYTHON="3" SPARKR="true" SCALA_VER="2.10" TENSORFLOW="1.0.0" 
PROFILE="-Pscala-2.10" BUILD_FLAG="install -DskipTests -DskipRat -am" 
TEST_FLAG="test -DskipRat" MODULES="-pl $(echo 
.,zeppelin-interpreter,zeppelin-interpreter-api,${INTERPRETERS} | sed 
's/!//g')" TEST_PROJECTS=""
 
 Review comment:
   Is tensorflow 1.0.0 the latest version that submarine supports ?




Re: [discuss] Zeppelin support workflow

2019-03-18 Thread Xun Liu
Hi, Mei Long

I am very happy to be able to attend the Zeppelin community meeting. 
When is the next meeting? Should I wait for a community email notification?

The ticket for the Zeppelin workflow is here: 
https://issues.apache.org/jira/browse/ZEPPELIN-4018 
Everyone is welcome to take a look.

> On Mar 19, 2019, at 1:04 AM, Mei Long wrote:
> 
> Very cool! @Xun Liu Would you like to talk about it at our next Apache
> Zeppelin community meeting?
> 
> On Sat, Mar 16, 2019 at 1:00 PM Felix Cheung 
> wrote:
> 
>> I like it!
>> 
>> 
>> From: Jongyoul Lee 
>> Sent: Monday, March 11, 2019 9:05:03 PM
>> To: dev
>> Subject: Re: [discuss] Zeppelin support workflow
>> 
>> Thanks for sharing this discussion.
>> 
>> I'm interested in it and will take a look.
>> 
>> On Mon, Mar 11, 2019 at 10:43 AM Xun Liu  wrote:
>> 
>>> Hello, everyone
>>> 
>>> Zeppelin has more than 20 interpreters, so data analysts can use it for
>>> many kinds of data development, and much of that work is interdependent.
>>> For example, developing machine learning algorithms relies on Spark to
>>> preprocess the data.
>>> 
>>> Zeppelin should have built-in workflow capabilities instead of relying on
>>> external software to schedule notes, for the following reasons:
>>> 
>>> 1. We have moved from the data processing era to the algorithm era. With
>>> its own workflow, Zeppelin will offer a complete ecosystem that covers
>>> both data processing and algorithm operations.
>>> 2. Zeppelin's powerful interactive processing helps algorithm engineers
>>> work more productively. Zeppelin should give the algorithm engineer more
>>> direct control, instead of handing the algorithm to other teams (or
>>> software) to run the workflow.
>>> 3. Zeppelin knows more about the processing status of data than Azkaban
>>> and Airflow, so a built-in workflow will have better performance, user
>>> experience, and control.
>>> 
>>> Typical use case
>>> Workflows matter especially in machine learning, because machine learning
>>> tasks generally run for a long time.
>>> A typical example is as follows:
>>> 1) Obtain data from HDFS through Spark;
>>> 2) Clean and convert the data through Spark SQL;
>>> 3) Extract features from the data through Spark;
>>> 4) Write the TensorFlow algorithm through Hadoop Submarine;
>>> 5) Distribute the TensorFlow algorithm as a job to YARN or k8s for batch
>>> processing;
>>> 6) Publish the model obtained from training and provide online prediction
>>> services;
>>> 7) Run model prediction with Flink;
>>> 8) Receive incremental data through Flink to update the model
>>> incrementally.
>>> Zeppelin therefore especially needs the ability to arrange workflows.
>>> 
>>> I have completed a draft of the Zeppelin workflow system design. Please
>>> review it; you can modify the document directly or add comments.
>>> 
>>> JIRA: https://issues.apache.org/jira/browse/ZEPPELIN-4018
>>> gdoc:
>>> https://docs.google.com/document/d/1pQjVifOC1knPBuw3LVvby7GyNDXaeBq1ltRg6x4vDxM/edit
>>> 
>>> :-)
>>> 
>>> Xun Liu
>>> 2019-03-11
>> 
>> 
>> 
>> --
>> 이종열, Jongyoul Lee, 李宗烈
>> http://madeng.net
>> 
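
The typical use case in the quoted proposal (steps 1 to 8) can be read as a dependency graph of notes. The following is a minimal sketch of that idea, not the design from ZEPPELIN-4018: `WorkflowSketch`, `step` and `executionOrder` are hypothetical names, the step labels are made up for illustration, and the dependency edges (in particular, both Flink steps depending on the published model) are an assumption about how the list fits together. The ordering uses Kahn's topological sort.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these names are not Zeppelin APIs and are not taken
// from the ZEPPELIN-4018 design document.
class WorkflowSketch {

  // Step name -> names of the steps that must finish before it can start.
  private final Map<String, List<String>> dependencies = new LinkedHashMap<>();

  WorkflowSketch step(String name, String... dependsOn) {
    dependencies.put(name, new ArrayList<>(Arrays.asList(dependsOn)));
    return this;
  }

  // Returns one valid execution order using Kahn's topological sort.
  List<String> executionOrder() {
    Map<String, Integer> unmet = new LinkedHashMap<>();
    Map<String, List<String>> dependents = new LinkedHashMap<>();
    for (String name : dependencies.keySet()) {
      unmet.put(name, 0);
      dependents.put(name, new ArrayList<>());
    }
    for (Map.Entry<String, List<String>> entry : dependencies.entrySet()) {
      for (String dependency : entry.getValue()) {
        if (!dependencies.containsKey(dependency)) {
          throw new IllegalArgumentException("Unknown step: " + dependency);
        }
        unmet.merge(entry.getKey(), 1, Integer::sum);
        dependents.get(dependency).add(entry.getKey());
      }
    }

    // Steps with no unmet dependencies can run immediately.
    Deque<String> ready = new ArrayDeque<>();
    for (Map.Entry<String, Integer> entry : unmet.entrySet()) {
      if (entry.getValue() == 0) {
        ready.add(entry.getKey());
      }
    }

    List<String> order = new ArrayList<>();
    while (!ready.isEmpty()) {
      String current = ready.remove();
      order.add(current);
      // Finishing a step may unblock the steps that depend on it.
      for (String next : dependents.get(current)) {
        if (unmet.merge(next, -1, Integer::sum) == 0) {
          ready.add(next);
        }
      }
    }
    if (order.size() != dependencies.size()) {
      throw new IllegalStateException("Workflow contains a dependency cycle");
    }
    return order;
  }

  public static void main(String[] args) {
    WorkflowSketch workflow = new WorkflowSketch()
        .step("spark-load-from-hdfs")
        .step("sparksql-clean", "spark-load-from-hdfs")
        .step("spark-feature-extraction", "sparksql-clean")
        .step("submarine-write-algorithm", "spark-feature-extraction")
        .step("submarine-train-job", "submarine-write-algorithm")
        .step("publish-model", "submarine-train-job")
        .step("flink-predict", "publish-model")
        .step("flink-incremental-update", "publish-model");
    workflow.executionOrder().forEach(System.out::println);
  }
}
```

A real scheduler would also need to track the run status of each note and handle failures, which this sketch leaves out.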



Re: Zeppelin Community Meeting Notes 3/4/2019

2019-03-18 Thread Jeff Zhang
Thanks Mei for organizing this; it is very helpful for the Zeppelin community.

On Tue, Mar 19, 2019 at 12:53 AM, Mei Long wrote:

> March 04, 2019
>
> Recording 
>
> Moderator: Mei Long
>
> Note Taker: Mei Long
>
> - Kicking off community meetings (Mei Long)
>   - Objectives
>   - Format (30 minutes/monthly)
>     - Time zone - push it to a later time slot next time for Asia
>       (Jeff Zhang and the Korea team)
>   - Location
>     - Zoom
>     - Wiki
>     - Shared file storage
>   - Guidelines
>   - Other community activities
>   - Moon provides feedback
> - Apache Zeppelin releases (Moonsoo Lee)
>   - Stable release 0.9?
>   - Feedback from the community
>     - Felix
>     - Fred
>     - Paul
>   - Zeppelin 2.0?
> - Future project proposal and roadmap - Moon
> - Additional comments - Fred opened JIRA:
>   https://issues.apache.org/jira/browse/ZEPPELIN-4029
>
> Action Items
>
> Mei Long
>
> - Set up common public area for the following (subject to change)
>   - Community meeting guidelines (Apache Zeppelin wiki)
>   - Community calendar (gcalendar)
>   - Meeting agenda (gdoc)
>   - Perm meeting link (zoom: https://zoom.us/my/apachezeppelin)
>   - Meeting notes (dev@zeppelin.apache.org and us...@zeppelin.apache.org)
>   - Find a common time slot that works for everybody
>
> Notetaker volunteers always welcome!
>
> - People will take notes throughout the meeting
> - Feel free to fill your name in Note Taker: and collaborate
>
>
>

-- 
Best Regards

Jeff Zhang


Re: [discuss] Zeppelin support workflow

2019-03-18 Thread Mei Long
Very cool! @Xun Liu Would you like to talk about it at our next Apache
Zeppelin community meeting?

On Sat, Mar 16, 2019 at 1:00 PM Felix Cheung 
wrote:

> I like it!
>
> 
> From: Jongyoul Lee 
> Sent: Monday, March 11, 2019 9:05:03 PM
> To: dev
> Subject: Re: [discuss] Zeppelin support workflow
>
> Thanks for sharing this discussion.
>
> I'm interested in it and will take a look.
>
> On Mon, Mar 11, 2019 at 10:43 AM Xun Liu  wrote:
>
> > Hello, everyone
> >
> > Zeppelin has more than 20 interpreters, so data analysts can use it for
> > many kinds of data development, and much of that work is interdependent.
> > For example, developing machine learning algorithms relies on Spark to
> > preprocess the data.
> >
> > Zeppelin should have built-in workflow capabilities instead of relying on
> > external software to schedule notes, for the following reasons:
> >
> > 1. We have moved from the data processing era to the algorithm era. With
> > its own workflow, Zeppelin will offer a complete ecosystem that covers
> > both data processing and algorithm operations.
> > 2. Zeppelin's powerful interactive processing helps algorithm engineers
> > work more productively. Zeppelin should give the algorithm engineer more
> > direct control, instead of handing the algorithm to other teams (or
> > software) to run the workflow.
> > 3. Zeppelin knows more about the processing status of data than Azkaban
> > and Airflow, so a built-in workflow will have better performance, user
> > experience, and control.
> >
> > Typical use case
> > Workflows matter especially in machine learning, because machine learning
> > tasks generally run for a long time.
> > A typical example is as follows:
> > 1) Obtain data from HDFS through Spark;
> > 2) Clean and convert the data through Spark SQL;
> > 3) Extract features from the data through Spark;
> > 4) Write the TensorFlow algorithm through Hadoop Submarine;
> > 5) Distribute the TensorFlow algorithm as a job to YARN or k8s for batch
> > processing;
> > 6) Publish the model obtained from training and provide online prediction
> > services;
> > 7) Run model prediction with Flink;
> > 8) Receive incremental data through Flink to update the model
> > incrementally.
> > Zeppelin therefore especially needs the ability to arrange workflows.
> >
> > I have completed a draft of the Zeppelin workflow system design. Please
> > review it; you can modify the document directly or add comments.
> >
> > JIRA: https://issues.apache.org/jira/browse/ZEPPELIN-4018
> > gdoc:
> > https://docs.google.com/document/d/1pQjVifOC1knPBuw3LVvby7GyNDXaeBq1ltRg6x4vDxM/edit
> >
> > :-)
> >
> > Xun Liu
> > 2019-03-11
>
>
>
> --
> 이종열, Jongyoul Lee, 李宗烈
> http://madeng.net
>


Zeppelin Community Meeting Notes 3/4/2019

2019-03-18 Thread Mei Long
March 04, 2019

Recording 

Moderator: Mei Long

Note Taker: Mei Long

   - Kicking off community meetings (Mei Long)
      - Objectives
      - Format (30 minutes/monthly)
         - Time zone - push it to a later time slot next time for Asia (Jeff
           Zhang and the Korea team)
      - Location
         - Zoom
         - Wiki
         - Shared file storage
      - Guidelines
      - Other community activities
      - Moon provides feedback
   - Apache Zeppelin releases (Moonsoo Lee)
      - Stable release 0.9?
      - Feedback from the community
         - Felix
         - Fred
         - Paul
      - Zeppelin 2.0?
   - Future project proposal and roadmap - Moon
   - Additional comments - Fred opened JIRA:
     https://issues.apache.org/jira/browse/ZEPPELIN-4029

Action Items

Mei Long

   - Set up common public area for the following (subject to change)
      - Community meeting guidelines (Apache Zeppelin wiki)
      - Community calendar (gcalendar)
      - Meeting agenda (gdoc)
      - Perm meeting link (zoom: https://zoom.us/my/apachezeppelin)
      - Meeting notes (dev@zeppelin.apache.org and us...@zeppelin.apache.org)
      - Find a common time slot that works for everybody

Notetaker volunteers always welcome!

   - People will take notes throughout the meeting
   - Feel free to fill your name in Note Taker: and collaborate


[jira] [Created] (ZEPPELIN-4079) SparkShims.getNoteId returns NPE

2019-03-18 Thread Chandana Kithalagama (JIRA)
Chandana Kithalagama created ZEPPELIN-4079:
--

         Summary: SparkShims.getNoteId returns NPE
             Key: ZEPPELIN-4079
             URL: https://issues.apache.org/jira/browse/ZEPPELIN-4079
         Project: Zeppelin
      Issue Type: Bug
      Components: zeppelin-interpreter
Affects Versions: 0.8.1
     Environment: zeppelin-0.8.1-bin-netinst running on local machine,
                  spark interpreter,
                  2 external dependencies added to the spark interpreter configs:
                   - org.apache.spark:spark-streaming-kafka-0-10_2.11:2.4.0
                   - com.typesafe.play:play-json_2.11:2.6.8
                  Connecting to Spark 2.4.0 running on local machine

                  $ java -version
                  java version "1.8.0_191"
                  Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
                  Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)
        Reporter: Chandana Kithalagama


When I run the following Scala program in a Zeppelin notebook, an NPE is shown
in the logs/zeppelin-interpreter-spark-.log file.
{code:java}
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

import org.apache.spark.streaming.kafka010._
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferBrokers
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.{Seconds, StreamingContext, Time}

import org.apache.kafka.common.serialization.StringDeserializer

import play.api.libs.json._

val PREFIX = "CK-LOG --> "

case class SenseData(hash: String, value: Float, updated: String)

/** Lazily instantiated singleton instance of SparkSession */
object SparkSessionSingleton {

  @transient private var instance: SparkSession = _

  def getInstance(sparkConf: SparkConf): SparkSession = {
    if (instance == null) {
      instance = SparkSession
        .builder
        .config(sparkConf)
        .getOrCreate()
    }
    instance
  }
}

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "sensor_data-2019",
  "auto.offset.reset" -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

val topics = Array("plugin_topic_536cc303-eb2f-4ff9-b546-d8c59b6c5466")

val streamingContext = new StreamingContext(sc, Seconds(60))
println(PREFIX + "streamContext created")

val stream = KafkaUtils.createDirectStream(
  streamingContext,
  PreferBrokers,
  Subscribe[String, String](topics, kafkaParams)
)
println(PREFIX + "DStream created")

// val msgs = stream.window(Seconds(10))
stream.map(
  record => {
    var json: JsValue = Json.parse(record.value)
    SenseData(
      json("b")("notification")("deviceId").as[String],
      json("b")("notification")("parameters")("temperature").as[Float],
      json("b")("notification")("timestamp").as[String])
  }
).foreachRDD( (rdd: RDD[SenseData], time: Time) => {
  println("--- New RDD with " + rdd.partitions.size + " partitions and " +
    rdd.count + " records")

  // Get the singleton instance of SparkSession
  val spark = SparkSessionSingleton.getInstance(rdd.sparkContext.getConf)
  import spark.implicits._

  // this is how to print the rdd in the
  // https://spark.apache.org/docs/latest/rdd-programming-guide.html#printing-elements-of-an-rdd
  // rdd.take(100).foreach(println)
  rdd.collect().foreach(println)

  // rdd.toDF() won't work without spark session and imported implicits
  // https://spark.apache.org/docs/latest/streaming-programming-guide.html#dataframe-and-sql-operations
  rdd.toDF().createOrReplaceTempView("sensedata")

  val senseDataDF =
    spark.sql("select value, updated from sensedata")
  println(s"= $time =")
  senseDataDF.show()
}
)

streamingContext.start()
{code}
Error:
{code:java}
ERROR [2019-03-18 17:02:00,021] ({spark-listener-group-shared} Logging.scala[logError]:91) - Listener threw an exception
java.lang.NullPointerException
    at org.apache.zeppelin.spark.SparkShims.getNoteId(SparkShims.java:96)
    at org.apache.zeppelin.spark.SparkShims.buildSparkJobUrl(SparkShims.java:117)
    at org.apache.zeppelin.spark.Spark2Shims$1.onJobStart(Spark2Shims.java:44)
    at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:37)
    at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
    at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:91)
    at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$super$postToAll(AsyncEventQueue.scala:92)
    at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply$mcJ$sp(AsyncEventQueue.scala:92)
    at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply(AsyncEventQueue.sc

[GitHub] [zeppelin] liuxunorg commented on issue #3331: [Zeppelin-4049] Hadoop Submarine (Machine Learning) interpreter

2019-03-18 Thread GitBox
liuxunorg commented on issue #3331: [Zeppelin-4049] Hadoop Submarine (Machine 
Learning) interpreter
URL: https://github.com/apache/zeppelin/pull/3331#issuecomment-473922936
 
 
   @zjffdu , I am improving the code based on the comments.
   1. Added TensorFlow installation to Travis CI.
   2. Added TensorFlow to the Submarine test case; any version of TensorFlow
      is supported.
   3. Added SubmarineInterpreterTest.
   4. Added PySubmarineInterpreterTest.
   
   CI passes: https://travis-ci.org/liuxunorg/zeppelin/builds/507793860
   Please help me review the code. Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [zeppelin] asfgit closed pull request #3334: [ZEPPELIN-4072] Fix CreateNote() throws Note existed exception interrupt travis-ci

2019-03-18 Thread GitBox
asfgit closed pull request #3334: [ZEPPELIN-4072] Fix CreateNote() throws Note 
existed exception interrupt travis-ci
URL: https://github.com/apache/zeppelin/pull/3334
 
 
   




[jira] [Created] (ZEPPELIN-4078) Ipython improvement

2019-03-18 Thread marc hurabielle (JIRA)
marc hurabielle created ZEPPELIN-4078:
-

         Summary: Ipython improvement
             Key: ZEPPELIN-4078
             URL: https://issues.apache.org/jira/browse/ZEPPELIN-4078
         Project: Zeppelin
      Issue Type: Bug
      Components: python-interpreter
Affects Versions: 0.8.1
     Environment: tested in linux / macos. Tested the ipython server directly.
        Reporter: marc hurabielle


The IPython interpreter / IPython server currently has multiple problems:

 * Concurrent execution and auto-completion can make a paragraph hang forever
(until the IPython server is restarted). This may be related to
[https://github.com/jupyter/jupyter_client/issues/429]
 * An out-of-memory condition in IPython makes a paragraph hang instead of
throwing an error.
 * High CPU usage. The loop that reads from the pub/sub should not poll the
pub/sub continuously; it needs to be debounced.

Overall, most of these bugs might also be related to a jupyter_client bug or to
incorrect usage of it.
However, these are the action items:

 * Synchronize auto-completion and paragraph execution for now
 * Check the kernel status when we are checking for the thread
 * Sleep from time to time when there is no message in the pub/sub
 * Use only one queue instead of 3?
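
To make the action items above more concrete, here is a rough sketch of how they could fit together. It is not the Zeppelin or jupyter_client code: the names MessagePump, KernelDied, is_alive and the "idle" marker are all hypothetical, and only the Python standard library is used. It assumes one lock shared by execution and completion requests, one message queue, a short sleep when the queue is empty, and a kernel liveness check while waiting.

```python
import queue
import threading
import time


class KernelDied(RuntimeError):
    """Raised when the (hypothetical) kernel liveness check fails."""


class MessagePump:
    """Illustrative only: these names do not come from Zeppelin or jupyter_client."""

    def __init__(self, is_alive, idle_sleep=0.05):
        self._is_alive = is_alive        # callable, True while the kernel is running
        self._idle_sleep = idle_sleep    # pause used when no message is pending
        self._messages = queue.Queue()   # a single queue instead of three
        self._lock = threading.Lock()    # serializes execution and auto-completion

    def publish(self, msg):
        """Called by whatever receives kernel output (assumed, not shown)."""
        self._messages.put(msg)

    def request(self, send_request, done_marker="idle"):
        """Send one request and collect its messages until done_marker arrives."""
        with self._lock:
            send_request()
            collected = []
            while True:
                try:
                    msg = self._messages.get_nowait()
                except queue.Empty:
                    # Nothing pending: fail fast if the kernel is gone,
                    # otherwise sleep instead of spinning on the queue.
                    if not self._is_alive():
                        raise KernelDied("kernel stopped before replying")
                    time.sleep(self._idle_sleep)
                    continue
                if msg == done_marker:
                    return collected
                collected.append(msg)
```

With this shape, a paragraph run and an auto-complete call would both go through request(), so they cannot interleave, and a dead kernel (for example after an out-of-memory kill) surfaces as an error instead of a hang.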



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)