Interesting: the Hive interpreter is now working. I found a few "gotchas"
along the way. When I created a separate conf directory for each user and
pointed Zeppelin at it, Zeppelin still looked in $ZEPPELIN_HOME/conf, i.e.
my interpreters.json was read from there. When I renamed
$ZEPPELIN_HOME/conf, Zeppelin wouldn't start and gave me errors. That's a
separate issue, but one worth noting.

Back to the Hive interpreter. One question I have is about the timing of
interpreter startup. From what I can gather, when an interpreter is
called, Zeppelin checks whether it's started; if not, it starts it and
then tries to connect to it. In some cases it looks like Zeppelin may be
trying to connect to the interpreter BEFORE it's actually ready to accept
connections. (I don't know this for sure; I am just observing.) Basically,
I see the connection attempt happen and fail, and then some time later the
interpreter process is running.
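
For what it's worth, the timestamps in the logs quoted further down in this
thread (from the earlier failing run) can be compared directly. Below is a
minimal sketch that does so; it assumes the Zeppelin server log and the
interpreter log share a clock, which should hold since both were written
inside the same container. The helper names are my own, not anything from
Zeppelin.

```python
from datetime import datetime

# Timestamps copied from the log lines quoted further down in this thread.
LAUNCHED = "2015-06-16 13:06:50,966"        # "Run interpreter process ... -p 36365"
FIRST_ERROR = "2015-06-16 13:06:56,023"     # "Can't get status information"
SERVER_STARTED = "2015-06-16 13:07:07,150"  # "Starting remote interpreter server"


def parse(stamp):
    """Parse a log timestamp whose milliseconds follow a comma."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")


def gap_seconds(earlier, later):
    """Return the number of seconds between two log timestamps."""
    return (parse(later) - parse(earlier)).total_seconds()


# Launch -> first "Connection refused" error in the server log.
print(gap_seconds(LAUNCHED, FIRST_ERROR))
# Launch -> interpreter reporting that its server is starting.
print(gap_seconds(LAUNCHED, SERVER_STARTED))
```

By those numbers the error was logged roughly 5 seconds after the launch,
while the interpreter only reported starting its server roughly 16 seconds
after the launch, which at least fits the "connecting too early" theory.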

When I try to run the interpreter again (after it's instantiated), I get an
error in the interpreter log file saying the interpreter instance was not
found (not sure why that is; see the stack trace below).

All of this seems to come back to timing, so I guess I am asking those who
know more than me: is there a delay between starting the interpreter and
connecting to it? Does Zeppelin try again if the interpreter isn't there
yet? Is there a timeout? Or am I barking up the wrong tree with my
timing/order-of-operations questions?
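
To make the question concrete, here is a minimal sketch of the kind of
"wait until the port accepts connections" loop I have in mind. It is NOT
Zeppelin's actual logic (I don't know what that is); wait_for_port and its
parameters are hypothetical names of my own. The 0.5 second default
interval just mirrors the spacing of the SYNs in the tcpdump from my
earlier mail.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll host:port until a TCP connect succeeds or `timeout` seconds pass.

    Returns True once a connection is accepted, False if the deadline expires.
    Hypothetical helper for illustration; not part of Zeppelin.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A refused connection (RST) raises ConnectionRefusedError,
            # which is a subclass of OSError.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # Example: wait up to 30 seconds for the interpreter port from my logs.
    print(wait_for_port("127.0.0.1", 36365))
```

If something like this ran between launching interpreter.sh and opening
the thrift client, a slow interpreter start would show up as a delay
rather than as "Connection refused".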



org.apache.thrift.TException: org.apache.zeppelin.interpreter.InterpreterException: Interpreter instance org.apache.zeppelin.spark.PySparkInterpreter not found
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.getInterpreter(RemoteInterpreterServer.java:168)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.getFormType(RemoteInterpreterServer.java:305)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getFormType.getResult(RemoteInterpreterService.java:959)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getFormType.getResult(RemoteInterpreterService.java:944)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.zeppelin.interpreter.InterpreterException: Interpreter instance org.apache.zeppelin.spark.PySparkInterpreter not found

On Tue, Jun 16, 2015 at 9:19 AM, John Omernik <j...@omernik.com> wrote:

> As a follow-up: with the Zeppelin installation mounted read/write, I get
> the same error messages, so I don't think that is what's causing my issue.
>
> On Tue, Jun 16, 2015 at 9:10 AM, John Omernik <j...@omernik.com> wrote:
>
>> Thanks Brian:
>>
>> Here's how I have it set up. The "install directory" is NFS mounted
>> read-only. I did this on purpose, as I was hoping to share it. I am
>> using ZEPPELIN_HOME (RO), ZEPPELIN_CONF_DIR (NFS mounted, per-user
>> read/write), ZEPPELIN_LOG_DIR (set to a local directory in the container,
>> /tmp), ZEPPELIN_PID_DIR (set to a local dir in the container, /tmp), and
>> ZEPPELIN_NOTEBOOK_DIR (NFS mounted read/write).
>>
>> Basically those locations are connected to the container at run time and
>> there are proper permissions there. In the container, I add a user with
>> the same UID as the user on the NFS system and run the zeppelin process
>> in the container as that user, so that should be good too. Other than
>> ./conf and the log/pid stuff, are there any other directories that
>> require read/write access? I can try to run them with read/write rather
>> than read-only and see what happens (I'll do that next).
>>
>> On Tue, Jun 16, 2015 at 8:40 AM, Brian McDevitt <
>> brian.mcdev...@nerdery.com> wrote:
>>
>>> I'd first check that the user that's running zeppelin has ownership of
>>> the zeppelin installation and has appropriate rights to read any additional
>>> files you might need.
>>>
>>> Hope that helps,
>>> Brian
>>>
>>> Thanks,
>>> Brian McDevitt
>>> Software Engineer
>>> The Nerdery
>>>
>>> On Tue, Jun 16, 2015 at 8:25 AM, John Omernik <j...@omernik.com> wrote:
>>>
>>>> Hey all, I am running into an interesting problem and I think I am
>>>> getting to the end of my ability to troubleshoot so I thought I'd list
>>>> things out here and see if anyone has more ideas for next steps in
>>>> troubleshooting.
>>>>
>>>> I am running a Docker container I built in Mesos. I can get things up
>>>> and running, and things seem happy and healthy until I try to run a
>>>> command with an interpreter. At that point I get strange errors about
>>>> refused connections. I put the errors below (from the notebook, the log
>>>> files, and stderr) for clarity, but the basic thing I saw was
>>>> "connection refused". So I ran tcpdump in the container to troubleshoot
>>>> what was happening (tcpdump output below too). It looks like it's trying
>>>> to connect to localhost:36365, which is the port the interpreter was
>>>> started on, but after the initial SYN it gets a RST/ACK. I've validated
>>>> in netstat that the port IS listening on all interfaces, so I am not
>>>> sure why it's answering with a RST/ACK.
>>>>
>>>> One hunch is around the hostname the interpreter is listening on. The
>>>> hostname I connect to in the web UI is zeppelin.marathon.mesos (I am
>>>> using mesos dns and haproxy-bridge), so perhaps that is causing the
>>>> thrift server to deny something. That said, it's connecting to
>>>> localhost, and it's not even getting to the app level (just SYN ->
>>>> RST/ACK), so I am not sure how or why that would be occurring.
>>>>
>>>> I guess based on what I have seen, this SHOULD work, i.e. even though
>>>> I've only exposed the UI and the web sockets port to the client, the
>>>> Docker container should be able to connect locally to any newly opened
>>>> ports. The interpreter is starting fine, so I guess: are there any
>>>> other steps I should take to troubleshoot?
>>>>
>>>> Thanks
>>>>
>>>> John
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Only thing in hive interpreter log:
>>>>
>>>>  INFO [2015-06-16 13:07:07,150] ({Thread-0}
>>>> RemoteInterpreterServer.java[run]:95) - Starting remote interpreter server
>>>> on port 36365
>>>>
>>>>
>>>> tcpdump from container:
>>>>
>>>> 13:06:50.967696 IP 127.0.0.1.38133 > 127.0.0.1.36365: Flags [S], seq
>>>> 340951329, win 65535, options [mss 65495,sackOK,TS val 300975824 ecr
>>>> 0,nop,wscale 7], length 0
>>>>
>>>> .R.!.........0.........
>>>>
>>>> ............
>>>>
>>>> 13:06:50.967716 IP 127.0.0.1.36365 > 127.0.0.1.38133: Flags [R.], seq
>>>> 0, ack 340951330, win 0, length 0
>>>>
>>>> .......R."P....V.....
>>>>
>>>> 13:06:51.468191 IP 127.0.0.1.38137 > 127.0.0.1.36365: Flags [S], seq
>>>> 3821372812, win 65535, options [mss 65495,sackOK,TS val 300975949 ecr
>>>> 0,nop,wscale 7], length 0
>>>>
>>>> .............0.........
>>>>
>>>> ...M........
>>>>
>>>> 13:06:51.468216 IP 127.0.0.1.36365 > 127.0.0.1.38137: Flags [R.], seq
>>>> 0, ack 3821372813, win 0, length 0
>>>>
>>>> ..........P...%t.....
>>>>
>>>> 13:06:51.968677 IP 127.0.0.1.38142 > 127.0.0.1.36365: Flags [S], seq
>>>> 2630719687, win 65535, options [mss 65495,sackOK,TS val 300976074 ecr
>>>> 0,nop,wscale 7], length 0
>>>>
>>>> .............0.........
>>>>
>>>> ............
>>>>
>>>> 13:06:51.968693 IP 127.0.0.1.36365 > 127.0.0.1.38142: Flags [R.], seq
>>>> 0, ack 2630719688, win 0, length 0
>>>>
>>>> ..........P...Y,.....
>>>>
>>>> 13:06:52.469035 IP 127.0.0.1.38146 > 127.0.0.1.36365: Flags [S], seq
>>>> 976891692, win 65535, options [mss 65495,sackOK,TS val 300976199 ecr
>>>> 0,nop,wscale 7], length 0
>>>>
>>>> ::/,.........0.........
>>>>
>>>> ...G........
>>>>
>>>> 13:06:52.469052 IP 127.0.0.1.36365 > 127.0.0.1.38146: Flags [R.], seq
>>>> 0, ack 976891693, win 0, length 0
>>>>
>>>> ......::/-P...%W.....
>>>>
>>>> Error in Logs:
>>>>
>>>>  INFO [2015-06-16 13:06:50,953] ({pool-1-thread-2}
>>>> SchedulerFactory.java[jobStarted]:132) - Job
>>>> paragraph_1434047295030_-1730740540 started by scheduler
>>>> remoteinterpreter_236878590
>>>>
>>>>  INFO [2015-06-16 13:06:50,954] ({pool-1-thread-2}
>>>> Paragraph.java[jobRun]:194) - run paragraph 20150611-132815_546208121 using
>>>> hive org.apache.zeppelin.interpreter.LazyOpenInterpreter@12aa010c
>>>>
>>>>  INFO [2015-06-16 13:06:50,966] ({pool-1-thread-2}
>>>> RemoteInterpreterProcess.java[reference]:107) - Run interpreter process
>>>> /zeppelin/bin/interpreter.sh -d /zeppelin/interpreter/hive -p 36365
>>>>
>>>> ERROR [2015-06-16 13:06:56,023] ({Thread-35}
>>>> RemoteScheduler.java[getStatus]:226) - Can't get status information
>>>>
>>>> org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>>>>    at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:53)
>>>>    at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
>>>>    at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
>>>>    at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
>>>>    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
>>>>    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
>>>>    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:138)
>>>>    at org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller.getStatus(RemoteScheduler.java:224)
>>>>    at org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller.run(RemoteScheduler.java:183)
>>>> Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>>>>    at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
>>>>    at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
>>>>    ... 8 more
>>>>
>>>>
>>>>
>>>>
>>>> Error in Std Err:
>>>>
>>>> org.apache.zeppelin.interpreter.InterpreterException: org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>>>>    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:135)
>>>>    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:249)
>>>>    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:104)
>>>>    at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:202)
>>>>    at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>>>    at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:296)
>>>>    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>>>>    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>>>>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>    at java.lang.Thread.run(Thread.java:745)
>>>> Caused by: org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>>>>    at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:53)
>>>>    at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
>>>>    at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
>>>>    at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
>>>>    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
>>>>    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
>>>>    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:138)
>>>>    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:133)
>>>>    ... 12 more
>>>> Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>>>>    at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
>>>>    at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
>>>>    ... 19 more
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>    at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>>    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>>>    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
>>>>    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>>>    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>>    at java.net.Socket.connect(Socket.java:579)
>>>>    at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
>>>>    ... 20 more
>>>>
>>>> Error in the Notebook:
>>>>
>>>>
>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:135)
>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:249)
>>>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:104)
>>>> org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:202)
>>>> org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>>> org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:296)
>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>> java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>> java.lang.Thread.run(Thread.java:745)
>>>>
>>>
>>>
>>
>
