It's a global decision for our SMACK stack platform, but maybe we will use
Docker only for devops applications (clients of Spark). For Zeppelin
I don't see the need (no devops).
On 13 Apr 2016 4:05 PM, "John Omernik" wrote:
Is this a general Docker decision or a Zeppelin-on-Docker decision? I am
curious about the amount of network traffic Zeppelin actually generates. I
could be wrong, but I made the assumption that most of the network traffic
with Zeppelin is results from the various endpoints (Spark, JDBC, Elastic
Sea
We decided not to use Docker because of network performance in production
flows, not because of deployment. Virtualization of the network brings a 50%
decrease in performance. That may change with Calico, because it abstracts
the network with routing rather than virtualizing it like Flannel.
On 12 Apr 2016 2:22 PM, "John Omernik" wrote:
On 2, I had some thoughts there. How "expensive" would it be for
Zeppelin to run a timer of sorts that can be accessed via a specific URL?
Basically, this URL would return the idle time. The thing that knows best
whether Zeppelin has activity is Zeppelin. So, any actions within Zeppelin
would reset
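The idle timer described in this message could be sketched roughly as follows. This is only an illustration of the idea, not anything Zeppelin provides: the names IdleTracker, touch, idle_seconds and should_shut_down are hypothetical, and the status URL itself is left out.

```python
import time


class IdleTracker:
    """Tracks how long it has been since the last recorded activity."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the tracker can be tested without sleeping.
        self._clock = clock
        self._last_activity = clock()

    def touch(self):
        """Call on any user action to reset the idle timer."""
        self._last_activity = self._clock()

    def idle_seconds(self):
        """Value that a status URL could return to an external watcher."""
        return self._clock() - self._last_activity


def should_shut_down(idle_seconds, limit_seconds):
    """Decision an external scheduler could make from the reported idle time."""
    return idle_seconds >= limit_seconds
```

An external watcher (for example, whatever launched the instance) would poll the idle time and stop the instance once the limit is reached.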
Vincent -
On 1, I am curious about the Docker/network performance issues. We are
running, granted, some fat pipes on our cluster between nodes, and our
Docker registry is actually on the cluster too (backed by MapR FS on all
nodes). Most launches of Zeppelin take under 20 seconds for us, because the
ru
1. I am using Ansible to deploy Zeppelin on all slaves and to launch a
Zeppelin instance for one user. So if the Zeppelin binaries are already
deployed, the launch is very quick through Marathon (1 or 2 sec). Looking
for a velocity solution (based on JFrog) on Mesos to manage binaries and
artifacts with ver
Thanks John for your insights.
For 2., one solution we have experimented with is Spark dynamic resource
allocation. We could define a timer to scale down. Hope that helps.
J.
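A minimal sketch of the Spark dynamic allocation settings mentioned above. The property names are standard Spark configuration keys; the helper names and the default values (60 s idle timeout, 0 to 10 executors) are assumptions chosen for illustration, not values from this thread.

```python
def dynamic_allocation_conf(idle_timeout_s=60, min_executors=0, max_executors=10):
    """Return Spark properties that let idle executors be released."""
    return {
        "spark.dynamicAllocation.enabled": "true",
        # The external shuffle service is normally needed so that executors
        # can be removed without losing their shuffle files.
        "spark.shuffle.service.enabled": "true",
        # Executors idle for longer than this are released (the "timer").
        "spark.dynamicAllocation.executorIdleTimeout": f"{idle_timeout_s}s",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }


def to_spark_defaults(conf):
    """Render the properties as spark-defaults.conf lines."""
    return "\n".join(f"{key} {value}" for key, value in conf.items())


if __name__ == "__main__":
    print(to_spark_defaults(dynamic_allocation_conf(idle_timeout_s=120)))
```

With this approach the Zeppelin instance and its SparkContext stay up, but the cluster resources held by idle executors are given back.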
On Mon, Apr 11, 2016 at 4:24 PM, John Omernik wrote:
1. Things launch pretty fast for me, however, it depends if the docker
container I am running Zeppelin in is cached on the node mesos wants to run
it on. If not, it pulls from a local docker registry, so worst case, up to
a minute to get things running if the image isn't cached.
2. No, if the user
John & Vincent, I am interested in the per instance per user approach. I
have some questions about this approach:
--
1. How long will it take to launch a Zeppelin instance (and initialize
SparkContext) when a user logs in?
2. Will the instance be destroyed when the user logs out? If not, how do you
deal wi
Thanks Vincent and John, for providing these viable options.
On Fri, Apr 8, 2016 at 10:39 PM, John Omernik wrote:
So for us, we are doing something similar to Vincent, however, instead of
Gluster, we are using MapR-FS and the NFS mount. Basically, this gives us a
shared filesystem that is running on all nodes, with strong security
(filesystem ACEs for fine-grained permissions), built-in auditing, POSIX
complian
We have been using it for 3 months without any incident.
On 8 Apr 2016 9:09 AM, "ashish rawat" wrote:
Sounds great. How long have you been using GlusterFS in prod, and have you
encountered any challenges? The only difficulty for me in using it would be
a lack of expertise to fix broken things, so I hope its stability isn't
something to be concerned about.
Regards,
Ashish
On Fri, Apr 8, 2016 at 12:2
Use the FUSE interface. The Gluster volume is directly accessible as local
storage on all nodes, but performance is only 200 Mb/s, which is more than
enough for notebooks. For data, prefer Tachyon/Alluxio on top of Gluster...
On 8 Apr 2016 6:35 AM, "ashish rawat" wrote:
Thanks Eran and Vincent.
Eran, I would definitely like to try it out, since it won't add to the
complexity of my deployment. I will look at the S3 implementation to figure
out how complex it would be.
Vincent,
I haven't explored GlusterFS at all. Would it also require writing an
implementation of sto
For 1, Marathon on Mesos restarts the Zeppelin daemon in case of failure.
For 2, a GlusterFS FUSE mount allows sharing notebooks on all Mesos nodes.
For 3, this is not available right now in our design, but a manual restart
in the Zeppelin config page is acceptable for us.
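Points 1 and 2 could look roughly like the Marathon app definition below, built here as a Python dict and dumped to JSON. This is a sketch under assumptions, not the configuration used in this thread: the install path /opt/zeppelin, the mount point /mnt/gluster/notebooks, the resource sizes and the health-check timings are all hypothetical. It also assumes the foreground launcher bin/zeppelin.sh rather than zeppelin-daemon.sh, since Marathon supervises the process it starts.

```python
import json


def zeppelin_marathon_app(user, zeppelin_home="/opt/zeppelin",
                          notebook_root="/mnt/gluster/notebooks"):
    """Build a Marathon app definition for one per-user Zeppelin instance."""
    return {
        "id": f"/zeppelin/{user}",
        # Marathon assigns a port and exposes it to the task as $PORT0.
        "cmd": f"ZEPPELIN_PORT=$PORT0 {zeppelin_home}/bin/zeppelin.sh",
        "cpus": 1.0,
        "mem": 2048,
        # One instance; Marathon relaunches it if the task fails.
        "instances": 1,
        "ports": [0],
        # Notebooks live on the shared mount so any node can serve them.
        "env": {"ZEPPELIN_NOTEBOOK_DIR": f"{notebook_root}/{user}"},
        "healthChecks": [{
            "protocol": "HTTP",
            "path": "/",
            "portIndex": 0,
            "gracePeriodSeconds": 120,
            "intervalSeconds": 30,
            "maxConsecutiveFailures": 3,
        }],
    }


def to_json(app):
    """Serialize the definition for submission to Marathon."""
    return json.dumps(app, indent=2)


if __name__ == "__main__":
    print(to_json(zeppelin_marathon_app("alice")))
```

Because the notebook directory is on the shared filesystem, a restarted instance on a different node sees the same notebooks.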
On 6 Apr 2016 8:18 AM, "Eran Witkon" wrote:
Yes this is correct.
For HA disk, if you don't have HA storage and have no access to S3, then
AFAIK you don't have another option at the moment.
If you would like to save notebooks to Elastic, then I suggest you look at
the storage interface and its implementations for git and S3 and implement that
yourself. It does s
Thanks Eran. So 3 seems to be something external to Zeppelin, and
hopefully 1 only means running "zeppelin-daemon.sh start" on a slave
machine when the master becomes inaccessible. Is that correct?
My main concern still remains on the storage front. And I don't really have
high availability disks or
For 1, you need to have both Zeppelin web HA and Zeppelin daemon HA.
For 2, I guess you can use HDFS if you implement the storage interface for
HDFS, but I am not sure.
For 3, I mean that if you connect to an external cluster, for example a Spark
cluster, you need to make sure your Spark cluster is HA. O
Thanks Eran for your reply.
For 1) I am assuming that it would be similar to HA of any other web
application, i.e. running multiple instances and switching to the backup
server when the master is down. Is that not the case?
For 2) Is it also possible to save it on HDFS?
Can you please explain 3, are you ref
I would say you need to account for these things:
1) availability of the Zeppelin daemon
2) availability of the notebook files
3) availability of the interpreters used.
For 1, I don't know of an out-of-the-box solution.
For 2, any HA storage will do: S3 or any HA externally mounted disk.
For 3 it is up to the
Hi,
Is there a suggested architecture to run Zeppelin in high availability
mode? The only option I could find was saving notebooks to S3. Are there
any options if one is not using AWS?
Regards,
Ashish