> "Concerning #2, Zeppelin's backend has a lot of map and the map has
mutable states. Thus we should also think of read-write-lock."

Agreed. ConcurrentHashMap handles this by partitioning its internal table
(lock-striped segments in Java 7, per-bucket locking in Java 8), and reads
generally do not block at all. Two concurrent writes that land in different
partitions therefore do not contend for the same lock.
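
To make that concrete, here is a minimal sketch, assuming Java 8. The class
NoteCounterRegistry and the key "note-1" are made up for illustration and are
not actual Zeppelin code. It shows a per-key atomic update replacing a
synchronized block around a plain HashMap:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical registry for illustration only, not an actual Zeppelin class.
class NoteCounterRegistry {
    private final ConcurrentMap<String, Integer> runsPerNote = new ConcurrentHashMap<>();

    // merge() is atomic per key on ConcurrentHashMap (Java 8+), so no
    // external synchronized block is needed for this read-modify-write.
    void recordRun(String noteId) {
        runsPerNote.merge(noteId, 1, Integer::sum);
    }

    int runsFor(String noteId) {
        return runsPerNote.getOrDefault(noteId, 0);
    }

    // Hammers a single key from 4 threads and returns the final count.
    static int demo() {
        NoteCounterRegistry registry = new NoteCounterRegistry();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 1000; i++) {
            pool.execute(() -> registry.recordRun("note-1"));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return registry.runsFor("note-1");
    }
}
```

Note that only single-key operations are atomic here. If a piece of code needs
to update several entries consistently, a read-write lock may still be the
right tool.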

For lists, CopyOnWriteArrayList is expensive on writes because it copies
the whole backing array on every mutation. That implementation is designed
for workloads where reads vastly outnumber writes.
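
A small sketch of that trade-off. ListenerRegistry and its methods are
hypothetical names for illustration, not Zeppelin classes. Iteration runs on
a snapshot of the array, so a write during iteration is safe, but every write
pays for a full copy:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical listener registry for illustration only.
class ListenerRegistry {
    private final List<String> listeners = new CopyOnWriteArrayList<>();

    // Each add() copies the backing array: acceptable only if writes are rare.
    void register(String listener) {
        listeners.add(listener);
    }

    // The for-each loop iterates over a snapshot taken when the iterator was
    // created, so adding to the list inside the loop does not throw
    // ConcurrentModificationException and the new element is not visited.
    List<String> notifyAllAndRegister(String lateListener) {
        List<String> notified = new ArrayList<>();
        for (String listener : listeners) {
            notified.add(listener);
            if (!listeners.contains(lateListener)) {
                listeners.add(lateListener);
            }
        }
        return notified;
    }

    int size() {
        return listeners.size();
    }
}
```

The same loop over a plain ArrayList would fail with a
ConcurrentModificationException, which is usually why the synchronized block
was there in the first place.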

And so on. We need to check case by case which implementation fits best.

> "Personally, moving to Java8 is very attractive but we divide it with
others because upgrading to the version of JDK influences users'
experiences."

Upgrading to Java 8 would only require users to upgrade the JDK on the
server side. What is the impact on user experience?

On Wed, Oct 5, 2016 at 8:08 AM, Jongyoul Lee <jongy...@gmail.com> wrote:

> Thanks for starting this thread.
>
> Concerning #2, Zeppelin's backend has a lot of map and the map has mutable
> states. Thus we should also think of read-write-lock.
>
> Personally, moving to Java8 is very attractive but we divide it with others
> because upgrading to the version of JDK influences users' experiences.
>
> On Wed, Oct 5, 2016 at 11:28 AM, Anthony Corbacho <
> anthonycorba...@apache.org> wrote:
>
> > I think about the abuse of @Inject and circular deps, it is just matter
> of
> > education.
> >
> > On Tue, Oct 4, 2016 at 8:05 PM, DuyHai Doan <doanduy...@gmail.com>
> wrote:
> >
> > > About DI I have no strong opinion on the topic.
> > >
> > > > I have coded frameworks with just manual DI (through constructor and
> > > > context objects) and it works pretty well, even for a big project, as
> > > > long as the context objects have meaningful names
> > >
> > > Using DI frameworks like Spring or Guice is also a valid choice,
> > especially
> > > for a backend. The only thing to be really cautious about are circular
> > > dependencies. Using @Inject is very easy and people tend to abuse it
> > > everywhere and end up with horrible cyclic dependencies
> > >
> > >
> > >
> > > On Tue, Oct 4, 2016 at 12:54 PM, Anthony Corbacho <
> > > anthonycorba...@apache.org> wrote:
> > >
> > > > You made my day, this is the kind of email i really like !!
> > > >
> > > > I think its a great idea and i am willing to spend sometime on it.
> > > >
> > > > I also want to move to a DI (guice) architecture , let me know what
> you
> > > > think about it.
> > > >
> > > > On Tuesday, 4 October 2016, DuyHai Doan <doanduy...@gmail.com>
> wrote:
> > > >
> > > > > Hello devs
> > > > >
> > > > > The code base of Zeppelin has grown very fast in the last 12 months
> > and
> > > > > it's great. It means that we have more and more contributors.
> > > > >
> > > > > However, to make the project maintainable at long term, we need
> > regular
> > > > > code refactoring.
> > > > >
> > > > > I have some ideas to share with you
> > > > >
> > > > > 1) Use Java 8 to benefit from Lambda & streams.
> > > > >
> > > > >   Now that Java 8 is well established, it is a good time to upgrade
> > the
> > > > > project. I believe some interpreters also need Java 8. Cassandra
> > > > > interpreter right now does not have unit tests for the latest
> > features
> > > > > because the Embedded Cassandra server used for testing requires
> Java
> > 8.
> > > > >
> > > > >  It would also be a good opportunity to go through the code base
> and
> > > > > replace some boilerplate for() loop with manual filtering by the
> > stream
> > > > > shortcut :  list.stream().filter(..).map(). It would improve
> greatly
> > > > code
> > > > > readability
> > > > >
> > > > > 2) Multi threading
> > > > >
> > > > > >  I've seen synchronized blocks used in a few places in the code
> > > > > > base. Although perfectly valid, they have a runtime cost, and since
> > > > > > more and more people are asking for multi-tenancy or using a single
> > > > > > Zeppelin instance to serve multiple users, I guess the synchronized
> > > > > > blocks have a huge cost.
> > > > >
> > > > > There are some solid alternatives:
> > > > >
> > > > >  - ConcurrentHashMap if we synchronized on a map
> > > > >  - CopyOnWriteArrayList if we synchronized on a list.
> > > > >
> > > > > > Of course, each synchronized block should be handled carefully so as
> > > > > > not to introduce regressions
> > > > >
> > > > > 3) Thread management
> > > > >
> > > > > I've seen some usage of new Thread() {...}.run(); it may be a good
> > time
> > > > to
> > > > > introduce ThreadPool and pass them along (inside context objects
> for
> > > > > example) to have a more centralized thread management
> > > > >
> > > > > The advantage of having thread pool is that we can manage them in a
> > > > single
> > > > > place, monitor them and expose the info through JMX and also
> control
> > > > system
> > > > > resource by defining max thread number and thread pool queue
> > > > >
> > > > > 4) Server monitoring
> > > > > I hear many users on the field complain about the fact that they
> have
> > > to
> > > > > restart Zeppelin server regularly because it "hangs" after running
> a
> > > long
> > > > > time.
> > > > >
> > > > > If we can expose some system metrics through JMX, it would help
> > people
> > > > > monitor the state of Zeppelin server and take appropriate actions
> > > > >
> > > > > Right now we may only focus on monitoring the server itself, not
> the
> > > > > interpreter JVMs processes. It can be done in a 2nd step
> > > > >
> > > > >
> > > > > What do you think about the ideas ?
> > > > >
> > > >
> > >
> >
>
>
>
> --
> 이종열, Jongyoul Lee, 李宗烈
> http://madeng.net
>
