One more idea: once we merge AngularObjectRegistry with ResourcePool, it would be a good idea to expose some utility methods like 'getResource(xxx)', 'putResource(yyy)' and 'removeResource(zzz)' directly on the InterpreterContext object so that any interpreter can use them.
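A rough sketch of what those convenience methods could look like. Note this is purely illustrative: the merged AngularObjectRegistry/ResourcePool class does not exist yet, so the `ResourcePool` and `InterpreterContext` below are hypothetical stand-ins, not the real Zeppelin classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the merged AngularObjectRegistry/ResourcePool.
class ResourcePool {
    private final Map<String, Object> resources = new ConcurrentHashMap<>();

    Object get(String name) { return resources.get(name); }
    void put(String name, Object value) { resources.put(name, value); }
    Object remove(String name) { return resources.remove(name); }
}

// Hypothetical InterpreterContext exposing the proposed convenience methods,
// so any interpreter (Spark, Flink, Ignite, ...) can share resources
// without touching the pool directly.
class InterpreterContext {
    private final ResourcePool pool;

    InterpreterContext(ResourcePool pool) { this.pool = pool; }

    Object getResource(String name) { return pool.get(name); }
    void putResource(String name, Object value) { pool.put(name, value); }
    Object removeResource(String name) { return pool.remove(name); }
}

public class ResourcePoolSketch {
    public static void main(String[] args) {
        InterpreterContext ctx = new InterpreterContext(new ResourcePool());
        ctx.putResource("foo", "bar");              // e.g. set from a %flink paragraph
        System.out.println(ctx.getResource("foo")); // read back from a %spark paragraph
        ctx.removeResource("foo");
        System.out.println(ctx.getResource("foo")); // removed, so null
    }
}
```

In the real codebase these methods would simply delegate to whatever pool the InterpreterContext already carries, which is exactly why they are cheap to add once the two abstractions are merged.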
On Sat, Apr 23, 2016 at 9:59 AM, DuyHai Doan <[email protected]> wrote:

> "I'd like to see that Flink have access to the 'z' object."
>
> --> You're taking the problem from the wrong side.
>
> You need to access the 'z' object not for the object itself but to be able
> to call its functions, namely 'z.angular(xxx)', right?
>
> If you look at the source code, the AngularObjectRegistry is available
> from the InterpreterContext object itself, with a little bit of code, see here:
> https://github.com/apache/incubator-zeppelin/blob/master/spark/src/main/java/org/apache/zeppelin/spark/ZeppelinContext.java#L370-L384
>
> So basically, inside the Flink interpreter, you can call this piece of
> code as well and achieve the same goal.
>
> The 'z.angular()' method is merely syntactic sugar to simplify
> AngularObjectRegistry interaction.
>
> "But the Angular binds don't need to be Spark specific (e.g. living in
> the ZeppelinContext which requires a SparkContext as a constructor)."
>
> --> And it isn't Spark specific; it can be retrieved from the
> InterpreterContext itself.
>
> On Sat, Apr 23, 2016 at 12:27 AM, Trevor Grant <[email protected]> wrote:
>
>> First of all, awesome work on what you've done here. Appreciating it more
>> and more, the more I grok.
>>
>> Second of all, thanks for the Cassandra snippet. I realized we are talking
>> about slightly different things. You are talking about ${var}.
>>
>> I wanted something closer to this:
>>
>> %flink
>> import org.apache.zeppelin.interpreter.InterpreterContext
>> val resourcePool = InterpreterContext.get().getResourcePool()
>> resourcePool.put("foo", "bar")
>>
>> import org.apache.zeppelin.interpreter.InterpreterContext
>> resourcePool: org.apache.zeppelin.resource.ResourcePool =
>> org.apache.zeppelin.resource.DistributedResourcePool@21d07d88
>>
>> ----------------------------------
>> %spark
>> z.get("foo")
>>
>> res4: Object = bar
>>
>> ^^ This actually works, so I can move on with my day.
>>
>> Continuing the discussion:
>>
>> I'd like to see that Flink have access to the 'z' object. OR, if that is
>> deprecated, I hope to see something calling this out in your documentation
>> PR, e.g. using resource pools. I'm not a complete idiot, but it
>> took me some time to dig through code to figure this one out (and the comments
>> of this thread). I think variable passing is one of the coolest things of
>> a Zeppelin setup. People should be aware that it's a thing and how to do it.
>>
>> Re: Zeppelin being Spark-centric. I say that because the ZeppelinContext
>> is really wrapped up in the Spark interpreter and vice versa. For cripes
>> sake, a SparkContext is required for the constructor of the ZeppelinContext:
>> (This isn't related to your pull request / fine work)
>>
>> Currently it is something like this:
>>
>> class SparkInterpreter {
>>   // basic interpreter stuff
>>   // fancy interpreter fixes
>>   // special Zeppelin interpreter magic
>> }
>>
>> class ZeppelinContext( SparkContext ) {
>>   // all the binding / watching / other cool stuff
>> }
>>
>> class FlinkInterpreter {
>>   // basic interpreter stuff
>> }
>>
>> class IgniteInterpreter {
>>   // basic interpreter stuff, but not standardized, so patches and fixes
>>   // don't always work as expected, and now all interpreters have slightly
>>   // different implementations because they aren't homogenized.
>> }
>>
>> I propose something more like this:
>>
>> class ZeppelinIntp {
>>   // common resource pools
>>   // etc
>> }
>>
>> object ZeppelinIntp {
>>   // common resource pools
>> }
>>
>> class ScalaIntp {
>>   // everything for a well-oiled and highly functioning Scala interpreter
>> }
>>
>> object SparkScalaIntp extends ScalaIntp (sparkParams, ZeppelinIntp, ...) {
>>   // do Spark-specific things
>> }
>>
>> object FlinkScalaIntp extends ScalaIntp (flinkParams, ZeppelinIntp, ...) {
>>   // do Flink-specific things
>> }
>>
>> object IgniteScalaIntp extends ScalaIntp (igniteParams, ZeppelinIntp, ...) {
>>   // do Ignite-specific things
>> }
>>
>> Yea, I know this is a major refactor, but the problem is going to get worse
>> as time goes on.
>>
>> The ZeppelinContext-SparkContext pair may not be worth splitting out; those
>> two are really entangled, and for any conceivable case the most we would
>> want to pass back and forth can be handled by the resource pools. But the
>> Angular binds don't need to be Spark specific (e.g. living in the
>> ZeppelinContext, which requires a SparkContext as a constructor). If
>> anything, it would make more sense for those to live inside Flink because it
>> is true streaming, as opposed to Spark's mini-batching (which comes to the
>> scala-shell in v1.1).
>>
>> Also, I really believe the overarching classes that handle language
>> behavior and parsing ought to be off in their own modules.
>>
>> Possibly a thing for v0.7?
>>
>> Trevor Grant
>> Data Scientist
>> https://github.com/rawkintrevo
>> http://stackexchange.com/users/3002022/rawkintrevo
>> http://trevorgrant.org
>>
>> *"Fortunate is he, who is able to know the causes of things."
>> -Virgil*
>>
>> On Fri, Apr 22, 2016 at 4:37 PM, DuyHai Doan <[email protected]> wrote:
>>
>> > "Back to my original post, I essentially want to add Flink to that list"
>> >
>> > In that case, inside the Flink interpreter source code, every time the
>> > input parser encounters a ${variable} pattern, you have to access the
>> > AngularObjectRegistry and replace the template with the actual variable
>> > value.
>> >
>> > It is the responsibility of each interpreter to implement variable
>> > interpolation (${var}).
>> >
>> > I did it for the Cassandra interpreter using my own syntax ( {{var}} ):
>> >
>> > https://github.com/apache/incubator-zeppelin/blob/master/cassandra/src/main/scala/org/apache/zeppelin/cassandra/InterpreterLogic.scala#L306-L327
>> >
>> > "I was looking through your resourcePools. I am under the impression I can
>> > use those to pass a variable from one paragraph to another, in an awkward
>> > sort of fashion (but I may be going about it all wrong). Supposing that can
>> > be done (or possibly is already done, but I haven't read the PRs you
>> > listed carefully), it would solve what I want to do for the time being."
>> >
>> > I will create an epic to merge angular objects with resource pools to keep
>> > only one abstraction. But it doesn't solve the fundamental problem, which
>> > is that IF an interpreter wants to use variables stored in the resource
>> > pool, it HAS to implement it.
>> >
>> > The only way we can mutualise code for variable binding is to let the
>> > Zeppelin engine pre-process the input text block of each paragraph,
>> > perform the variable lookup from the resource pool and the variable
>> > replacement, and after that forward the text block to the interpreter
>> > itself.
>> >
>> > I think it is a good idea, but it would require some refactoring and may
>> > break existing behaviors if some interpreters have already implemented
>> > their own variable template handling.
>> >
>> > "2) If we want to keep the code base compact and clean, would it be wiser
>> > to refactor in a less Spark-centric way?"
>> >
>> > There is nothing Spark-centric here if we're talking about variable
>> > sharing; it applies to all interpreters.
>> >
>> > On Fri, Apr 22, 2016 at 11:24 PM, Trevor Grant <[email protected]> wrote:
>> >
>> > > If I'm reading https://issues.apache.org/jira/browse/ZEPPELIN-635
>> > > correctly, this integrates the Spark, Markdown, and shell interpreters.
>> > >
>> > > Back to my original post, I essentially want to add Flink to that list.
>> > >
>> > > To your point about keeping a small and manageable code base: under the
>> > > hood it seems like Zeppelin is a front end for Spark and, oh by the way,
>> > > here are some hacks to make other stuff work too. For instance, there is
>> > > a lot of code reuse in any Scala-based interpreter. Wouldn't it make more
>> > > sense to have a generic Scala interpreter and extend it for the special
>> > > quirks of each interpreter as needed, e.g. for the variable bindings of
>> > > the particular interpreter, and loading configurations? Consider the
>> > > companion object bug: essentially the same code had to be copied and
>> > > pasted across 4 interpreters, and the Ignite interpreter (as I recall)
>> > > never even got the fix because of a quirk in the way the tests are
>> > > written for that interpreter.
>> > >
>> > > I was looking through your resourcePools. I am under the impression I can
>> > > use those to pass a variable from one paragraph to another, in an awkward
>> > > sort of fashion (but I may be going about it all wrong).
>> > > Supposing that can
>> > > be done (or possibly is already done, but I haven't read the PRs you
>> > > listed carefully), it would solve what I want to do for the time being.
>> > >
>> > > Also consider the Python Flink I want to add to this; there will once
>> > > again be a lot of duplication of code from the Spark Python interpreter.
>> > > A generic Python interpreter also seems like a more reasonable approach
>> > > here.
>> > >
>> > > So basically I've broken this conversation into two parts:
>> > > 1) I'm trying to pass variables/objects back and forth between
>> > > Spark/Flink/Angular/etc. Please help. It seems possible, but I'm having a
>> > > slow time figuring it out.
>> > > 2) If we want to keep the code base compact and clean, would it be wiser
>> > > to refactor in a less Spark-centric way?
>> > >
>> > > Trevor Grant
>> > > Data Scientist
>> > > https://github.com/rawkintrevo
>> > > http://stackexchange.com/users/3002022/rawkintrevo
>> > > http://trevorgrant.org
>> > >
>> > > *"Fortunate is he, who is able to know the causes of things." -Virgil*
>> > >
>> > > On Fri, Apr 22, 2016 at 3:41 PM, DuyHai Doan <[email protected]> wrote:
>> > >
>> > > > In this case, it is already implemented.
>> > > >
>> > > > Look at those merged PRs:
>> > > >
>> > > > - https://github.com/apache/incubator-zeppelin/pull/739
>> > > > - https://github.com/apache/incubator-zeppelin/pull/740
>> > > > - https://github.com/apache/incubator-zeppelin/pull/741
>> > > > - https://github.com/apache/incubator-zeppelin/pull/742
>> > > > - https://github.com/apache/incubator-zeppelin/pull/744
>> > > > - https://github.com/apache/incubator-zeppelin/pull/745
>> > > > - https://github.com/apache/incubator-zeppelin/pull/832
>> > > >
>> > > > There is one last JIRA pending for documentation; I'll do a PR for this
>> > > > next week: https://issues.apache.org/jira/browse/ZEPPELIN-742
>> > > >
>> > > > On Fri, Apr 22, 2016 at 9:52 PM, Trevor Grant <[email protected]> wrote:
>> > > >
>> > > > > I want to be able to put/get/watch variables, specifically so I can
>> > > > > interface with AngularJS for visualizations.
>> > > > >
>> > > > > I've been grokking the codebase trying to find a less invasive way to
>> > > > > do this.
>> > > > >
>> > > > > I get wanting to keep the code base clean, but sharing variables is a
>> > > > > really nice feature set and shouldn't be that hard to implement?
>> > > > >
>> > > > > Thoughts?
>> > > > >
>> > > > > Trevor Grant
>> > > > > Data Scientist
>> > > > > https://github.com/rawkintrevo
>> > > > > http://stackexchange.com/users/3002022/rawkintrevo
>> > > > > http://trevorgrant.org
>> > > > >
>> > > > > *"Fortunate is he, who is able to know the causes of things." -Virgil*
>> > > > >
>> > > > > On Fri, Apr 22, 2016 at 1:06 PM, DuyHai Doan <[email protected]> wrote:
>> > > > >
>> > > > > > I think we should rather leave ZeppelinContext unmodified.
>> > > > > >
>> > > > > > If we update ZeppelinContext for every kind of interpreter, it
>> > > > > > would quickly become a behemoth and unmanageable.
>> > > > > >
>> > > > > > The reason ZeppelinContext has some support for Spark is
>> > > > > > historical. Now that the project is going to gain a wider
>> > > > > > audience, we should focus on keeping the code as clean and as
>> > > > > > modular as possible.
>> > > > > >
>> > > > > > Can you explain which feature you want to add to ZeppelinContext
>> > > > > > that will be useful for Flink?
>> > > > > >
>> > > > > > On Fri, Apr 22, 2016 at 7:12 PM, Trevor Grant <[email protected]> wrote:
>> > > > > >
>> > > > > > > If one were to extend the Zeppelin context for Flink, I was
>> > > > > > > thinking it would make the most sense to update
>> > > > > > >
>> > > > > > > ../spark/src/main/java/org/apache/zeppelin/spark/ZeppelinContext.java
>> > > > > > >
>> > > > > > > Any thoughts from those who are more familiar with that end of
>> > > > > > > the code base than I?
>> > > > > > >
>> > > > > > > Ideally we'd have a solution that extends the Zeppelin Context
>> > > > > > > to all interpreters. I know y'all love Spark, but there ARE
>> > > > > > > others out there...
>> > > > > > >
>> > > > > > > Anyone have any branches / previous attempts I could check out?
>> > > > > > >
>> > > > > > > tg
>> > > > > > >
>> > > > > > > Trevor Grant
>> > > > > > > Data Scientist
>> > > > > > > https://github.com/rawkintrevo
>> > > > > > > http://stackexchange.com/users/3002022/rawkintrevo
>> > > > > > > http://trevorgrant.org
>> > > > > > >
>> > > > > > > *"Fortunate is he, who is able to know the causes of things." -Virgil*
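The per-interpreter ${var} interpolation discussed in the thread (which the Cassandra interpreter implements with its own {{var}} syntax) is essentially a scan-and-replace pass over the paragraph text before execution. A minimal sketch of that idea, with a plain Map standing in for the AngularObjectRegistry/ResourcePool lookup; the class and method names here are hypothetical, not Zeppelin's actual API:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VariableInterpolation {
    // Matches ${variableName}; the real registry lookup is replaced here by a Map.
    private static final Pattern VAR = Pattern.compile("\\$\\{(\\w+)\\}");

    static String interpolate(String paragraphText, Map<String, Object> registry) {
        Matcher m = VAR.matcher(paragraphText);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            Object value = registry.get(m.group(1));
            // Leave the template untouched if the variable is not bound.
            String replacement = (value == null) ? m.group(0) : value.toString();
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> registry = Map.of("table", "users", "limit", 10);
        System.out.println(interpolate("SELECT * FROM ${table} LIMIT ${limit};", registry));
        // -> SELECT * FROM users LIMIT 10;
        System.out.println(interpolate("SELECT ${missing};", registry));
        // -> SELECT ${missing};  (unbound variables pass through unchanged)
    }
}
```

This also illustrates why centralizing the pass in the Zeppelin engine is attractive but risky, as noted above: a single pre-processing step would cover every interpreter, but it could collide with interpreters that already do their own template handling with different syntax.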
