Re: Helium Registry Gone?
Thanks for the updated file. In our version (0.9.0) it did not quite work, failing as follows:

Caused by: com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_OBJECT but was BEGIN_ARRAY at line 1 column 2 path $
    at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:226)
    at com.google.gson.Gson.fromJson(Gson.java:932)
    at com.google.gson.Gson.fromJson(Gson.java:897)
    at com.google.gson.Gson.fromJson(Gson.java:846)
    at com.google.gson.Gson.fromJson(Gson.java:817)
    at org.apache.zeppelin.helium.HeliumConf.fromJson(HeliumConf.java:104)
    at org.apache.zeppelin.helium.Helium.loadConf(Helium.java:151)
    at org.apache.zeppelin.helium.Helium.<init>(Helium.java:98)
    at org.apache.zeppelin.helium.Helium.<init>(Helium.java:74)

I then removed the [] around the file content, and now I get no errors, but no Helium charts either. Was there some change in the file format since 0.9?

On Tue, Oct 10, 2023 at 6:53 PM Jongyoul Lee wrote:

> Hello,
>
> It's too late but I would like to share the latest helium.json with you.
>
> I'll update URL and the docs as well.
>
> Best regards,
> Jongyoul
>
> On Fri, Sep 22, 2023 at 7:39 PM, Vladimir Prus wrote:
>
>> Thanks, looking forward!
>>
>> On Fri, Sep 22, 2023 at 1:16 PM Jongyoul Lee wrote:
>>
>>> Hello,
>>>
>>> You can install helium.json in your local.
>>>
>>> I have the latest one and will attach it soon. Then, for the future release, I’ll change the location to github repo or similar location.
>>>
>>> Best regards,
>>> Jongyoul
>>>
>>> On Fri, Sep 22, 2023 at 7:13 PM, Vladimir Prus wrote:
>>>
>>>> Hi,
>>>>
>>>> Zeppelin used to fetch Helium visualization plugins from the following URL
>>>>
>>>> https://s3.amazonaws.com/helium-package/helium.json
>>>>
>>>> As of now, that URL returns "All access to this object has been disabled" error, and I don't see any alternative URLs in the current code.
>>>>
>>>> We do enjoy some of the Helium visualizations, such as heatmap. Any suggestions how to get them back in the most future-proof way?
>>>>
>>>> --
>>>> Vladimir Prus
>>>> http://vladimirprus.com
>>
>> --
>> Vladimir Prus
>> http://vladimirprus.com
>
> --
> 이종열, Jongyoul Lee, 李宗烈
> http://madeng.net

--
Vladimir Prus
http://vladimirprus.com
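The Gson error in the message above says the parser expected a JSON object at the top level (path `$`) but found an array. A minimal Python sketch of that shape check follows. It does not reproduce Zeppelin's `HeliumConf` parsing; the function name `top_level_shape` and the sample documents are hypothetical and only illustrate why removing the outer `[]` changes what the parser sees.

```python
import json


def top_level_shape(text: str) -> str:
    """Return 'object', 'array', or 'other' for the top-level JSON value."""
    value = json.loads(text)
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    return "other"


# Hypothetical sample documents, NOT the real helium.json contents.
as_array = '[{"name": "example-package"}]'
as_object = '{"name": "example-package"}'

print(top_level_shape(as_array))   # the shape a reader expecting an object rejects
print(top_level_shape(as_object))  # accepted as an object, fields may still not match
```

A file that parses as an object may still carry none of the fields the reader looks for, which would be one possible explanation for "no errors, but no charts"; that is a guess, not something the thread confirms.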
Re: Helium Registry Gone?
Thanks, looking forward!

On Fri, Sep 22, 2023 at 1:16 PM Jongyoul Lee wrote:

> Hello,
>
> You can install helium.json in your local.
>
> I have the latest one and will attach it soon. Then, for the future release, I’ll change the location to github repo or similar location.
>
> Best regards,
> Jongyoul
>
> On Fri, Sep 22, 2023 at 7:13 PM, Vladimir Prus wrote:
>
>> Hi,
>>
>> Zeppelin used to fetch Helium visualization plugins from the following URL
>>
>> https://s3.amazonaws.com/helium-package/helium.json
>>
>> As of now, that URL returns "All access to this object has been disabled" error, and I don't see any alternative URLs in the current code.
>>
>> We do enjoy some of the Helium visualizations, such as heatmap. Any suggestions how to get them back in the most future-proof way?
>>
>> --
>> Vladimir Prus
>> http://vladimirprus.com

--
Vladimir Prus
http://vladimirprus.com
Helium Registry Gone?
Hi,

Zeppelin used to fetch Helium visualization plugins from the following URL

https://s3.amazonaws.com/helium-package/helium.json

As of now, that URL returns "All access to this object has been disabled" error, and I don't see any alternative URLs in the current code.

We do enjoy some of the Helium visualizations, such as heatmap. Any suggestions how to get them back in the most future-proof way?

--
Vladimir Prus
http://vladimirprus.com
Optimizing Spark interpreter startup
Hi,

I was profiling the startup time of Spark Interpreter in our environment, and it looks like a total of 5 seconds is spent at this line in SparkScala212Interpreter.scala:

    sparkILoop.initializeSynchronous()

That line, eventually, calls nsc.Global constructor, which spends 5 seconds creating mirrors of every class on the classpath. Obviously, most users will never care about most of those classes. Any ideas on how this can be sped up, maybe by only looking at key spark classes?

[image: image.png]

--
Vladimir Prus
http://vladimirprus.com
Re: IPv6 in Zeppelin
John,

It seems that your question is a bit unclear. What *exactly* do you mean by IPv6 system? Which functionality of Zeppelin are you asking about? Why do you require a confirmation about that functionality, as opposed to just trying it yourself?

If you have a network that only has IPv6 addresses, it would be easy to run Zeppelin and double check that it accepts connections and that basic interpreters work. And of course, even in that case you can run a reverse proxy on the same server, so Zeppelin can continue listening on an IPv4 address locally.

On Tue, Jan 25, 2022 at 10:21 PM Helmsen, John wrote:

> Everyone,
>
> Earlier I asked a question as to whether we could receive a definitive answer regarding whether IPv6 works with Zeppelin. Has anyone used it in an IPv6 system?
>
> We really need to get some type of confirmation. Even a user story might be acceptable.
>
> John Helmsen
> Data Science Senior Manager
> Automated Analytics Division
>
> Noblis | for the best of reasons
> 2002 Edmund Halley Drive | Reston, VA 20191 | noblis.org <http://www.noblis.org/>
> tel: 703.610.2004 | cell: 240.899.5676 | john.helm...@noblis.org

--
Vladimir Prus
http://vladimirprus.com
Updating search index when editing paragraph
Hi,

I am observing that the search index is not getting updated when editing a paragraph. For example, in one tab I add "foobar" in a paragraph. A bit later, in another tab, I search for "foobar" and nothing is found.

Enabling debug logging for search, with

    log4j.logger.org.apache.zeppelin.search = DEBUG

reveals that when a note is updated, it's not added to the index at all. It appears that Note.fireParagraphUpdateEvent is supposed to call SearchService. However, the only place where that method is called is in a test - it does not appear to be called during commit paragraph operation.

At this point, I am a bit stuck. What would be an appropriate place to call SearchService when a note is updated?

--
Vladimir Prus
http://vladimirprus.com
Paragraph content is reset
Hi,

lots of colleagues (myself included) are observing the following annoying behaviour:

- you are busy typing fancy Spark code in a notebook
- all of a sudden, recently written code disappears and the cursor jumps to the start of the paragraph

The cursor jump suggests that the paragraph text is unintentionally updated, and the console logs suggest that the UI sends "commit paragraph" to the server, receives a new paragraph, and updates the text in the UI to an earlier version.

So, I looked at the code in paragraph.controller.js and see this:

    if ($scope.dirtyText === newPara.text) {
      // when local update is the same from remote, clear local update
      $scope.paragraph.text = newPara.text;
      $scope.dirtyText = undefined;
      $scope.originalText = angular.copy(newPara.text);
    } else {
      // if there're local update, keep it.
      $scope.paragraph.text = newPara.text;
    }

It seems the intention is to preserve local changes, but the last line still assigns newPara.text to the paragraph. Is this just a thinko, so that the last line is a bug and should simply be removed (keeping the current paragraph.text and dirtyText)? Or am I misunderstanding all this?

--
Vladimir Prus
http://vladimirprus.com
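The behaviour asked about above can be modelled outside Angular. The following is a minimal Python sketch, not the actual paragraph.controller.js code: `merge_as_written` mirrors the two quoted branches, while `merge_keep_local` shows the variant the message proposes, where the else branch leaves the local text alone. The `Editor` class and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Editor:
    """Hypothetical stand-in for the paragraph state held in $scope."""
    text: str
    dirty_text: Optional[str] = None  # unsaved local edits, if any
    original_text: str = ""


def merge_as_written(editor: Editor, remote_text: str) -> None:
    """Mirror the quoted snippet: both branches overwrite the visible text."""
    if editor.dirty_text == remote_text:
        editor.text = remote_text
        editor.dirty_text = None
        editor.original_text = remote_text
    else:
        editor.text = remote_text  # local edits are overwritten here


def merge_keep_local(editor: Editor, remote_text: str) -> None:
    """Proposed variant: keep local edits when they differ from the remote text."""
    if editor.dirty_text == remote_text:
        editor.text = remote_text
        editor.dirty_text = None
        editor.original_text = remote_text
    # otherwise leave editor.text and editor.dirty_text untouched


typed = "val x = 1; val y = 2"   # what the user has typed locally
stale = "val x = 1"              # older text echoed back by the server

a = Editor(text=typed, dirty_text=typed)
merge_as_written(a, stale)
print(a.text)

b = Editor(text=typed, dirty_text=typed)
merge_keep_local(b, stale)
print(b.text)
```

In this model the first call reverts the editor to the stale text and the second keeps what was typed. Note that the variant as sketched would also ignore remote changes when there are no local edits (`dirty_text` is `None`), so a real fix would probably need to treat that case separately.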
Re: Status of new UI
Hi Eric and Jeff,

thanks for the clarification - it's helpful. Sadly, I moved away from UI back to backend engineering (and need to update the website), and my UI colleagues are all using React. It is unlikely we can contribute anything major.

On Fri, May 21, 2021 at 5:01 PM Eric Pugh wrote:

> I wanted to add my encouragement Vladimir…. A lot of these “data analytics/backend system” projects suffer by not having enough front end passionate people. I just checked out your website, and you lead with “UX Designer” ;-). I suspect you could contribute a lot to Zeppelin!
>
> Eric
>
> On May 21, 2021, at 8:26 AM, Jeff Zhang wrote:
>
> Hi Vladimir,
>
> I don't think the new UI is ready for production, the development has ceased for a while. For now most of the active contributors of Zeppelin are backend guys. We would very much appreciate it if you or anyone else could help continue the development of the new UI.
>
> On Fri, May 21, 2021 at 6:30 PM Vladimir Prus wrote:
>
>> Hi,
>>
>> I see that Zeppelin has a new in-development UI (available via the "Try the new Zeppelin" link).
>>
>> Is there any overview of its status? E.g. is it ready for users? And if it's not, what is the approximate timeline for when it will be ready?
>>
>> I'm asking because I'm making some fixes (initially, locally), and I wonder whether I should do it in the current UI, or jump to the new one right away.
>>
>> Thanks,
>>
>> --
>> Vladimir Prus
>> http://vladimirprus.com
>
> --
> Best Regards
>
> Jeff Zhang
>
> ___
> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
> This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.

--
Vladimir Prus
http://vladimirprus.com
Status of new UI
Hi,

I see that Zeppelin has a new in-development UI (available via the "Try the new Zeppelin" link).

Is there any overview of its status? E.g. is it ready for users? And if it's not, what is the approximate timeline for when it will be ready?

I'm asking because I'm making some fixes (initially, locally), and I wonder whether I should do it in the current UI, or jump to the new one right away.

Thanks,

--
Vladimir Prus
http://vladimirprus.com
Re: Custom init for Spark interpreter
Jeff,

thanks for the response. I've created https://issues.apache.org/jira/browse/ZEPPELIN-5386 and will see whether I can make a generic patch for this.

On Fri, May 21, 2021 at 11:21 AM Jeff Zhang wrote:

> Right, we have hooks for each paragraph execution, but no interpreter process level hook. Could you create a ticket for that? And welcome to contribute.
>
> On Fri, May 21, 2021 at 4:16 PM Vladimir Prus wrote:
>
>> Hi,
>>
>> is there a way, when using Spark interpreter, to always run additional Scala code after startup? E.g. I want to automatically execute
>>
>>     import com.joom.whatever._
>>
>> so that users don't have to do it all the time. I see that BaseSparkScalaInterpreter.spark2CreateContext imports a few things, but the code does not appear to support customizing this sequence.
>>
>> --
>> Vladimir Prus
>> http://vladimirprus.com
>
> --
> Best Regards
>
> Jeff Zhang

--
Vladimir Prus
http://vladimirprus.com
Custom init for Spark interpreter
Hi,

is there a way, when using Spark interpreter, to always run additional Scala code after startup? E.g. I want to automatically execute

    import com.joom.whatever._

so that users don't have to do it all the time. I see that BaseSparkScalaInterpreter.spark2CreateContext imports a few things, but the code does not appear to support customizing this sequence.

--
Vladimir Prus
http://vladimirprus.com
All paragraphs are disabled during "Run All"
Hi,

in Zeppelin 0.9, if I use "Run All" on a note, then all paragraphs are disabled. While that is a reasonable default, it's often the case that the first couple of paragraphs prepare data and take a long time, while you might want to think about a change in subsequent paragraphs, or maybe even add another.

Is it possible to somehow configure the old behavior, where "Run All" does not disable paragraphs that are not executing?

Thanks,

--
Vladimir Prus
http://vladimirprus.com
Re: Notes no longer show full path
Jeff,

thanks for the response. Users noted another side effect - on the index page there is a filter for notebooks. We have notebooks organized by team or user names, and people used to type their username in the filter, e.g. "john/", to quickly see their notebooks. This no longer works, so they have to scroll through 150 top-level folders.

On Fri, Mar 12, 2021 at 12:58 PM Jeff Zhang wrote:

> I think this is done in one ticket, but I think we should reconsider whether it makes sense to do that.
> And one side effect of displaying the full path is that the user can not see the full path if it is very long, I create a ticket to just display the title in a separate line.
>
> https://issues.apache.org/jira/browse/ZEPPELIN-5264
>
> On Fri, Mar 12, 2021 at 5:16 PM Vladimir Prus wrote:
>
>> Hi,
>>
>> In Zeppelin 0.9 the note storage was changed to use hierarchical naming on S3, which is a much welcome change. However, it seems to have an undesirable effect. Say, I create a note with name "a/b/c" and open it. The UI has "c" as the name (I attach a screenshot), which has two issues:
>>
>> - One no longer knows where in the hierarchy they are. For example, one can have "Reports/Q1/orders", and in UI it is simply shown as "orders", so looking at this note it's no longer obvious where we are - and my colleagues find that very confusing
>>
>> - One no longer can move a note by editing its name - one should go back to index page, expand the hierarchy, and use rename there - also inconvenient.
>>
>> I could not find any issues to improve this - is there some setting or workaround I can use?
>>
>> [image: image.png]
>>
>> --
>> Vladimir Prus
>> http://vladimirprus.com
>
> --
> Best Regards
>
> Jeff Zhang

--
Vladimir Prus
http://vladimirprus.com
Notes no longer show full path
Hi,

In Zeppelin 0.9 the note storage was changed to use hierarchical naming on S3, which is a much welcome change. However, it seems to have an undesirable effect. Say, I create a note with name "a/b/c" and open it. The UI has "c" as the name (I attach a screenshot), which has two issues:

- One no longer knows where in the hierarchy they are. For example, one can have "Reports/Q1/orders", and in UI it is simply shown as "orders", so looking at this note it's no longer obvious where we are - and my colleagues find that very confusing

- One no longer can move a note by editing its name - one should go back to index page, expand the hierarchy, and use rename there - also inconvenient.

I could not find any issues to improve this - is there some setting or workaround I can use?

[image: image.png]

--
Vladimir Prus
http://vladimirprus.com
Re: PySpark PYTHONPATH issue after 0.8 to 0.9 upgrade
I have figured out the problem.

- With the refactoring above, PYTHONPATH is set only in interpreter.sh and no longer set in PySparkInterpreter
- I use sudo for impersonation
- By default, sudo does not preserve the environment, so I use "sudo -E"
- But it *still* explicitly drops PYTHONPATH, as described in http://kmiku7.github.io/2019/09/20/Keep-environment-variables-with-sudo-command/

So, one should either reconfigure sudo, or use "sudo PYTHONPATH=${PYTHONPATH}" followed by the command.

On Thu, Mar 11, 2021 at 7:27 PM Vladimir Prus wrote:

> Hi,
>
> we've upgraded from 0.8 to 0.9 and I observe that with the same interpreter settings, PySpark no longer works with:
>
> java.io.IOException: Fail to launch python process.
> Traceback (most recent call last):
>   File "/tmp/1615477929423-0/zeppelin_python.py", line 20, in <module>
>     from py4j.java_gateway import java_import, JavaGateway, GatewayClient
> ModuleNotFoundError: No module named 'py4j'
>
> Comparing logs, I see that for 0.8:
>
> INFO [2021-03-11 15:40:13,565] ({pool-3-thread-5} PySparkInterpreter.java[createGatewayServerAndStartScript]:265) - pythonExec: /mnt/conda/envs/zeppelin-pyspark-python3/bin/python
> INFO [2021-03-11 15:40:13,585] ({pool-3-thread-5} PySparkInterpreter.java[setupPySparkEnv]:236) - PYTHONPATH: /usr/lib/spark/python/lib/pyspark.zip:/usr/lib/spark/python/lib/py4j-0.10.7-src.zip:/mnt/zeppelin-0.8.3-SNAPSHOT/../interpreter/lib/python
>
> Whereas 0.9 logs say:
>
> INFO [2021-03-11 15:52:09,428] ({FIFOScheduler-interpreter_293940413-Worker-1} PythonInterpreter.java[setupPythonEnv]:212) - PYTHONPATH: /tmp/1615477929423-0
> INFO [2021-03-11 15:52:09,428] ({FIFOScheduler-interpreter_293940413-Worker-1} PythonInterpreter.java[createGatewayServerAndStartScript]:147) - Launching Python Process Command: /mnt/conda/envs/zeppelin-pyspark-python3/bin/python /tmp/1615477929423-0/zeppelin_python.py 10.4.2.199 37753
>
> In other words, it looks like 0.9 does not add the pyspark zips to PYTHONPATH. Looking at the history, I see a major refactoring in this area:
>
> https://github.com/apache/zeppelin/commit/0a97446a70f6294a3efb071bb9a70601f885840b
>
> But I can't quite understand whether this change in behaviour is intentional, and what additional options I might need to set. Does anybody have any suggestions?
>
> --
> Vladimir Prus
> http://vladimirprus.com

--
Vladimir Prus
http://vladimirprus.com
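The workaround described in the reply above, passing PYTHONPATH explicitly on the sudo command line, can be sketched as follows. This is a Python sketch of how such a command line could be assembled, not Zeppelin's interpreter.sh; the function name `sudo_command`, the user name, and the launched command are hypothetical, and whether sudo accepts the assignment still depends on the local sudoers policy.

```python
import os
import shlex
from typing import List, Optional


def sudo_command(user: str, command: List[str],
                 pythonpath: Optional[str] = None) -> List[str]:
    """Build an argv that runs `command` as `user` with PYTHONPATH passed explicitly.

    "sudo -E" alone is reported above not to carry PYTHONPATH across, so the
    variable is given as a VAR=value argument placed before the command.
    """
    if pythonpath is None:
        pythonpath = os.environ.get("PYTHONPATH", "")
    return ["sudo", "-E", "-u", user, f"PYTHONPATH={pythonpath}", *command]


argv = sudo_command(
    "alice",                                    # hypothetical impersonated user
    ["python", "-c", "import py4j"],             # hypothetical command to launch
    pythonpath="/usr/lib/spark/python/lib/pyspark.zip",
)
print(shlex.join(argv))
```

Building the argument list rather than a single string keeps paths with spaces intact when the list is handed to something like `subprocess.run`; the alternative mentioned in the message, reconfiguring sudo itself, avoids touching the launch command at all.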
PySpark PYTHONPATH issue after 0.8 to 0.9 upgrade
Hi,

we've upgraded from 0.8 to 0.9 and I observe that with the same interpreter settings, PySpark no longer works with:

java.io.IOException: Fail to launch python process.
Traceback (most recent call last):
  File "/tmp/1615477929423-0/zeppelin_python.py", line 20, in <module>
    from py4j.java_gateway import java_import, JavaGateway, GatewayClient
ModuleNotFoundError: No module named 'py4j'

Comparing logs, I see that for 0.8:

INFO [2021-03-11 15:40:13,565] ({pool-3-thread-5} PySparkInterpreter.java[createGatewayServerAndStartScript]:265) - pythonExec: /mnt/conda/envs/zeppelin-pyspark-python3/bin/python
INFO [2021-03-11 15:40:13,585] ({pool-3-thread-5} PySparkInterpreter.java[setupPySparkEnv]:236) - PYTHONPATH: /usr/lib/spark/python/lib/pyspark.zip:/usr/lib/spark/python/lib/py4j-0.10.7-src.zip:/mnt/zeppelin-0.8.3-SNAPSHOT/../interpreter/lib/python

Whereas 0.9 logs say:

INFO [2021-03-11 15:52:09,428] ({FIFOScheduler-interpreter_293940413-Worker-1} PythonInterpreter.java[setupPythonEnv]:212) - PYTHONPATH: /tmp/1615477929423-0
INFO [2021-03-11 15:52:09,428] ({FIFOScheduler-interpreter_293940413-Worker-1} PythonInterpreter.java[createGatewayServerAndStartScript]:147) - Launching Python Process Command: /mnt/conda/envs/zeppelin-pyspark-python3/bin/python /tmp/1615477929423-0/zeppelin_python.py 10.4.2.199 37753

In other words, it looks like 0.9 does not add the pyspark zips to PYTHONPATH. Looking at the history, I see a major refactoring in this area:

https://github.com/apache/zeppelin/commit/0a97446a70f6294a3efb071bb9a70601f885840b

But I can't quite understand whether this change in behaviour is intentional, and what additional options I might need to set. Does anybody have any suggestions?

--
Vladimir Prus
http://vladimirprus.com
Re: Zeppelin behind authenticating proxy
I have tried some permutations, and one of them ended up working fine, so shiro-remote-user appears to be perfectly OK, thanks.

(Still no idea what is wrong in my original setup, but it involved two proxies and a load balancer, and one of them must be messing up some part of the protocol.)

On Fri, Feb 12, 2021 at 10:46 PM Vladimir Prus wrote:

> Hi,
>
> that seems exactly what I was looking for. I gave it a try, and got half-way through:
>
> - Zeppelin shows the username I set in header, and websocket is connected, and I can use menu with no issues
> - The main content is however empty - I see no list of notebooks at all. Looking at websocket messages, I see LIST_NOTES that returns an empty list of notes. I have verified that if I revert shiro.ini to my previous version (which uses ldap), the list of notebooks is present.
>
> Does this point to some obvious misconfiguration on my side?
>
> On Fri, Feb 12, 2021 at 3:12 AM moon soo Lee wrote:
>
>> Hi,
>>
>> I haven't tried it personally, but this repository might help
>> https://github.com/leighklotz/shiro-remote-user
>>
>> Thanks,
>> moon
>>
>> On Tue, Feb 9, 2021 at 3:25 AM Vladimir Prus wrote:
>>
>>> Hi,
>>>
>>> I would like to run Zeppelin behind authenticating proxy, so that:
>>>
>>> - The proxy handles all authentication, including setting a cookie to remember the user
>>> - It passes a username header to Zeppelin
>>> - Zeppelin takes that username header and trusts it - it should show the user as authorized and use that username when starting interpreter or evaluating notebook permissions
>>>
>>> While the documentation mentions how to set up nginx as proxy, I can't find any information about the second part - passing username to Zeppelin, and actually using it. Shiro documentation is likewise not helpful.
>>>
>>> How can I accomplish what I want?
>>>
>>> --
>>> Vladimir Prus
>>> http://vladimirprus.com
>
> --
> Vladimir Prus
> http://vladimirprus.com

--
Vladimir Prus
http://vladimirprus.com
Re: Zeppelin behind authenticating proxy
Hi,

that seems exactly what I was looking for. I gave it a try, and got half-way through:

- Zeppelin shows the username I set in header, and websocket is connected, and I can use menu with no issues
- The main content is however empty - I see no list of notebooks at all. Looking at websocket messages, I see LIST_NOTES that returns an empty list of notes. I have verified that if I revert shiro.ini to my previous version (which uses ldap), the list of notebooks is present.

Does this point to some obvious misconfiguration on my side?

On Fri, Feb 12, 2021 at 3:12 AM moon soo Lee wrote:

> Hi,
>
> I haven't tried it personally, but this repository might help
> https://github.com/leighklotz/shiro-remote-user
>
> Thanks,
> moon
>
> On Tue, Feb 9, 2021 at 3:25 AM Vladimir Prus wrote:
>
>> Hi,
>>
>> I would like to run Zeppelin behind authenticating proxy, so that:
>>
>> - The proxy handles all authentication, including setting a cookie to remember the user
>> - It passes a username header to Zeppelin
>> - Zeppelin takes that username header and trusts it - it should show the user as authorized and use that username when starting interpreter or evaluating notebook permissions
>>
>> While the documentation mentions how to set up nginx as proxy, I can't find any information about the second part - passing username to Zeppelin, and actually using it. Shiro documentation is likewise not helpful.
>>
>> How can I accomplish what I want?
>>
>> --
>> Vladimir Prus
>> http://vladimirprus.com

--
Vladimir Prus
http://vladimirprus.com
Re: Kubernetes and interpreter dependencies
Hi Jeff,

thanks for the answer. Copying the relevant jars to the docker image does indeed allow me to get JDBC/Athena queries to work, thanks!

On Wed, Feb 10, 2021 at 6:13 PM Jeff Zhang wrote:

> Hi Vladimir,
>
> That's right, the dependencies download mechanism won't work for k8s. Because when introducing this feature, K8s is not considered. I think we need to fix it.
>
> For now, you can just download them by yourself and put them in the docker image to make it work in k8s.
>
> On Wed, Feb 10, 2021 at 10:44 PM Vladimir Prus wrote:
>
>> Hi,
>>
>> I was experimenting with Zeppelin + Spark in K8S, and everything worked fine, but now I'm also trying to configure jdbc interpreter that needs a custom jar with JDBC driver. Previously, I'd just specify interpreter dependencies and it worked. In K8S, I observe that
>>
>> - The pod with zeppelin server downloads the dependency to /zeppelin/local-repo
>> - The pod with jdbc interpreter is merely started with zeppelin image, so can't find those jars
>>
>> Am I misconfiguring something? Are there alternative approaches to accomplish this?
>>
>> --
>> Vladimir Prus
>> http://vladimirprus.com
>
> --
> Best Regards
>
> Jeff Zhang

--
Vladimir Prus
http://vladimirprus.com
Kubernetes and interpreter dependencies
Hi,

I was experimenting with Zeppelin + Spark in K8S, and everything worked fine, but now I'm also trying to configure jdbc interpreter that needs a custom jar with JDBC driver. Previously, I'd just specify interpreter dependencies and it worked. In K8S, I observe that

- The pod with zeppelin server downloads the dependency to /zeppelin/local-repo
- The pod with jdbc interpreter is merely started with zeppelin image, so can't find those jars

Am I misconfiguring something? Are there alternative approaches to accomplish this?

--
Vladimir Prus
http://vladimirprus.com
Zeppelin behind authenticating proxy
Hi,

I would like to run Zeppelin behind an authenticating proxy, so that:

- The proxy handles all authentication, including setting a cookie to remember the user
- It passes a username header to Zeppelin
- Zeppelin takes that username header and trusts it - it should show the user as authorized and use that username when starting an interpreter or evaluating notebook permissions

While the documentation mentions how to set up nginx as a proxy, I can't find any information about the second part - passing the username to Zeppelin, and actually using it. Shiro documentation is likewise not helpful.

How can I accomplish what I want?

--
Vladimir Prus
http://vladimirprus.com