Hi Mich,

By default, pip caches downloaded packages to somewhere like $HOME/.cache/pip. So after doing any "pip install", you'll want to either delete that directory, or pass the "--no-cache-dir" option to pip to prevent the downloaded packages from being added to the image.
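For example, a minimal Dockerfile sketch (the package names are just placeholders, not from the Spark build scripts):

    # Option 1: never populate the cache in the first place
    RUN pip install --no-cache-dir numpy pandas

    # Option 2: remove the cache in the same RUN step, so it never lands in a layer
    RUN pip install numpy pandas && \
        rm -rf /root/.cache/pip

Note the rm -rf has to happen in the same RUN statement as the install; deleting the cache in a later layer does not shrink the image, because the earlier layer still carries the files.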
HTH

Andrew

On Tue, Aug 17, 2021 at 2:29 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi Andrew,
>
> Can you please elaborate on blowing away the pip cache before committing the layer?
>
> Thanks,
>
> Mich
>
> On Tue, 17 Aug 2021 at 16:57, Andrew Melo <andrew.m...@gmail.com> wrote:
>
>> Silly Q, did you blow away the pip cache before committing the layer?
>> That always trips me up.
>>
>> Cheers
>> Andrew
>>
>> On Tue, Aug 17, 2021 at 10:56 Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>>> With no additional Python packages etc. we get 1.4GB, compared to 2.19GB before:
>>>
>>> REPOSITORY       TAG                                      IMAGE ID       CREATED                  SIZE
>>> spark/spark-py   3.1.1_sparkpy_3.7-scala_2.12-java8only   faee4dbb95dd   Less than a second ago   1.41GB
>>> spark/spark-py   3.1.1_sparkpy_3.7-scala_2.12-java8       ba3c17bc9337   4 hours ago              2.19GB
>>>
>>> root@233a81199b43:/opt/spark/work-dir# pip list
>>> Package        Version
>>> -------------  -------
>>> asn1crypto     0.24.0
>>> cryptography   2.6.1
>>> entrypoints    0.3
>>> keyring        17.1.1
>>> keyrings.alt   3.1.1
>>> pip            21.2.4
>>> pycrypto       2.6.1
>>> PyGObject      3.30.4
>>> pyxdg          0.25
>>> SecretStorage  2.3.1
>>> setuptools     57.4.0
>>> six            1.12.0
>>> wheel          0.32.3
>>>
>>> HTH
>>>
>>> On Tue, 17 Aug 2021 at 16:24, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>
>>>> Yes, I will double check. It includes Java 8 in addition to the base Java 11.
>>>>
>>>> In addition, it has these Python packages (added for my own needs for now):
>>>>
>>>> root@ce6773017a14:/opt/spark/work-dir# pip list
>>>> Package        Version
>>>> -------------  -------
>>>> asn1crypto     0.24.0
>>>> cryptography   2.6.1
>>>> cx-Oracle      8.2.1
>>>> entrypoints    0.3
>>>> keyring        17.1.1
>>>> keyrings.alt   3.1.1
>>>> numpy          1.21.2
>>>> pip            21.2.4
>>>> py4j           0.10.9
>>>> pycrypto       2.6.1
>>>> PyGObject      3.30.4
>>>> pyspark        3.1.2
>>>> pyxdg          0.25
>>>> PyYAML         5.4.1
>>>> SecretStorage  2.3.1
>>>> setuptools     57.4.0
>>>> six            1.12.0
>>>> wheel          0.32.3
>>>>
>>>> HTH
>>>>
>>>> On Tue, 17 Aug 2021 at 16:17, Maciej <mszymkiew...@gmail.com> wrote:
>>>>
>>>>> Quick question ‒ is this actual output? If so, do we know what accounts for the 1.5GB overhead of the PySpark image? Even without --no-install-recommends this seems like a lot (if I recall correctly, it was around 400MB for the existing images).
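>>>>> One way to narrow it down would be to compare the layers of the two images, e.g. (just a sketch, using the tags above):
>>>>>
>>>>>     docker history spark/spark-py:3.1.1_sparkpy_3.7-scala_2.12-java8
>>>>>
>>>>> which lists each layer with the command that created it and its size, so an oversized RUN step (a pip cache, apt lists, etc.) should stand out.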
>>>>>
>>>>> On 8/17/21 2:24 PM, Mich Talebzadeh wrote:
>>>>>
>>>>> Examples:
>>>>>
>>>>> *docker images*
>>>>>
>>>>> REPOSITORY       TAG                                  IMAGE ID       CREATED          SIZE
>>>>> spark/spark-py   3.1.1_sparkpy_3.7-scala_2.12-java8   ba3c17bc9337   2 minutes ago    2.19GB
>>>>> spark            3.1.1-scala_2.12-java11              4595c4e78879   18 minutes ago   635MB
>>>>>
>>>>> On Tue, 17 Aug 2021 at 10:31, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>
>>>>>> 3.1.2_sparkpy_3.7-scala_2.12-java11
>>>>>> 3.1.2_sparkR_3.6-scala_2.12-java11
>>>>>>
>>>>>> Yes, let us go with that, and remember that we can change the tags anytime. The accompanying release note should detail what is inside the downloaded image.
>>>>>>
>>>>>> +1 for me
>>>>>>
>>>>>> On Tue, 17 Aug 2021 at 09:51, Maciej <mszymkiew...@gmail.com> wrote:
>>>>>>
>>>>>>> On 8/17/21 4:04 AM, Holden Karau wrote:
>>>>>>>
>>>>>>> These are some really good points all around.
>>>>>>>
>>>>>>> I think, in the interest of simplicity, we'll start with just the 3 current Dockerfiles in the Spark repo, but for the next release (3.3) we should explore adding some more Dockerfiles/build options.
>>>>>>>
>>>>>>> Sounds good.
>>>>>>>
>>>>>>> However, I'd consider adding the guest language version to the tag names, i.e.
>>>>>>>
>>>>>>> 3.1.2_sparkpy_3.7-scala_2.12-java11
>>>>>>> 3.1.2_sparkR_3.6-scala_2.12-java11
>>>>>>>
>>>>>>> and some basic safeguards in the layers, to make sure that these are really the versions we use.
>>>>>>>
>>>>>>> On Mon, Aug 16, 2021 at 10:46 AM Maciej <mszymkiew...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I have a few concerns regarding the PySpark and SparkR images.
>>>>>>>>
>>>>>>>> First of all, how do we plan to handle interpreter versions? Ideally, we should provide images for all supported variants, but based on the preceding discussion and the proposed naming convention, I assume that is not going to happen. If that's the case, it would be great if we could fix interpreter versions based on some support criteria (lowest supported, lowest non-deprecated, highest supported at the time of release, etc.)
>>>>>>>>
>>>>>>>> Currently, we use the following:
>>>>>>>>
>>>>>>>> - for R, the buster-cran35 Debian repositories, which install R 3.6 (the provided version already changed in the past and broke the image build ‒ SPARK-28606);
>>>>>>>> - for Python, the system-provided python3 packages, which currently ship Python 3.7;
>>>>>>>>
>>>>>>>> neither of which guarantees stability over time, and both might be hard to synchronize with our support matrix.
>>>>>>>>
>>>>>>>> Secondly, omitting libraries which are required for full functionality and performance, specifically
>>>>>>>>
>>>>>>>> - NumPy, Pandas and Arrow for PySpark
>>>>>>>> - Arrow for SparkR
>>>>>>>>
>>>>>>>> is likely to severely limit the usability of the images (out of these, Arrow is probably the hardest to manage, especially when you already depend on system packages to provide the R or Python interpreter).
>>>>>>>>
>>>>>>>> On 8/14/21 12:43 AM, Mich Talebzadeh wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> We can cater for multiple types (spark, spark-py and spark-r) and Spark versions (assuming they are downloaded and available). The challenge is that these Docker images are snapshots once built. They cannot be amended later: if you change anything inside a running container, whatever you did is lost as soon as you log out.
>>>>>>>>
>>>>>>>> For example, say I want to add tensorflow to my Docker image. These are my images:
>>>>>>>>
>>>>>>>> REPOSITORY                             TAG           IMAGE ID       CREATED      SIZE
>>>>>>>> eu.gcr.io/axial-glow-224522/spark-py   java8_3.1.1   cfbb0e69f204   5 days ago   2.37GB
>>>>>>>> eu.gcr.io/axial-glow-224522/spark      3.1.1         8d1bf8e7e47d   5 days ago   805MB
>>>>>>>>
>>>>>>>> Using the image ID, I log in to the image as root:
>>>>>>>>
>>>>>>>> *docker run -u0 -it cfbb0e69f204 bash*
>>>>>>>>
>>>>>>>> root@b542b0f1483d:/opt/spark/work-dir# pip install keras
>>>>>>>> Collecting keras
>>>>>>>>   Downloading keras-2.6.0-py2.py3-none-any.whl (1.3 MB)
>>>>>>>>      |████████████████████████████████| 1.3 MB 1.1 MB/s
>>>>>>>> Installing collected packages: keras
>>>>>>>> Successfully installed keras-2.6.0
>>>>>>>> WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager.
>>>>>>>> It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
>>>>>>>>
>>>>>>>> root@b542b0f1483d:/opt/spark/work-dir# pip list
>>>>>>>> Package        Version
>>>>>>>> -------------  -------
>>>>>>>> asn1crypto     0.24.0
>>>>>>>> cryptography   2.6.1
>>>>>>>> cx-Oracle      8.2.1
>>>>>>>> entrypoints    0.3
>>>>>>>> *keras          2.6.0    <--- it is here*
>>>>>>>> keyring        17.1.1
>>>>>>>> keyrings.alt   3.1.1
>>>>>>>> numpy          1.21.1
>>>>>>>> pip            21.2.3
>>>>>>>> py4j           0.10.9
>>>>>>>> pycrypto       2.6.1
>>>>>>>> PyGObject      3.30.4
>>>>>>>> pyspark        3.1.2
>>>>>>>> pyxdg          0.25
>>>>>>>> PyYAML         5.4.1
>>>>>>>> SecretStorage  2.3.1
>>>>>>>> setuptools     57.4.0
>>>>>>>> six            1.12.0
>>>>>>>> wheel          0.32.3
>>>>>>>> root@b542b0f1483d:/opt/spark/work-dir# exit
>>>>>>>>
>>>>>>>> Now I exit the container and log in again:
>>>>>>>>
>>>>>>>> (pyspark_venv) hduser@rhes76: /home/hduser/dba/bin/build> docker run -u0 -it cfbb0e69f204 bash
>>>>>>>>
>>>>>>>> root@5231ee95aa83:/opt/spark/work-dir# pip list
>>>>>>>> Package        Version
>>>>>>>> -------------  -------
>>>>>>>> asn1crypto     0.24.0
>>>>>>>> cryptography   2.6.1
>>>>>>>> cx-Oracle      8.2.1
>>>>>>>> entrypoints    0.3
>>>>>>>> keyring        17.1.1
>>>>>>>> keyrings.alt   3.1.1
>>>>>>>> numpy          1.21.1
>>>>>>>> pip            21.2.3
>>>>>>>> py4j           0.10.9
>>>>>>>> pycrypto       2.6.1
>>>>>>>> PyGObject      3.30.4
>>>>>>>> pyspark        3.1.2
>>>>>>>> pyxdg          0.25
>>>>>>>> PyYAML         5.4.1
>>>>>>>> SecretStorage  2.3.1
>>>>>>>> setuptools     57.4.0
>>>>>>>> six            1.12.0
>>>>>>>> wheel          0.32.3
>>>>>>>>
>>>>>>>> *Hmm, keras is not there*. The Docker image cannot be altered after the build! Once the image is created, it is just a snapshot. However, it will still have tons of useful stuff for most users/organisations. My suggestion is to create, for a given type (spark, spark-py etc.):
>>>>>>>>
>>>>>>>> 1. one vanilla flavour for everyday use, with a few useful packages;
>>>>>>>> 2. one for medium use, with the most common packages for ETL/ELT work;
>>>>>>>> 3. one specialist image for ML etc., with keras, tensorflow and anything else needed.
>>>>>>>>
>>>>>>>> These images should be maintained as we currently maintain Spark releases, with accompanying documentation. Any reason why we cannot maintain them ourselves?
>>>>>>>>
>>>>>>>> HTH
>>>>>>>>
>>>>>>>> On Fri, 13 Aug 2021 at 17:26, Holden Karau <hol...@pigscanfly.ca> wrote:
>>>>>>>>
>>>>>>>>> So we actually do have a script that does the build already; it's more a matter of publishing the results for easier use. Currently the script produces three images: spark, spark-py, and spark-r. I can certainly see a solid reason to publish with jdk11 and jdk8 suffixes as well if there is interest in the community.
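>>>>>>>>> Anything extra on top of those three images is best baked into a derived Dockerfile rather than installed in a running container, since container changes are not persisted back to the image. A minimal sketch (the tag and UID here are assumptions, not something the script produces):
>>>>>>>>>
>>>>>>>>>     FROM spark/spark-py:3.1.1      # hypothetical tag
>>>>>>>>>     USER 0                         # root, so pip can install system-wide
>>>>>>>>>     RUN pip install --no-cache-dir keras
>>>>>>>>>     USER 185                       # assumed Spark UID; drop root again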
>>>>>>>>> If we want to have, say, a spark-py-pandas image, a Spark container image with everything necessary for the Koalas stuff to work, then I think that could be a great PR from someone to add :)
>>>>>>>>>
>>>>>>>>> On Fri, Aug 13, 2021 at 1:00 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> should read PySpark
>>>>>>>>>>
>>>>>>>>>> On Fri, 13 Aug 2021 at 08:51, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Agreed.
>>>>>>>>>>>
>>>>>>>>>>> I have already built a few of the latest images for Spark and PySpark 3.1.1 with Java 8, as I found out that Java 11 does not work with the Google BigQuery data warehouse. However, one finds out how to hack the Dockerfile the hard way.
>>>>>>>>>>>
>>>>>>>>>>> For example, how to add additional Python libraries like tensorflow etc. Loading these libraries through Kubernetes is not practical, as unzipping and installing them through --py-files etc. takes considerable time, so they need to be added to the Dockerfile at build time, in the Python bindings directory for Kubernetes:
>>>>>>>>>>>
>>>>>>>>>>> /opt/spark/kubernetes/dockerfiles/spark/bindings/python
>>>>>>>>>>>
>>>>>>>>>>> RUN pip install pyyaml numpy cx_Oracle tensorflow ....
>>>>>>>>>>>
>>>>>>>>>>> You will also need curl to test the ports from inside the container:
>>>>>>>>>>>
>>>>>>>>>>> RUN apt-get update && apt-get install -y curl
>>>>>>>>>>> RUN ["apt-get","install","-y","vim"]
>>>>>>>>>>>
>>>>>>>>>>> As I said, I am happy to build these specific Dockerfiles plus the complete documentation for them. I have already built one for Google (GCP). The difference between the Spark and PySpark versions is that in Spark/Scala a fat jar file will contain everything needed. That is not the case with Python, I am afraid.
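>>>>>>>>>>> A simple safeguard layer can be added in the same place, so the build fails if the base image ever ships a different interpreter than the one the tag advertises; a rough sketch, assuming Python 3.7 is the advertised version:
>>>>>>>>>>>
>>>>>>>>>>> RUN python3 -c "import sys; assert sys.version_info[:2] == (3, 7), sys.version"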
>>>>>>>>>>> HTH
>>>>>>>>>>>
>>>>>>>>>>> On Fri, 13 Aug 2021 at 08:13, Bode, Meikel, NMA-CFD <meikel.b...@bertelsmann.de> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> I am Meikel Bode, just an interested reader of the dev and user lists. Anyway, I would appreciate having official Docker images available.
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe one could take inspiration from the Jupyter Docker stacks and provide a hierarchy of different images like this:
>>>>>>>>>>>>
>>>>>>>>>>>> https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#image-relationships
>>>>>>>>>>>>
>>>>>>>>>>>> Having a core image supporting only Java, an extended one supporting Python and/or R, etc.
>>>>>>>>>>>>
>>>>>>>>>>>> Looking forward to the discussion.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Meikel
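>>>>>>>>>>>> Such a hierarchy could look roughly like this as chained Dockerfiles (all names and versions here are made up for illustration):
>>>>>>>>>>>>
>>>>>>>>>>>> # Dockerfile.core: JVM-only base image
>>>>>>>>>>>> FROM openjdk:11-jre-slim
>>>>>>>>>>>> COPY spark /opt/spark
>>>>>>>>>>>> ENV SPARK_HOME=/opt/spark
>>>>>>>>>>>>
>>>>>>>>>>>> # Dockerfile.python: extends the core image with a Python runtime
>>>>>>>>>>>> FROM spark-core:3.1.2
>>>>>>>>>>>> RUN apt-get update && \
>>>>>>>>>>>>     apt-get install -y --no-install-recommends python3 python3-pip && \
>>>>>>>>>>>>     rm -rf /var/lib/apt/lists/*
>>>>>>>>>>>>
>>>>>>>>>>>> That way the Python and R images share the core layers instead of duplicating them.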
>>>>>>>>>>>>
>>>>>>>>>>>> *From:* Mich Talebzadeh <mich.talebza...@gmail.com>
>>>>>>>>>>>> *Sent:* Friday, 13 August 2021 08:45
>>>>>>>>>>>> *Cc:* dev <dev@spark.apache.org>
>>>>>>>>>>>> *Subject:* Re: Time to start publishing Spark Docker Images?
>>>>>>>>>>>>
>>>>>>>>>>>> I concur this is a good idea and certainly worth exploring.
>>>>>>>>>>>>
>>>>>>>>>>>> In practice, preparing deployable Docker images will throw up some challenges, because a Docker image for Spark is not really a singular modular unit in the way that, say, a Docker image for Jenkins is. It involves different versions and different images for Spark and PySpark, and will most likely end up as part of a Kubernetes deployment.
>>>>>>>>>>>>
>>>>>>>>>>>> Individuals and organisations will deploy it as a first cut. Great, but I equally feel that good documentation on how to build a consumable, deployable image will be more valuable. From my own experience, the current documentation should be enhanced: for example, how to deploy working directories, additional Python packages, builds with different Java versions (8 or 11), etc.
>>>>>>>>>>>>
>>>>>>>>>>>> HTH
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 13 Aug 2021 at 01:54, Holden Karau <hol...@pigscanfly.ca> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Awesome, I've filed an INFRA ticket to get the ball rolling.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Aug 12, 2021 at 5:48 PM John Zhuge <jzh...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> +1
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Aug 12, 2021 at 5:44 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> +1, I think we generally agreed upon having it. Thanks Holden for the heads-up and for driving this.
>>>>>>>>>>>>
>>>>>>>>>>>> +@Dongjoon Hyun <dongj...@apache.org> FYI
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, 22 Jul 2021 at 12:22 PM, Kent Yao <yaooq...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> +1
>>>>>>>>>>>>
>>>>>>>>>>>> Bests,
>>>>>>>>>>>>
>>>>>>>>>>>> Kent Yao
>>>>>>>>>>>> @ Data Science Center, Hangzhou Research Institute, NetEase Corp.
>>
>> --
>> It's dark in this basement.