Silly Q, did you blow away the pip cache before committing the layer? That
always trips me up.
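
In case it helps, this is roughly the pattern I use (a sketch only; the package
names are placeholders, not what the Spark Dockerfile actually installs):

RUN pip install --no-cache-dir pyyaml numpy && \
    rm -rf /root/.cache/pip

Installing with --no-cache-dir and removing ~/.cache/pip in the same RUN keeps
the cache out of the committed layer, which usually shaves a noticeable amount
off the image.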

Cheers
Andrew

On Tue, Aug 17, 2021 at 10:56 Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> With no additional Python packages etc. we get 1.4GB, compared to 2.19GB
> before:
>
> REPOSITORY       TAG                                      IMAGE ID
>  CREATED                  SIZE
> spark/spark-py   3.1.1_sparkpy_3.7-scala_2.12-java8only   faee4dbb95dd
>  Less than a second ago   1.41GB
> spark/spark-py   3.1.1_sparkpy_3.7-scala_2.12-java8       ba3c17bc9337   4
> hours ago              2.19GB
>
> root@233a81199b43:/opt/spark/work-dir# pip list
> Package       Version
> ------------- -------
> asn1crypto    0.24.0
> cryptography  2.6.1
> entrypoints   0.3
> keyring       17.1.1
> keyrings.alt  3.1.1
> pip           21.2.4
> pycrypto      2.6.1
> PyGObject     3.30.4
> pyxdg         0.25
> SecretStorage 2.3.1
> setuptools    57.4.0
> six           1.12.0
> wheel         0.32.3
>
>
> HTH
>
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Tue, 17 Aug 2021 at 16:24, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Yes, I will double check. It includes Java 8 in addition to the base Java 11.
>>
>> In addition, it has these Python packages (added for my own needs for now):
>>
>> root@ce6773017a14:/opt/spark/work-dir# pip list
>> Package       Version
>> ------------- -------
>> asn1crypto    0.24.0
>> cryptography  2.6.1
>> cx-Oracle     8.2.1
>> entrypoints   0.3
>> keyring       17.1.1
>> keyrings.alt  3.1.1
>> numpy         1.21.2
>> pip           21.2.4
>> py4j          0.10.9
>> pycrypto      2.6.1
>> PyGObject     3.30.4
>> pyspark       3.1.2
>> pyxdg         0.25
>> PyYAML        5.4.1
>> SecretStorage 2.3.1
>> setuptools    57.4.0
>> six           1.12.0
>> wheel         0.32.3
>>
>>
>> HTH
>>
>>
>>    view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>>
>>
>>
>> On Tue, 17 Aug 2021 at 16:17, Maciej <mszymkiew...@gmail.com> wrote:
>>
>>> Quick question ‒ is this actual output? If so, do we know what accounts
>>> for the ~1.5GB overhead of the PySpark image? Even without --no-install-recommends
>>> this seems like a lot (if I recall correctly it was around 400MB for the
>>> existing images).
>>>
>>>
>>> On 8/17/21 2:24 PM, Mich Talebzadeh wrote:
>>>
>>> Examples:
>>>
>>> *docker images*
>>>
>>> REPOSITORY       TAG                                  IMAGE ID
>>>  CREATED          SIZE
>>>
>>> spark/spark-py   3.1.1_sparkpy_3.7-scala_2.12-java8   ba3c17bc9337   2
>>> minutes ago    2.19GB
>>>
>>> spark            3.1.1-scala_2.12-java11              4595c4e78879   18
>>> minutes ago   635MB
>>>
>>>
>>>    view my Linkedin profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>>
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>>
>>>
>>>
>>>
>>> On Tue, 17 Aug 2021 at 10:31, Mich Talebzadeh <mich.talebza...@gmail.com>
>>> wrote:
>>>
>>>> 3.1.2_sparkpy_3.7-scala_2.12-java11
>>>>
>>>> 3.1.2_sparkR_3.6-scala_2.12-java11
>>>> Yes, let us go with that, and remember that we can change the tags
>>>> anytime. The accompanying release note should detail what is inside the
>>>> downloaded image.
>>>>
>>>> +1 for me
>>>>
>>>>
>>>>    view my Linkedin profile
>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>
>>>>
>>>>
>>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>>> any loss, damage or destruction of data or any other property which may
>>>> arise from relying on this email's technical content is explicitly
>>>> disclaimed. The author will in no case be liable for any monetary damages
>>>> arising from such loss, damage or destruction.
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, 17 Aug 2021 at 09:51, Maciej <mszymkiew...@gmail.com> wrote:
>>>>
>>>>> On 8/17/21 4:04 AM, Holden Karau wrote:
>>>>>
>>>>> These are some really good points all around.
>>>>>
>>>>> I think, in the interest of simplicity, we'll start with just the 3
>>>>> current Dockerfiles in the Spark repo, but for the next release (3.3) we
>>>>> should explore adding some more Dockerfiles/build options.
>>>>>
>>>>> Sounds good.
>>>>>
>>>>> However, I'd consider adding the guest language version to the tag names, e.g.
>>>>>
>>>>> 3.1.2_sparkpy_3.7-scala_2.12-java11
>>>>>
>>>>> 3.1.2_sparkR_3.6-scala_2.12-java11
>>>>>
>>>>> and some basic safeguards in the layers, to make sure that these are
>>>>> really the versions we use.
>>>>>
>>>>> On Mon, Aug 16, 2021 at 10:46 AM Maciej <mszymkiew...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I have a few concerns regarding PySpark and SparkR images.
>>>>>>
>>>>>> First of all, how do we plan to handle interpreter versions? Ideally,
>>>>>> we should provide images for all supported variants, but based on the
>>>>>> preceding discussion and the proposed naming convention, I assume it is 
>>>>>> not
>>>>>> going to happen. If that's the case, it would be great if we could fix
>>>>>> interpreter versions based on some support criteria (lowest supported,
>>>>>> lowest non-deprecated, highest supported at the time of release, etc.)
>>>>>>
>>>>>> Currently, we use the following:
>>>>>>
>>>>>>    - for R, we use the buster-cran35 Debian repositories, which install R 3.6
>>>>>>    (the provided version has already changed in the past and broke the image
>>>>>>    build ‒ SPARK-28606).
>>>>>>    - for Python, we depend on the system-provided python3 packages,
>>>>>>    which currently provide Python 3.7.
>>>>>>
>>>>>> which don't guarantee stability over time and might be hard to
>>>>>> synchronize with our support matrix.
>>>>>>
>>>>>> Secondly, omitting libraries which are required for the full
>>>>>> functionality and performance, specifically
>>>>>>
>>>>>>    - Numpy, Pandas and Arrow for PySpark
>>>>>>    - Arrow for SparkR
>>>>>>
>>>>>> is likely to severely limit the usability of the images (of these, Arrow
>>>>>> is probably the hardest to manage, especially when you already depend
>>>>>> on system packages to provide the R or Python interpreter).
>>>>>>
>>>>>> On 8/14/21 12:43 AM, Mich Talebzadeh wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We can cater for multiple types (spark, spark-py and spark-r) and
>>>>>> Spark versions (assuming they are downloaded and available).
>>>>>> The challenge is that these Docker images are snapshots once built. They
>>>>>> cannot be amended later: if you change anything by going inside the
>>>>>> container, whatever you did is reversed as soon as you log out.
>>>>>>
>>>>>> For example, I want to add tensorflow to my docker image. These are
>>>>>> my images
>>>>>>
>>>>>> REPOSITORY                                TAG           IMAGE ID
>>>>>>  CREATED         SIZE
>>>>>> eu.gcr.io/axial-glow-224522/spark-py      java8_3.1.1
>>>>>>  cfbb0e69f204   5 days ago      2.37GB
>>>>>> eu.gcr.io/axial-glow-224522/spark         3.1.1
>>>>>>  8d1bf8e7e47d   5 days ago      805MB
>>>>>>
>>>>>> Using the image ID, I log in to the image as root:
>>>>>>
>>>>>> *docker run -u0 -it cfbb0e69f204 bash*
>>>>>>
>>>>>> root@b542b0f1483d:/opt/spark/work-dir# pip install keras
>>>>>> Collecting keras
>>>>>>   Downloading keras-2.6.0-py2.py3-none-any.whl (1.3 MB)
>>>>>>      |████████████████████████████████| 1.3 MB 1.1 MB/s
>>>>>> Installing collected packages: keras
>>>>>> Successfully installed keras-2.6.0
>>>>>> WARNING: Running pip as the 'root' user can result in broken
>>>>>> permissions and conflicting behaviour with the system package manager. It
>>>>>> is recommended to use a virtual environment instead:
>>>>>> https://pip.pypa.io/warnings/venv
>>>>>> root@b542b0f1483d:/opt/spark/work-dir# pip list
>>>>>> Package       Version
>>>>>> ------------- -------
>>>>>> asn1crypto    0.24.0
>>>>>> cryptography  2.6.1
>>>>>> cx-Oracle     8.2.1
>>>>>> entrypoints   0.3
>>>>>> *keras         2.6.0      <--- it is here*
>>>>>> keyring       17.1.1
>>>>>> keyrings.alt  3.1.1
>>>>>> numpy         1.21.1
>>>>>> pip           21.2.3
>>>>>> py4j          0.10.9
>>>>>> pycrypto      2.6.1
>>>>>> PyGObject     3.30.4
>>>>>> pyspark       3.1.2
>>>>>> pyxdg         0.25
>>>>>> PyYAML        5.4.1
>>>>>> SecretStorage 2.3.1
>>>>>> setuptools    57.4.0
>>>>>> six           1.12.0
>>>>>> wheel         0.32.3
>>>>>> root@b542b0f1483d:/opt/spark/work-dir# exit
>>>>>>
>>>>>> Now I exit the container and log in again:
>>>>>> (pyspark_venv) hduser@rhes76: /home/hduser/dba/bin/build> docker run
>>>>>> -u0 -it cfbb0e69f204 bash
>>>>>>
>>>>>> root@5231ee95aa83:/opt/spark/work-dir# pip list
>>>>>> Package       Version
>>>>>> ------------- -------
>>>>>> asn1crypto    0.24.0
>>>>>> cryptography  2.6.1
>>>>>> cx-Oracle     8.2.1
>>>>>> entrypoints   0.3
>>>>>> keyring       17.1.1
>>>>>> keyrings.alt  3.1.1
>>>>>> numpy         1.21.1
>>>>>> pip           21.2.3
>>>>>> py4j          0.10.9
>>>>>> pycrypto      2.6.1
>>>>>> PyGObject     3.30.4
>>>>>> pyspark       3.1.2
>>>>>> pyxdg         0.25
>>>>>> PyYAML        5.4.1
>>>>>> SecretStorage 2.3.1
>>>>>> setuptools    57.4.0
>>>>>> six           1.12.0
>>>>>> wheel         0.32.3
>>>>>>
>>>>>> *Hm, that keras is not there*. The Docker image cannot be altered
>>>>>> after the build! Once the Docker image is created, it is just a snapshot.
>>>>>> However, it will still have tons of useful stuff for most
>>>>>> users/organisations. My suggestion is to create, for a given type (spark,
>>>>>> spark-py etc), the following (a build-time sketch follows the list):
>>>>>>
>>>>>>
>>>>>>    1. One vanilla flavour for everyday use with a few useful packages
>>>>>>    2. One for medium use with the most common packages for ETL/ELT work
>>>>>>    3. One specialist image for ML etc. with keras, tensorflow and anything
>>>>>>    else needed
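>>>>>>
>>>>>> For these changes to stick they have to go into a Dockerfile before the
>>>>>> build rather than via pip inside a running container. A rough sketch of a
>>>>>> derived image (tag and package list purely illustrative):
>>>>>>
>>>>>> FROM spark/spark-py:3.1.1_sparkpy_3.7-scala_2.12-java8
>>>>>> USER root
>>>>>> # bake the extra libraries into a new layer at build time
>>>>>> RUN pip install --no-cache-dir keras tensorflow
>>>>>> # back to the unprivileged spark user (185 in the stock images, adjust if needed)
>>>>>> USER 185
>>>>>>
>>>>>> Options 2 and 3 above would effectively be pre-built variants of this.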
>>>>>>
>>>>>>
>>>>>> These images should be maintained as we currently maintain Spark
>>>>>> releases, with accompanying documentation. Any reason why we cannot
>>>>>> maintain them ourselves?
>>>>>>
>>>>>> HTH
>>>>>>
>>>>>>    view my Linkedin profile
>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *Disclaimer:* Use it at your own risk. Any and all responsibility
>>>>>> for any loss, damage or destruction of data or any other property which 
>>>>>> may
>>>>>> arise from relying on this email's technical content is explicitly
>>>>>> disclaimed. The author will in no case be liable for any monetary damages
>>>>>> arising from such loss, damage or destruction.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, 13 Aug 2021 at 17:26, Holden Karau <hol...@pigscanfly.ca>
>>>>>> wrote:
>>>>>>
>>>>>>> So we actually do have a script that does the build already; it's
>>>>>>> more a matter of publishing the results for easier use. Currently the
>>>>>>> script produces three images: spark, spark-py, and spark-r. I can certainly
>>>>>>> see a solid reason to publish with, say, jdk11 & jdk8 suffixes as well if
>>>>>>> there is interest in the community. If we want to have, say, a
>>>>>>> spark-py-pandas image with everything necessary for the Koalas stuff to
>>>>>>> work, then I think that could be a great PR for someone to add :)
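>>>>>>>
>>>>>>> For anyone following along, the script is ./bin/docker-image-tool.sh in the
>>>>>>> distribution; a typical invocation looks something like this (flags from
>>>>>>> memory, check -h for the authoritative list):
>>>>>>>
>>>>>>> ./bin/docker-image-tool.sh -r <your-repo> -t 3.1.2 \
>>>>>>>   -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile \
>>>>>>>   -R kubernetes/dockerfiles/spark/bindings/R/Dockerfile \
>>>>>>>   build
>>>>>>>
>>>>>>> which builds the spark, spark-py and spark-r images mentioned above; running
>>>>>>> it again with push instead of build publishes them to the repo.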
>>>>>>>
>>>>>>> On Fri, Aug 13, 2021 at 1:00 AM Mich Talebzadeh <
>>>>>>> mich.talebza...@gmail.com> wrote:
>>>>>>>
>>>>>>>> should read PySpark
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>    view my Linkedin profile
>>>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *Disclaimer:* Use it at your own risk. Any and all responsibility
>>>>>>>> for any loss, damage or destruction of data or any other property 
>>>>>>>> which may
>>>>>>>> arise from relying on this email's technical content is explicitly
>>>>>>>> disclaimed. The author will in no case be liable for any monetary 
>>>>>>>> damages
>>>>>>>> arising from such loss, damage or destruction.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, 13 Aug 2021 at 08:51, Mich Talebzadeh <
>>>>>>>> mich.talebza...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Agreed.
>>>>>>>>>
>>>>>>>>> I have already built a few of the latest images for Spark and PySpark
>>>>>>>>> on 3.1.1 with Java 8, as I found out Java 11 does not work with the Google
>>>>>>>>> BigQuery data warehouse. However, how to hack the Dockerfile is something
>>>>>>>>> one finds out the hard way.
>>>>>>>>>
>>>>>>>>> For example, how to add additional Python libraries like tensorflow
>>>>>>>>> etc. Loading these libraries through Kubernetes is not practical, as
>>>>>>>>> unzipping and installing them through --py-files etc. takes considerable
>>>>>>>>> time, so they need to be added to the Dockerfile at build time, in the
>>>>>>>>> Python bindings directory under Kubernetes:
>>>>>>>>>
>>>>>>>>> /opt/spark/kubernetes/dockerfiles/spark/bindings/python
>>>>>>>>>
>>>>>>>>> RUN pip install pyyaml numpy cx_Oracle tensorflow ....
>>>>>>>>>
>>>>>>>>> Also, you will need curl to test the ports from inside the container:
>>>>>>>>>
>>>>>>>>> RUN apt-get update && apt-get install -y curl
>>>>>>>>> RUN ["apt-get","install","-y","vim"]
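>>>>>>>>>
>>>>>>>>> The same additions, folded into single layers with the apt and pip caches
>>>>>>>>> cleaned up (just how I happen to write it, not the stock file):
>>>>>>>>>
>>>>>>>>> # OS tools for debugging inside the container
>>>>>>>>> RUN apt-get update && \
>>>>>>>>>     apt-get install -y curl vim && \
>>>>>>>>>     rm -rf /var/lib/apt/lists/*
>>>>>>>>> # Python libraries baked in at build time
>>>>>>>>> RUN pip install --no-cache-dir pyyaml numpy cx_Oracle tensorflow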
>>>>>>>>>
>>>>>>>>> As I said, I am happy to build these specific Dockerfiles plus the
>>>>>>>>> complete documentation for them. I have already built one for Google Cloud
>>>>>>>>> (GCP). The difference between the Spark and PySpark versions is that in
>>>>>>>>> Spark/Scala a fat jar file will contain everything needed. That is not the
>>>>>>>>> case with Python, I am afraid.
>>>>>>>>>
>>>>>>>>> HTH
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>    view my Linkedin profile
>>>>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *Disclaimer:* Use it at your own risk. Any and all responsibility
>>>>>>>>> for any loss, damage or destruction of data or any other property 
>>>>>>>>> which may
>>>>>>>>> arise from relying on this email's technical content is explicitly
>>>>>>>>> disclaimed. The author will in no case be liable for any monetary 
>>>>>>>>> damages
>>>>>>>>> arising from such loss, damage or destruction.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, 13 Aug 2021 at 08:13, Bode, Meikel, NMA-CFD <
>>>>>>>>> meikel.b...@bertelsmann.de> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I am Meikel Bode, just an interested reader of the dev and user
>>>>>>>>>> lists. Anyway, I would appreciate having official Docker images available.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Maybe one could take inspiration from the Jupyter Docker stacks
>>>>>>>>>> and provide a hierarchy of different images like this:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#image-relationships
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Having a core image supporting only Java, and extended ones supporting
>>>>>>>>>> Python and/or R etc.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Looking forward to the discussion.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>>
>>>>>>>>>> Meikel
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *From:* Mich Talebzadeh <mich.talebza...@gmail.com>
>>>>>>>>>> *Sent:* Freitag, 13. August 2021 08:45
>>>>>>>>>> *Cc:* dev <dev@spark.apache.org>
>>>>>>>>>> *Subject:* Re: Time to start publishing Spark Docker Images?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I concur this is a good idea and certainly worth exploring.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> In practice, preparing Docker images as deployables will throw up
>>>>>>>>>> some challenges, because a Docker image for Spark is not really a singular
>>>>>>>>>> modular unit in the way that, say, a Docker image for Jenkins is. It involves
>>>>>>>>>> different versions and different images for Spark and PySpark, and will most
>>>>>>>>>> likely end up as part of a Kubernetes deployment.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Individuals and organisations will deploy it as the first cut.
>>>>>>>>>> Great, but I equally feel that good documentation on how to build a
>>>>>>>>>> consumable, deployable image will be more valuable. From my own experience,
>>>>>>>>>> the current documentation should be enhanced, for example on how to deploy
>>>>>>>>>> working directories, add additional Python packages, and build with
>>>>>>>>>> different Java versions (version 8 or version 11), etc.
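>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> For example, as far as I can tell the Java version is just a base-image
>>>>>>>>>> build argument to docker-image-tool.sh, along the lines of:
>>>>>>>>>>
>>>>>>>>>> ./bin/docker-image-tool.sh -r <your-repo> -t 3.1.1-java8 \
>>>>>>>>>>   -b java_image_tag=8-jre-slim build
>>>>>>>>>>
>>>>>>>>>> but none of this is obvious from the current documentation, which is
>>>>>>>>>> exactly my point.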
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> HTH
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>    view my Linkedin profile
>>>>>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *Disclaimer:* Use it at your own risk. Any and all
>>>>>>>>>> responsibility for any loss, damage or destruction of data or any 
>>>>>>>>>> other
>>>>>>>>>> property which may arise from relying on this email's technical 
>>>>>>>>>> content is
>>>>>>>>>> explicitly disclaimed. The author will in no case be liable for any
>>>>>>>>>> monetary damages arising from such loss, damage or destruction.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, 13 Aug 2021 at 01:54, Holden Karau <hol...@pigscanfly.ca>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Awesome, I've filed an INFRA ticket to get the ball rolling.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 12, 2021 at 5:48 PM John Zhuge <jzh...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> +1
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 12, 2021 at 5:44 PM Hyukjin Kwon <gurwls...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> +1, I think we generally agreed upon having it. Thanks Holden for
>>>>>>>>>> the heads-up and for driving this.
>>>>>>>>>>
>>>>>>>>>> +@Dongjoon Hyun <dongj...@apache.org> FYI
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, 22 Jul 2021 at 12:22 PM, Kent Yao <yaooq...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> +1
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Bests,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *Kent Yao*
>>>>>>>>>>
>>>>>>>>>> @ Data Science Center, Hangzhou Research Institute, NetEase Corp.
>>>>>>>>>>
>>>>>>>>>> *a spark* *enthusiast*
>>>>>>>>>>
>>>>>>>>>> *kyuubi* <https://github.com/yaooqinn/kyuubi> *is a unified multi-tenant
>>>>>>>>>> JDBC interface for large-scale data processing and analytics, built on
>>>>>>>>>> top of* *Apache Spark* <http://spark.apache.org/>*.*
>>>>>>>>>> *spark-authorizer* <https://github.com/yaooqinn/spark-authorizer> *A
>>>>>>>>>> Spark SQL extension which provides SQL Standard Authorization for*
>>>>>>>>>> --
It's dark in this basement.
