Hi Wim,

This is an issue DevOps teams face all the time: hosts behind the company
firewall cannot access the internet. There is Nexus
<https://www.sonatype.com/nexus/repository-pro> for this, which manages
dependencies internally with usual load times in seconds. However, access
is restricted: only authorised accounts can request artefacts from it,
through a service account. I concur it is messy.
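
Where the data environment is allowed to reach the internal repository, a
rough sketch of pointing --packages at it could look like the following.
The Nexus URL and the function name submit_via_nexus are hypothetical
placeholders, and credentials for the service account are deliberately left
out; --repositories is the spark-submit option for additional remote
repositories to search for --packages coordinates.

```shell
# Hypothetical internal Nexus URL (an assumption, replace with your own).
NEXUS_REPO_URL=${NEXUS_REPO_URL:-https://nexus.example.internal/repository/maven-public/}

# Hypothetical wrapper: resolve the package through the internal repository.
# Extra spark-submit arguments (e.g. --class, the application jar) are
# appended via "$@".
submit_via_nexus() {
  "${SPARK_HOME}/bin/spark-submit" \
    --master yarn \
    --deploy-mode client \
    --repositories "$NEXUS_REPO_URL" \
    --packages com.github.samelamin:spark-bigquery_2.11:0.2.6 \
    "$@"
}
```

This only helps if the firewall rules permit the cluster to talk to Nexus,
which, as Wim points out, is not always the case.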

cheers,


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Wed, 21 Oct 2020 at 06:34, Wim Van Leuven <wim.vanleu...@highestpoint.biz>
wrote:

> Sean,
>
> The problem with --packages is that in enterprise settings security might
> not allow the data environment to link to the internet or even to the
> internal proxying artefact repository.
>
> Also, weren't uberjars an antipattern? For some reason I don't like them...
>
> Kind regards
> -wim
>
>
>
> On Wed, 21 Oct 2020 at 01:06, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Thanks again all.
>>
>> Anyway as Nicola suggested I used the trench war approach to sort this
>> out by just using jars and working out their dependencies in ~/.ivy2/jars
>> directory using grep -lRi <missing> :)
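
The grep search described above can be wrapped in a small helper. This is
only a sketch: the function name find_jars_mentioning is made up for
illustration, and it relies on the assumption that entry names inside a jar
(a zip archive) are stored uncompressed, so a plain-text search for a class
path can match them.

```shell
# Hypothetical helper: list jars under DIR whose contents mention CLASS.
# CLASS may use dots or slashes; dots are converted to slashes because jar
# entries are stored as paths such as com/example/Foo.class.
find_jars_mentioning() {
  dir=$1
  pattern=$(printf '%s' "$2" | tr . /)
  # -l: print file names only, -R: recurse, -i: ignore case,
  # -F: treat the pattern as a fixed string, --include: only look at jars.
  grep -lRiF --include='*.jar' -- "$pattern" "$dir" | sort
}
```

For example, find_jars_mentioning ~/.ivy2/jars com.google.api.client.http.HttpRequest
would print the jars that mention that class, using the class name from
the NoClassDefFoundError or ClassNotFoundException as the search term.
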
>>
>>
>> This now works with just using jars (new added ones in grey) after
>> resolving the dependencies
>>
>>
>> ${SPARK_HOME}/bin/spark-submit \
>>                 --master yarn \
>>                 --deploy-mode client \
>>                 --conf spark.executor.memoryOverhead=3000 \
>>                 --class org.apache.spark.repl.Main \
>>                 --name "my own Spark shell on Yarn" "$@" \
>>                 --driver-class-path /home/hduser/jars/ddhybrid.jar \
>>                 --jars /home/hduser/jars/spark-bigquery-latest.jar,\
>> /home/hduser/jars/ddhybrid.jar,\
>> /home/hduser/jars/com.google.http-client_google-http-client-1.24.1.jar,\
>> /home/hduser/jars/com.google.http-client_google-http-client-jackson2-1.24.1.jar,\
>> /home/hduser/jars/com.google.cloud.bigdataoss_util-1.9.4.jar,\
>> /home/hduser/jars/com.google.api-client_google-api-client-1.24.1.jar,\
>> /home/hduser/jars/com.google.oauth-client_google-oauth-client-1.24.1.jar,\
>> /home/hduser/jars/com.google.apis_google-api-services-bigquery-v2-rev398-1.24.1.jar,\
>> /home/hduser/jars/com.google.cloud.bigdataoss_bigquery-connector-0.13.4-hadoop2.jar,\
>> /home/hduser/jars/spark-bigquery_2.11-0.2.6.jar
>>
>>
>> Compared to using the package itself as before
>>
>>
>> ${SPARK_HOME}/bin/spark-submit \
>>                 --master yarn \
>>                 --deploy-mode client \
>>                 --conf spark.executor.memoryOverhead=3000 \
>>                 --class org.apache.spark.repl.Main \
>>                 --name "my own Spark shell on Yarn" "$@" \
>>                 --driver-class-path /home/hduser/jars/ddhybrid.jar \
>>                 --jars /home/hduser/jars/spark-bigquery-latest.jar,\
>> /home/hduser/jars/ddhybrid.jar \
>>                 --packages com.github.samelamin:spark-bigquery_2.11:0.2.6
>>
>>
>>
>> I think, as Sean suggested, this manual approach may or may not work,
>> and if the jars change the whole thing has to be re-evaluated, which adds
>> to the complexity.
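
One way to take some of the manual bookkeeping out of the --jars list is to
generate it from the directory. A minimal sketch, assuming the directory
holds only the jars that should be shipped (it does nothing about version
conflicts between them); the function name jars_csv is hypothetical.

```shell
# Hypothetical helper: print every *.jar directly under DIR as one
# comma-separated string with no spaces, which is the form --jars expects.
jars_csv() {
  dir=$1
  find "$dir" -maxdepth 1 -name '*.jar' | sort | paste -sd, -
}
```

It could then be used as --jars "$(jars_csv /home/hduser/jars)" in the
spark-submit call, so that adding or replacing a jar in the directory does
not require editing the command itself.
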
>>
>>
>> Cheers
>>
>>
>> On Tue, 20 Oct 2020 at 23:01, Sean Owen <sro...@gmail.com> wrote:
>>
>>> Rather, let --packages (via Ivy) worry about them, because they tell Ivy
>>> what they need.
>>> There's no 100% guarantee that conflicting dependencies are resolved in
>>> a way that works in every single case, which you run into sometimes when
>>> using incompatible libraries, but yes this is the point of --packages and
>>> Ivy.
>>>
>>> On Tue, Oct 20, 2020 at 4:43 PM Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>>> Thanks again all.
>>>>
>>>> Hi Sean,
>>>>
>>>> As I understood from your statement, you are suggesting just use
>>>> --packages without worrying about individual jar dependencies?
>>>>