Unsubscribe

2023-07-17 Thread Bode, Meikel
Unsubscribe

Unsubscribe

2023-07-16 Thread Bode, Meikel
Unsubscribe

RE: Does Spark 3.1.2/3.2 support log4j 2.17.1+, and how? your target release day for Spark3.3?

2022-01-21 Thread Bode, Meikel, NM-X-DS
el.b...@bertelsmann.de<mailto:meikel.b...@bertelsmann.de> arvato-systems.de<https://www.arvato-systems.de/> From: Juan Liu Sent: Thursday, 20 January 2022 09:44 To: Bode, Meikel, NM-X-DS Cc: sro...@gmail.com; Theodore J Griesenbrock ; user@spark.apache.org Subject: RE: Does Spark 3.1.2/3

RE: Does Spark 3.1.2/3.2 support log4j 2.17.1+, and how? your target release day for Spark3.3?

2022-01-19 Thread Bode, Meikel, NM-X-DS
Hi, New releases are announced via the mailing lists user@spark.apache.org & d...@spark.apache.org. Best, Meikel From: Theodore J Griesenbrock Sent: Wednesday, 19 January 2022 18:50 To: sro...@gmail.com Cc: Juan Liu ;

RE: Conda Python Env in K8S

2021-12-06 Thread Bode, Meikel, NMA-CFD
615 Best, Meikel From: Mich Talebzadeh Sent: Saturday, 4 December 2021 18:36 To: Bode, Meikel, NMA-CFD Cc: dev ; user@spark.apache.org Subject: Re: Conda Python Env in K8S Hi Meikel In the past I tried with --py-files hdfs://$HDFS_HOST:$HDFS_PORT/minikube/codes/DSBQ.

RE: Conda Python Env in K8S

2021-12-03 Thread Bode, Meikel, NMA-CFD
these options exist and I want to understand what the issue is... Any hints on that? Best, Meikel From: Mich Talebzadeh Sent: Friday, 3 December 2021 13:27 To: Bode, Meikel, NMA-CFD Cc: dev ; user@spark.apache.org Subject: Re: Conda Python Env in K8S Build python packages into the docker

Conda Python Env in K8S

2021-12-03 Thread Bode, Meikel, NMA-CFD
Hello, I am trying to run Spark jobs using the Spark Kubernetes Operator. But when I try to bundle a conda Python environment using the following resource description, the Python interpreter is only unpacked on the driver and not on the executors. apiVersion: "sparkoperator.k8s.io/v1beta2" kind:

RE: [issue] not able to add external libs to pyspark job while using spark-submit

2021-11-24 Thread Bode, Meikel, NMA-CFD
Can we add Python dependencies the way we can for mvn coordinates, so that we run something like pip install or download from the PyPI index? From: Mich Talebzadeh Sent: Wednesday, 24 November 2021 18:28 Cc: user@spark.apache.org Subject: Re: [issue] not able to add external libs to pyspark job while

RE: HiveThrift2 ACID Transactions?

2021-11-11 Thread Bode, Meikel, NMA-CFD
be very appreciated! Many thanks, Meikel Bode From: Bode, Meikel, NMA-CFD Sent: Wednesday, 10 November 2021 08:23 To: user ; dev Subject: HiveThrift2 ACID Transactions? Hi all, We want to apply INSERT, UPDATE, and DELETE operations on tables based on Parquet or ORC files served by thrift2

HiveThrift2 ACID Transactions?

2021-11-09 Thread Bode, Meikel, NMA-CFD
Hi all, We want to apply INSERT, UPDATE, and DELETE operations on tables based on Parquet or ORC files served by thrift2. It is unclear whether and where we can enable them. At the moment, UPDATE and DELETE operations are blocked when executed. Anyone out there who uses

3.1.2 Executor Initialization fails due to dep copy failure

2021-11-08 Thread Bode, Meikel, NMA-CFD
Hi all, I am trying to get Thrift2 on Spark 3.1.2 running on K8S with one executor for the moment. This works so far, but it fails on the executor side during initialization. The issue seems to be related to access restrictions on certain directories... But I am not sure. Please see errors marked in

RE: [ANNOUNCE] Apache Spark 3.2.0

2021-10-19 Thread Bode, Meikel, NMA-CFD
Many thanks! From: Gengliang Wang Sent: Tuesday, 19 October 2021 16:16 To: dev ; user Subject: [ANNOUNCE] Apache Spark 3.2.0 Hi all, Apache Spark 3.2.0 is the third release of the 3.x line. With tremendous contribution from the open-source community, this release managed to resolve in

RE: spark thrift server as hive on spark running on kubernetes, and more.

2021-09-10 Thread Bode, Meikel, NMA-CFD
Hi, thanks. Great work. Will test it. Best, Meikel Bode From: Kidong Lee Sent: Friday, 10 September 2021 01:39 To: user@spark.apache.org Subject: spark thrift server as hive on spark running on kubernetes, and more. Hi, Recently, I have open-sourced a tool called

RE: K8S submit client vs. cluster

2021-08-12 Thread Bode, Meikel, NMA-CFD
On EKS... From: Mich Talebzadeh Sent: Thursday, 12 August 2021 15:47 To: Bode, Meikel, NMA-CFD Cc: user@spark.apache.org Subject: Re: K8S submit client vs. cluster Ok As I see it with PySpark even if it is submitted as cluster, it will be converted to client mode anyway Are you running

RE: K8S submit client vs. cluster

2021-08-12 Thread Bode, Meikel, NMA-CFD
Hi Mich, All PySpark. Best, Meikel From: Mich Talebzadeh Sent: Thursday, 12 August 2021 13:41 To: Bode, Meikel, NMA-CFD Cc: user@spark.apache.org Subject: Re: K8S submit client vs. cluster Is this Spark or PySpark? [https://docs.google.com/uc?export=download=1

K8S submit client vs. cluster

2021-08-12 Thread Bode, Meikel, NMA-CFD
Hi all, If we schedule a Spark job on k8s, how are volume mappings handled? In client mode I would expect that driver volumes have to be mapped manually in the pod template. Executor volumes are attached dynamically based on submit parameters. Right...? In cluster mode I would expect that

Parquet Metadata

2021-06-23 Thread Bode, Meikel, NMA-CFD
Hi folks, Maybe this is not the right audience, but maybe you have come across such a requirement. Is it possible to define a Parquet schema that contains technical column names and a list of translations of a given column name into different languages? I give an example: Technical: "custnr" would

DF blank value fill

2021-05-21 Thread Bode, Meikel, NMA-CFD
Hi all, My df looks as follows:
Situation:
MainKey, SubKey, Val1, Val2, Val3, ...
1, 2, a, null, c
1, 2, null, null, c
1, 3, null, b, null
1, 3, a, null, c
Desired outcome:
1, 2, a, b, c
1, 2, a, b, c
1, 3, a, b, c
1, 3, a, b, c
How could I populate/synchronize empty cells of all records
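The fill described in this message can be sketched in plain Python to pin down the intended semantics. This is only an illustration, not Spark code and not a reply from the thread. It assumes that blanks are filled per MainKey (the desired outcome carries "b" from SubKey 3 over to SubKey 2) and that each column has at most one distinct non-null value per MainKey. The function name `fill_blanks` and its parameters are hypothetical.

```python
from collections import defaultdict

def fill_blanks(rows, key_index=0, value_start=2):
    """Fill None cells with the first non-null value found in the same
    column among rows that share the same key (assumed here: MainKey)."""
    # key -> {column index: first non-null value seen in that column}
    first_seen = defaultdict(dict)
    for row in rows:
        for col in range(value_start, len(row)):
            if row[col] is not None:
                first_seen[row[key_index]].setdefault(col, row[col])

    filled = []
    for row in rows:
        known = first_seen[row[key_index]]
        filled.append(tuple(
            # Only value columns are filled; key columns stay untouched.
            known.get(col, cell) if col >= value_start and cell is None else cell
            for col, cell in enumerate(row)
        ))
    return filled

# Data from the message, with null written as None.
rows = [
    (1, 2, "a", None, "c"),
    (1, 2, None, None, "c"),
    (1, 3, None, "b", None),
    (1, 3, "a", None, "c"),
]
result = fill_blanks(rows)
```

In PySpark, a comparable effect might be obtained with `first(col, ignorenulls=True)` over `Window.partitionBy("MainKey")`, again assuming that MainKey is the intended grouping key.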

RE: Thrift2 Server on Kubernetes?

2021-05-16 Thread Bode, Meikel, NMA-CFD
Hi Kidong Lee, Thank you for your email. I came across your blog and it seems to be very complete. Since you write that it's not easy to bring Spark Thrift2 to K8S, and because you had to write your own wrapper, I have the impression that it is not really officially supported, despite the

Thrift2 Server on Kubernetes?

2021-05-14 Thread Bode, Meikel, NMA-CFD
Hi all, We are migrating to k8s and I wonder whether there are already "good practices" for running thrift2 on k8s? Best, Meikel

Broadcast Variable

2021-05-03 Thread Bode, Meikel, NMA-CFD
Hi all, when broadcasting a large dict containing several million entries to executors, what exactly happens when calling bc_var.value within a UDF like: .. d = bc_var.value .. Does d receive a copy of the dict inside value, or is this handled like a pointer? Thanks, Meikel

RE: Recursive Queries or Recursive UDF?

2021-05-01 Thread Bode, Meikel, NMA-CFD
g: child, lvl-0-parent inquiry1, null inquiry2, null order3, null That is actually what I implemented with the recursive UDF I put into the initial post. Thank you for any hints on that issue! Any hints on the UDF solution are also very welcome. Thanks and best, Meikel From: Bode, Meikel, NMA-CFD

Recursive Queries or Recursive UDF?

2021-04-30 Thread Bode, Meikel, NMA-CFD
Hi all, I implemented a recursive UDF that tries to find a document number in a long list of predecessor documents. This can be a multi-level hierarchy: C is the successor of B, which is the successor of A (but many more levels are possible). As input to that UDF I prepare a dict that contains the complete
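The predecessor lookup described in this message can be sketched as a plain Python function. This is a hedged illustration, not the author's UDF: it assumes the dict maps each document to its direct predecessor, with None (or a missing key) marking the top of the chain. The names `find_root` and `predecessor_of` are hypothetical. The walk is written iteratively so that deep hierarchies do not hit Python's recursion limit.

```python
def find_root(doc, predecessor_of):
    """Follow the predecessor chain from doc up to its top-level document."""
    seen = {doc}
    current = doc
    while predecessor_of.get(current) is not None:
        current = predecessor_of[current]
        if current in seen:
            # Guard against malformed input that loops back on itself.
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
    return current

# Hierarchy from the message: C is the successor of B, B the successor of A.
predecessor_of = {"C": "B", "B": "A", "A": None}
root_of_c = find_root("C", predecessor_of)
root_of_a = find_root("A", predecessor_of)
```

A function of this shape could be wrapped in a UDF that reads the dict from a broadcast variable, but whether that matches the original implementation is not known from the snippet.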

AW: [ANNOUNCE] Announcing Apache Spark 3.1.1

2021-03-02 Thread Bode, Meikel, NMA-CFD
Congrats! From: Hyukjin Kwon Sent: Wednesday, 3 March 2021 02:41 To: user @spark ; dev Subject: [ANNOUNCE] Announcing Apache Spark 3.1.1 We are excited to announce Spark 3.1.1 today. Apache Spark 3.1.1 is the second release of the 3.x line. This release adds Python type annotations and

AW: Issue after change to 3.0.2

2021-02-26 Thread Bode, Meikel, NMA-CFD
Hi Sean. You are right. We are using Docker images for our Spark cluster. The build of the worker image did not succeed, and therefore the old 3.0.1 image was still in use. Thanks, Best, Meikel From: Sean Owen Sent: Friday, 26 February 2021 10:29 To: Bode, Meikel, NMA-CFD Cc: user

Issue after change to 3.0.2

2021-02-26 Thread Bode, Meikel, NMA-CFD
Hi All, After changing to 3.0.2 I face the following issue. Thanks for any hint on that issue. Best, Meikel df = self.spark.read.json(path_in) File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 300, in json File

Strange behavior with "bigger" JSON file

2021-01-28 Thread Bode, Meikel, NMA-CFD
Hi all, I process a lot of JSON files of different sizes. All files share the same overall structure. I have no issue with files of sizes around 150-300MB. Another file of around 530MB now causes errors when I apply selectExpr on the resulting DF after reading the file. AnalysisException: