Re: Spark 3.2.4 pom NOT FOUND on maven

2023-04-17 Thread Enrico Minack
Any suggestions on how to fix or use the Spark 3.2.4 (Scala 2.13) release? Cheers, Enrico On 17.04.23 at 08:19, Enrico Minack wrote: Hi, thanks for the Spark 3.2.4 release. I have found that Maven does not serve the spark-parent_2.13 pom file. It is listed in the directory:

Re: Spark Multiple Hive Metastore Catalog Support

2023-04-17 Thread Ankit Gupta
Thanks, Elliot! Let me check it out! On Mon, 17 Apr 2023, 10:08 pm Elliot West wrote: > Hi Ankit, > > While not a part of Spark, there is a project called 'WaggleDance' that > can federate multiple Hive metastores so that they are accessible via a > single URI:

Re: Parametrisable output metadata path

2023-04-17 Thread Jungtaek Lim
Small correction: "I intentionally didn't enumerate." The meaning could be quite different, so I am making a small correction. On Tue, Apr 18, 2023 at 5:38 AM Jungtaek Lim wrote: > There seems to be miscommunication - I didn't mean "Delta Lake". I meant > "any" Data Lake products. Since I'm biased I

Re: Parametrisable output metadata path

2023-04-17 Thread Jungtaek Lim
There seems to be miscommunication - I didn't mean "Delta Lake". I meant "any" Data Lake products. Since I'm biased I didn't intentionally enumerate actual products, but there are "Apache Hudi", "Apache Iceberg", etc. as well. We have already made a non-trivial number of band-aid fixes for file stream

Re: [ANNOUNCE] Apache Spark 3.4.0 released

2023-04-17 Thread Xinrong Meng
Thank you, Dongjoon! On Sat, Apr 15, 2023 at 9:04 AM Dongjoon Hyun wrote: > Nice catch, Xiao! > > All `latest` tags are updated to v3.4.0 now. > > https://hub.docker.com/r/apache/spark/tags > https://hub.docker.com/r/apache/spark-py/tags > https://hub.docker.com/r/apache/spark-r/tags > >

Re: Spark Multiple Hive Metastore Catalog Support

2023-04-17 Thread Cheng Pan
There is a DSv2-based Hive connector in Apache Kyuubi[1] that supports connecting to multiple HMS instances in a single Spark application. Some limitations: it currently only supports Spark 3.3, and it has a known issue when used with `spark-sql`, but works with spark-shell and normal jar-based Spark applications. [1]
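As a rough illustration of the multi-HMS setup described above, the sketch below builds one Spark catalog entry per metastore. It rests on assumptions: the catalog class name `org.apache.kyuubi.spark.connector.hive.HiveTableCatalog` and the `hive.metastore.uris` catalog option reflect my understanding of the Kyuubi connector and should be checked against its documentation, and the helper `hive_catalog_conf`, the catalog names, and the thrift URIs are hypothetical.

```python
from typing import Dict

# Assumption: class name of the Kyuubi Hive connector catalog; verify against
# the Kyuubi documentation for the version you use.
KYUUBI_HIVE_CATALOG = "org.apache.kyuubi.spark.connector.hive.HiveTableCatalog"


def hive_catalog_conf(
    catalogs: Dict[str, str],
    catalog_class: str = KYUUBI_HIVE_CATALOG,
) -> Dict[str, str]:
    """Build Spark config entries, one DSv2 catalog per Hive metastore.

    `catalogs` maps a (hypothetical) catalog name to its metastore thrift URI.
    """
    conf: Dict[str, str] = {}
    for name, uri in catalogs.items():
        prefix = f"spark.sql.catalog.{name}"
        # Register the catalog implementation under spark.sql.catalog.<name>.
        conf[prefix] = catalog_class
        # Catalog-scoped option pointing at this catalog's metastore (assumed key).
        conf[f"{prefix}.hive.metastore.uris"] = uri
    return conf


if __name__ == "__main__":
    # Hypothetical metastore endpoints, for illustration only.
    example = hive_catalog_conf(
        {"hms_a": "thrift://hms-a.example:9083", "hms_b": "thrift://hms-b.example:9083"}
    )
    for key, value in sorted(example.items()):
        print(f"{key}={value}")
```

Each resulting key/value pair could then be passed to `SparkSession.builder.config(key, value)` or written to `spark-defaults.conf`, after which tables would be addressed as `hms_a.db.table` and `hms_b.db.table`.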

Re: Spark Multiple Hive Metastore Catalog Support

2023-04-17 Thread Elliot West
Hi Ankit, While not a part of Spark, there is a project called 'WaggleDance' that can federate multiple Hive metastores so that they are accessible via a single URI: https://github.com/ExpediaGroup/waggle-dance This may be useful or perhaps serve as inspiration. Thanks, Elliot. On Mon, 17 Apr

Re: Spark Multiple Hive Metastore Catalog Support

2023-04-17 Thread Ankit Gupta
++ User Mailing List. Just a reminder, in case anyone can help with this. Thanks a lot! Ankit Prakash Gupta On Wed, Apr 12, 2023 at 8:22 AM Ankit Gupta wrote: > Hi All > > The question is regarding the support of multiple Remote Hive Metastore > catalogs with Spark. Starting Spark 3, multiple

Spark 3.2.4 pom NOT FOUND on maven

2023-04-17 Thread Enrico Minack
Hi, thanks for the Spark 3.2.4 release. I have found that Maven does not serve the spark-parent_2.13 pom file. It is listed in the directory: https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.13/3.2.4/ But it cannot be downloaded:
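A minimal sketch of how one might check whether a pom is served, assuming the standard Maven repository layout (`group/path/artifact/version/artifact-version.pom`). The helper names `maven_pom_url` and `pom_is_served` are hypothetical, and the URL built here is derived from that layout convention, not copied from the message.

```python
import urllib.error
import urllib.request

MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


def maven_pom_url(group_id: str, artifact_id: str, version: str,
                  base: str = MAVEN_CENTRAL) -> str:
    """Build the pom URL under the standard Maven repository layout."""
    group_path = group_id.replace(".", "/")
    return (f"{base.rstrip('/')}/{group_path}/{artifact_id}/{version}/"
            f"{artifact_id}-{version}.pom")


def pom_is_served(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request and report whether the server answers 200.

    HTTP error statuses (such as 404) return False; network-level failures
    (urllib.error.URLError) are left to propagate to the caller.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False


if __name__ == "__main__":
    url = maven_pom_url("org.apache.spark", "spark-parent_2.13", "3.2.4")
    print(url, "served" if pom_is_served(url) else "not served")
```

A HEAD request avoids downloading the file; a directory listing showing the pom while this check fails would match the symptom reported above.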

Re: Parametrisable output metadata path

2023-04-17 Thread Wojciech Indyk
Hi Jungtaek, integration with Delta Lake is not an option for me. I raised a PR to improve FileStreamSink with the new parameter: https://github.com/apache/spark/pull/40821. Can you please take a look? -- Kind regards, Wojciech Indyk On Sun, 16 Apr 2023 at 04:45, Jungtaek Lim

The Spark email setting should be updated

2023-04-17 Thread Jia Fan
Hi, everyone. I find that every time I reply to the dev mailing list, the default reply address is the sender of the mail, not dev@spark.apache.org. Several times this led me to believe that my reply to dev had gone through when it had not. This should not be a common problem, because