Off the record: this really irritates me too, as it forces me to do local builds even though I shouldn't have to. Sometimes I do that for other reasons, but still.
Getting the cloud-storage module in was hard enough at the time that I
wasn't going to push harder; I essentially stopped trying to get one into
Spark after that, having effectively been told to go and play in my own
fork (*).
https://github.com/apache/spark/pull/12004#issuecomment-259020494

Given that effort almost failed, to then say "now include the artifact and
releases" wasn't something I was going to do; I had everything I needed for
my own build, and trying to add new PRs struck me as an exercise in
confrontation and futility.

Sean, if I do submit a PR which makes hadoop-cloud default on the right
versions, but strips out the dependencies from the final tarball, would
that get some attention?

(*) Sean, of course, was a notable exception and very supportive.

On Wed, 2 Jun 2021 at 00:56, Stephen Coy <s...@infomedia.com.au> wrote:

> I have been building Apache Spark from source just so I can get this
> dependency.
>
> 1. git checkout v3.1.1
> 2. dev/make-distribution.sh --name hadoop-cloud-3.2 --tgz -Pyarn
>    -Phadoop-3.2 -Phadoop-cloud -Phive-thriftserver -Dhadoop.version=3.2.0
>
> It is kind of a nuisance having to do this though.
>
> Steve C
>
>
> On 31 May 2021, at 10:34 pm, Sean Owen <sro...@gmail.com> wrote:
>
> I know it's not enabled by default when the binary artifacts are built,
> but I'm not exactly sure why it's not built separately at all. It's
> almost a dependencies-only pom artifact, but there are two source files.
> Steve, do you have an angle on that?
>
> On Mon, May 31, 2021 at 5:37 AM Erik Torres <etserr...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm following this documentation
>> <https://spark.apache.org/docs/latest/cloud-integration.html#installation>
>> to configure my Spark-based application to interact with Amazon S3.
>> However, I cannot find the spark-hadoop-cloud module in Maven Central
>> for the non-commercial distribution of Apache Spark. From the
>> documentation I would expect that I can get this module as a Maven
>> dependency in my project. However, I ended up building the
>> spark-hadoop-cloud module from Spark's code
>> <https://github.com/apache/spark>.
>>
>> Is this the expected way to set up the integration with Amazon S3? I
>> think I'm missing something here.
>>
>> Thanks in advance!
>>
>> Erik
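For context on what Erik was expecting: if the spark-hadoop-cloud module
were actually published, consuming it would presumably be a single pom.xml
entry along these lines. The coordinates below are an assumption based on
the module name and the usual Scala-version suffix convention; no such
artifact was on Maven Central for 3.1.1 at the time of this thread.

    <!-- Assumed coordinates; spark-hadoop-cloud was not published for 3.1.1 -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hadoop-cloud_2.12</artifactId>
      <version>3.1.1</version>
    </dependency>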