I'm afraid that will keep people from contributing to this test suite,
as they would need to download multiple Spark versions to create the
testing tables...
On Sat, Sep 16, 2017 at 4:48 AM, Shixiong(Ryan) Zhu wrote:
Can we just create those tables once locally using official Spark versions
and commit them? Then the unit tests can just read these files and won't
need to download Spark.
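To make the idea above concrete, here is a minimal Python sketch of what the "commit pre-generated tables" approach could look like: table data is generated once with official Spark releases, checked into the repo, and the test simply enumerates and reads those fixtures. The directory layout and helper name are hypothetical, not from the actual suite.

```python
# Hypothetical sketch: enumerate per-version table fixtures committed to the
# repo (e.g. spark-2.1.1/, spark-2.2.0/), so the test reads local files
# instead of downloading Spark. The layout is illustrative only.
import os

def committed_table_dirs(fixture_root):
    """Return the per-version fixture directories checked into the repo."""
    return sorted(
        os.path.join(fixture_root, name)
        for name in os.listdir(fixture_root)
        if name.startswith("spark-")
    )
```

A test would then loop over `committed_table_dirs(...)` and read each table with the current Spark build, with no network access required.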
On Thu, Sep 14, 2017 at 8:13 AM, Sean Owen wrote:
I think the download could use the Apache mirror, yeah. I don't know if
there's a reason that it must, though. What's good enough for releases is
good enough for this purpose. People might not like the big download in the
tests; if it really came up as an issue, we could find ways to cache it
better.
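If the download size did become a problem, one simple way to cache it is to keep the tarball on disk between test runs. This is only an illustrative sketch; the URL pattern, paths, and function name are assumptions, not what the suite actually does.

```python
# Hypothetical sketch: cache the downloaded Spark tarball locally so the big
# download happens at most once per machine. URL and paths are illustrative.
import os
import urllib.request

def fetch_spark(version, cache_dir="/tmp/spark-cache"):
    tarball = f"spark-{version}-bin-hadoop2.7.tgz"
    dest = os.path.join(cache_dir, tarball)
    if os.path.exists(dest):  # cache hit: skip the network entirely
        return dest
    os.makedirs(cache_dir, exist_ok=True)
    url = f"https://archive.apache.org/dist/spark/spark-{version}/{tarball}"
    urllib.request.urlretrieve(url, dest)  # cache miss: download once
    return dest
```

On a cache hit the function returns immediately without touching the network, which is the whole point for a test that otherwise re-downloads on every run.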
The problem is that it's not really an "official" download link, but rather
just a supplemental convenience. While that may be OK when distributing
artifacts, it's more of a problem when actually building and testing
artifacts. In the latter case, the download should really come only from an
Apache mirror.
That test case is trying to test the backward compatibility of
`HiveExternalCatalog`. It downloads official Spark releases, creates
tables with them, and then reads those tables with the current Spark.
About the download link, I just picked it from the Spark website, and this
link is the default.
Mark, I agree with your point on the risks of using Cloudfront while
building Spark. I was only trying to provide background on when we
started using Cloudfront.
Personally, I don't have enough context about the test case in
question (e.g., why are we downloading Spark in a test case?).
Yeah, but that discussion and use case is a bit different -- providing a
different route to download the final released and approved artifacts that
were built using only acceptable artifacts and sources vs. building and
checking prior to release using something that is not from an Apache
mirror.
Ah right yeah I know it's an S3 bucket. Thanks for the context. Although I
imagine the reasons it was set up no longer apply so much (you can get a
direct mirror download link), and so it would probably be possible to
retire this, there's also no big rush to. I wasn't clear from the thread
whether
The bucket comes from Cloudfront, a CDN that's part of AWS. There was a
bunch of discussion about this back in 2013:
https://lists.apache.org/thread.html/9a72ff7ce913dd85a6b112b1b2de536dcda74b28b050f70646aba0ac@1380147885@%3Cdev.spark.apache.org%3E
Shivaram
On Wed, Sep 13, 2017 at 9:30 AM, Sean
Not a big deal, but Mark noticed that this test now downloads Spark
artifacts from the same 'direct download' link available on the downloads
page:
https://github.com/apache/spark/blob/master/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala#L53