Just downloaded it on my local MacBook and tried to create a table using the
pre-built PySpark. It looks like the conf "spark.sql.warehouse.dir"
does not take effect: Spark is trying to create the table directory under
"file:/user/hive/warehouse/t1". I have not done any investigation yet. Has
anyone hit the same issue?

C02XT0U7JGH5:bin lixiao$ ./pyspark --conf spark.sql.warehouse.dir="/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6"
Python 2.7.16 (default, Jan 27 2020, 04:46:15)
[GCC 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
20/06/03 09:56:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.4.6
      /_/

Using Python version 2.7.16 (default, Jan 27 2020 04:46:15)
SparkSession available as 'spark'.
>>> spark.sql("set spark.sql.warehouse.dir").show(truncate=False)
+-----------------------+-------------------------------------------------+
|key                    |value                                            |
+-----------------------+-------------------------------------------------+
|spark.sql.warehouse.dir|/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6|
+-----------------------+-------------------------------------------------+

>>> spark.sql("create table t1 (col1 int)")
20/06/03 09:56:29 WARN HiveMetaStore: Location: file:/user/hive/warehouse/t1 specified for non-external table:t1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6/python/pyspark/sql/session.py", line 767, in sql
    return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
  File "/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6/python/pyspark/sql/utils.py", line 69, in deco
    raise AnalysisException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.AnalysisException: u'org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:file:/user/hive/warehouse/t1 is not a directory or unable to create one);'
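
One workaround worth trying (a guess on my part, not a confirmed diagnosis): setting the warehouse location in conf/spark-defaults.conf as an absolute file: URI, since the default database location may already have been captured by a local metastore created on an earlier run. A sketch, with a hypothetical path:

```
# conf/spark-defaults.conf -- hedged sketch; the path below is hypothetical
spark.sql.warehouse.dir    file:///Users/lixiao/Downloads/spark-warehouse
```

If a metastore_db directory exists from a previous run, it may also need to be removed before the new location takes effect.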

Dongjoon Hyun <dongjoon.h...@gmail.com> wrote on Wed, Jun 3, 2020 at 9:18 AM:

> +1
>
> Bests,
> Dongjoon
>
> On Wed, Jun 3, 2020 at 5:59 AM Tom Graves <tgraves...@yahoo.com.invalid>
> wrote:
>
>>  +1
>>
>> Tom
>>
>> On Sunday, May 31, 2020, 06:47:09 PM CDT, Holden Karau <
>> hol...@pigscanfly.ca> wrote:
>>
>>
>> Please vote on releasing the following candidate as Apache Spark
>> version 2.4.6.
>>
>> The vote is open until June 5th at 9AM PST and passes if a majority +1
>> PMC votes are cast, with a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 2.4.6
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> There are currently no issues targeting 2.4.6 (try project = SPARK AND
>> "Target Version/s" = "2.4.6" AND status in (Open, Reopened, "In Progress"))
>>
>> The tag to be voted on is v2.4.6-rc8 (commit
>> 807e0a484d1de767d1f02bd8a622da6450bdf940):
>> https://github.com/apache/spark/tree/v2.4.6-rc8
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v2.4.6-rc8-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1349/
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v2.4.6-rc8-docs/
>>
>> The list of bug fixes going into 2.4.6 can be found at the following URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12346781
>>
>> This release is using the release script of the tag v2.4.6-rc8.
>>
>> FAQ
>>
>> =========================
>> What happened to the other RCs?
>> =========================
>>
>> The parallel maven build caused some flakiness, so I wasn't comfortable
>> releasing them. I backported the fix from the 3.0 branch for this release.
>> I've also got a proposed change to the build script so that, in the future,
>> we only push tags once the build succeeds, but it does not block this
>> release.
>>
>> =========================
>> How can I help test this release?
>> =========================
>>
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload, running it on this release candidate, and
>> reporting any regressions.
>>
>> If you're working in PySpark, you can set up a virtual env, install
>> the current RC, and see if anything important breaks; in Java/Scala,
>> you can add the staging repository to your project's resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with an out-of-date RC going forward).
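
For the Java/Scala path, a minimal sketch of what adding the staging repository to an sbt build might look like (the resolver name and spark-sql artifact choice are my assumptions; the staging URL is the one from the vote email):

```
// build.sbt -- hedged sketch for testing the RC from the staging repository
resolvers += "Spark 2.4.6 RC8 staging" at
  "https://repository.apache.org/content/repositories/orgapachespark-1349/"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.6"
```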
>>
>> ===========================================
>> What should happen to JIRA tickets still targeting 2.4.6?
>> ===========================================
>>
>> The current list of open tickets targeted at 2.4.6 can be found at:
>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>> Version/s" = 2.4.6
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should
>> be worked on immediately. Everything else please retarget to an
>> appropriate release.
>>
>> ==================
>> But my bug isn't fixed?
>> ==================
>>
>> In order to make timely releases, we will typically not hold the
>> release unless the bug in question is a regression from the previous
>> release. That being said, if there is something which is a regression
>> that has not been correctly targeted please ping me or a committer to
>> help target the issue.
>>
>>
>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>
