I'm new to Spark and would like to seek some advice on how to approach a
problem.
I have a large dataset of dated observations. Some columns are running sums
of other columns.
| date | thing | foo | bar | foo_sum | bar_sum |
+======+=======+=====+=====+=========+=========+
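
To make the column layout above concrete, here is a minimal plain-Python sketch of what foo_sum and bar_sum would hold. It assumes (the original message does not say) that the sums accumulate per thing, in date order; the function name add_running_sums and the sample rows are hypothetical and only for illustration.

```python
from collections import defaultdict

def add_running_sums(rows, value_cols=("foo", "bar")):
    """Return rows sorted by (thing, date) with <col>_sum columns added.

    Assumption: each running sum accumulates per `thing`, in date order.
    """
    # One running total per (thing, column) pair, starting at zero.
    totals = defaultdict(lambda: dict.fromkeys(value_cols, 0))
    out = []
    for row in sorted(rows, key=lambda r: (r["thing"], r["date"])):
        new_row = dict(row)  # copy so the input rows are left untouched
        for col in value_cols:
            totals[row["thing"]][col] += row[col]
            new_row[f"{col}_sum"] = totals[row["thing"]][col]
        out.append(new_row)
    return out

# Hypothetical sample data, deliberately out of date order.
rows = [
    {"date": "2022-10-02", "thing": "a", "foo": 2, "bar": 5},
    {"date": "2022-10-01", "thing": "a", "foo": 1, "bar": 10},
    {"date": "2022-10-01", "thing": "b", "foo": 7, "bar": 0},
]
result = add_running_sums(rows)
for r in result:
    print(r)
```

In Spark itself, the same per-group running total would typically be expressed as a window aggregate, for example sum("foo") over a window partitioned by thing and ordered by date, rather than a Python loop.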
I have faced the same problem, where both Hive and Spark were writing ORC
with Snappy compression.
Hive 2.1
Spark 2.4.8
I'm curious to learn what could be the root cause of this.
-S
On Tue, Oct 11, 2022, 2:18 AM Chartist <13289341...@163.com> wrote:
>
> Hi, all
>
> I encountered a problem as the e-mai
See the pom.xml file
https://github.com/apache/spark/blob/master/pom.xml#L3590
2.13.8 at the moment; IIRC there was some Scala issue that prevented
updating to 2.13.9. Search issues/PRs.
On Tue, Oct 11, 2022 at 6:11 PM Henrik Park wrote:
> scala 2.13.9 was released. do you know which spark versi
Scala 2.13.9 was released. Do you know which Spark version will have it
built in?
Thanks
Sean Owen wrote:
I would imagine that Scala 2.12 support goes away, and Scala 3 support
is added, for maybe Spark 4.0, and maybe that happens in a year or so.
For Spark, the issue is maintaining simultaneous support for multiple Scala
versions, which have historically been mutually incompatible across minor
versions.
Until Scala 2.12 support is reasonable to remove, it's hard to also support
Scala 3, as it would mean maintaining three versions of code.
I
No one knows for sure except Apache, but I’d learn Scala 2 if I were you. Even
if Spark one day migrates to Scala 3 (which is not a given), it’ll take a while
for the industry to adjust. Even the move from Spark 2 to Spark 3 (Scala 2.11
to Scala 2.12) is taking a while. I don’t think your knowledge
Hi, all
I encountered the problem described in the email subject. The details are as
follows:
SQL:
insert overwrite table mytable partition(pt='20220518')
select guid, user_new_id, sum_credit_score, sum_credit_score_change,
platform_credit_score_change, bike_credit_score_change,
e