Re: Do we need to finally update Guava?

2019-12-16 Thread Sean Owen
PS you are correct; with Guava 27 and my recent changes, and Hadoop
3.2.1 + Hive 2.3, I still see ...

*** RUN ABORTED ***
  java.lang.IllegalAccessError: tried to access method
com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator;
from class org.apache.hadoop.hive.ql.exec.FetchOperator
  at org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:108)
  at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87)
...
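
For context, that failing call is the usual Guava binary-compatibility
trap: Iterators.emptyIterator() was public through Guava 19, deprecated
in 20, and no longer public after that, so Hive bytecode compiled
against the old signature trips IllegalAccessError at runtime. A
minimal Scala sketch of drop-in alternatives that should behave the
same across Guava 11-27 (names here are illustrative, not from any
patch):

  import java.util.Collections
  import com.google.common.collect.{ImmutableList, UnmodifiableIterator}

  object EmptyIterators {
    // What Hive effectively calls; fine against Guava <= 19, but the
    // method lost public visibility later, so pre-compiled bytecode
    // fails with IllegalAccessError on Guava 27:
    //   val it = Iterators.emptyIterator[String]()

    // JDK replacement, no Guava types at all (Java 7+):
    val viaJdk: java.util.Iterator[String] =
      Collections.emptyIterator[String]()

    // Guava replacement with the same UnmodifiableIterator return
    // type, present and public across Guava 11-27:
    val viaGuava: UnmodifiableIterator[String] =
      ImmutableList.of[String]().iterator()
  }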

So, hm, we can make Spark work with old and new Guava (see PR) but
this seems like it will cause a problem updating to / running on
Hadoop 3.2.1+, regardless.

On Mon, Dec 16, 2019 at 11:36 AM Marcelo Vanzin  wrote:
>
> Great that Hadoop has done it (which, btw, probably means that Spark
> won't work with that version of Hadoop yet), but Hive also depends on
> Guava, and last time I tried, even Hive 3.x did not work with Guava
> 27.
>
> (Newer Hadoop versions also have a new artifact that shades a lot of
> dependencies, which would be great for Spark. But since Spark uses
> some test artifacts from Hadoop, that may be a bit tricky, since I
> don't believe those are shaded.)
>
> On Sun, Dec 15, 2019 at 8:08 AM Sean Owen  wrote:
> >
> > See for example:
> >
> > https://github.com/apache/spark/pull/25932#issuecomment-565822573
> > https://issues.apache.org/jira/browse/SPARK-23897
> >
> > This is a dicey dependency that we have been reluctant to update as a)
> > Hadoop used an old version and b) Guava versions are incompatible
> > after a few releases.
> >
> > But Hadoop is going all the way from 11 to 27 in Hadoop 3.2.1. Time to
> > match that? I haven't assessed how much internal change it requires.
> > If it's a lot, well, that makes it hard, as we need to stay compatible
> > with Hadoop 2 / Guava 11-14. But then that causes a problem updating
> > past Hadoop 3.2.0.
> >
>
>
> --
> Marcelo




Re: Do we need to finally update Guava?

2019-12-16 Thread Sean Owen
Yeah that won't be the last problem I bet. Here's a proposal for just
directly reducing exposure to Guava in Spark itself though:
https://github.com/apache/spark/pull/26911
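
To give a flavor of what "reducing exposure" means in practice -- this
is not the PR's diff, just a hypothetical before/after in Scala --
most Guava calls in Spark have JDK or Scala equivalents by now:

  import java.nio.charset.StandardCharsets
  import java.util.Objects

  object GuavaFree {  // made-up name, not Spark code
    def demo(n: Int, a: String, b: String): Unit = {
      // Guava: Preconditions.checkArgument(n >= 0, "negative n")
      require(n >= 0, "negative n")

      // Guava: com.google.common.base.Objects.equal(a, b)
      val same = Objects.equals(a, b)  // java.util.Objects, Java 7+

      // Guava: Charsets.UTF_8
      val bytes = a.getBytes(StandardCharsets.UTF_8)  // Java 7+

      println(s"equal=$same, utf8Bytes=${bytes.length}")
    }
  }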

On Mon, Dec 16, 2019 at 11:36 AM Marcelo Vanzin  wrote:
>
> Great that Hadoop has done it (which, btw, probably means that Spark
> won't work with that version of Hadoop yet), but Hive also depends on
> Guava, and last time I tried, even Hive 3.x did not work with Guava
> 27.
>
> (Newer Hadoop versions also have a new artifact that shades a lot of
> dependencies, which would be great for Spark. But since Spark uses
> some test artifacts from Hadoop, that may be a bit tricky, since I
> don't believe those are shaded.)
>
> On Sun, Dec 15, 2019 at 8:08 AM Sean Owen  wrote:
> >
> > See for example:
> >
> > https://github.com/apache/spark/pull/25932#issuecomment-565822573
> > https://issues.apache.org/jira/browse/SPARK-23897
> >
> > This is a dicey dependency that we have been reluctant to update as a)
> > Hadoop used an old version and b) Guava versions are incompatible
> > after a few releases.
> >
> > But Hadoop is going all the way from 11 to 27 in Hadoop 3.2.1. Time to
> > match that? I haven't assessed how much internal change it requires.
> > If it's a lot, well, that makes it hard, as we need to stay compatible
> > with Hadoop 2 / Guava 11-14. But then that causes a problem updating
> > past Hadoop 3.2.0.
> >
>
>
> --
> Marcelo




Re: Do we need to finally update Guava?

2019-12-16 Thread Marcelo Vanzin
Great that Hadoop has done it (which, btw, probably means that Spark
won't work with that version of Hadoop yet), but Hive also depends on
Guava, and last time I tried, even Hive 3.x did not work with Guava
27.

(Newer Hadoop versions also have a new artifact that shades a lot of
dependencies, which would be great for Spark. But since Spark uses
some test artifacts from Hadoop, that may be a bit tricky, since I
don't believe those are shaded.)
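
If I remember right, the shaded artifacts are hadoop-client-api and
hadoop-client-runtime, which relocate third-party classes (Guava
included) under org.apache.hadoop.shaded. A sketch of what depending
on them might look like from sbt -- a sketch only, with assumed
versions, not a tested build change:

  // build.sbt fragment
  val hadoopVersion = "3.2.1"
  libraryDependencies ++= Seq(
    // Compile against the shaded API; its Guava lives under
    // org.apache.hadoop.shaded, so it can't clash with ours.
    "org.apache.hadoop" % "hadoop-client-api"     % hadoopVersion,
    // Shaded implementation classes, needed at runtime only.
    "org.apache.hadoop" % "hadoop-client-runtime" % hadoopVersion % Runtime,
    // The catch mentioned above: test helpers like hadoop-minicluster
    // are not shaded.
    "org.apache.hadoop" % "hadoop-minicluster"    % hadoopVersion % Test
  )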

On Sun, Dec 15, 2019 at 8:08 AM Sean Owen  wrote:
>
> See for example:
>
> https://github.com/apache/spark/pull/25932#issuecomment-565822573
> https://issues.apache.org/jira/browse/SPARK-23897
>
> This is a dicey dependency that we have been reluctant to update as a)
> Hadoop used an old version and b) Guava versions are incompatible
> after a few releases.
>
> But Hadoop is going all the way from 11 to 27 in Hadoop 3.2.1. Time to
> match that? I haven't assessed how much internal change it requires.
> If it's a lot, well, that makes it hard, as we need to stay compatible
> with Hadoop 2 / Guava 11-14. But then that causes a problem updating
> past Hadoop 3.2.0.
>


-- 
Marcelo




Do we need to finally update Guava?

2019-12-15 Thread Sean Owen
See for example:

https://github.com/apache/spark/pull/25932#issuecomment-565822573
https://issues.apache.org/jira/browse/SPARK-23897

This is a dicey dependency that we have been reluctant to update as a)
Hadoop used an old version and b) Guava versions are incompatible
after a few releases.

But Hadoop is going all the way from 11 to 27 in Hadoop 3.2.1. Time to
match that? I haven't assessed how much internal change it requires.
If it's a lot, well, that makes it hard, as we need to stay compatible
with Hadoop 2 / Guava 11-14. But then that causes a problem updating
past Hadoop 3.2.0.
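
One standard escape hatch for b), for completeness: relocate Guava
into a private namespace at build time, so the bundled copy can never
clash with whatever Hadoop or Hive bring. Spark's Maven build already
does a version of this with the shade plugin; an sbt-assembly
equivalent as a sketch (the target package name is made up):

  // build.sbt, with the sbt-assembly plugin enabled
  assemblyShadeRules in assembly := Seq(
    // Rewrite com.google.common.* into a private package inside the
    // fat jar; references in our own classes are rewritten to match.
    ShadeRule.rename(
      "com.google.common.**" -> "org.myproject.shaded.guava.@1").inAll
  )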
