> the idea is to fix the issues they bump into - because people who load the jdbc driver may also see those issues.
I don’t get what you mean here, could you elaborate a bit more? IMO it's a bit
premature to do this without a working hive-exec jar for downstream projects
like Spark/Trino/Presto. At the current state there is no way to upgrade these
projects to use the fat hive-exec jar.

On Wed, Nov 17, 2021 at 5:47 AM Zoltan Haindrich <k...@rxd.hu> wrote:

> Hey all,
>
> I wanted to get back to this - but had other things going on.
>
> Chao> it is still being used today by some other popular projects
> the idea is to fix the issues they bump into - because people who load the
> jdbc driver may also see those issues.
>
> Edward> [...] You all must like enjoy shading jars.
> I totally agree that they may use a shell action as well.
> I wonder how do you propose to solve issues related to clients using a
> different version of the guava library?
>
> The changes which will remove the core artifact stuff is ready:
> https://github.com/apache/hive/pull/2648
>
> cheers,
> Zoltan
>
> On 9/21/21 8:23 PM, Edward Capriolo wrote:
> > recommendation from the Hive team is to use the hive-exec.jar artifact.
> >
> > You know about 10 years ago. I mentioned that oozie should just use
> > hive-service or hive jdbc. After a big fight where folks kept bringing up
> > concurrency bugs in hive-server-1 my prs were rejected (even though hive
> > server2 would not have these bugs). I still cannot fathom why someone using
> > oozie would want a fat jar of hive (as opposed to hive server or hivejdbc).
> > If I had to do that, i would just use shell action..... You all must like
> > enjoy shading jars.
> >
> > Edward
> >
> > On Thu, Sep 16, 2021 at 2:30 PM Chao Sun <sunc...@apache.org> wrote:
> >
> >> I'm not sure whether it is a good idea to remove `hive-exec-core`
> >> completely - it is still being used today by some other popular projects
> >> including Spark and Trino/Presto.
> >> By sticking to `hive-exec-core` it gives
> >> more flexibility to the other projects to shade & relocate those classes
> >> according to their need, without waiting for new Hive releases. Hive also
> >> needs to make sure it relocate everything properly. Otherwise, if some
> >> classes are shaded & included in `hive-exec` but not relocated, there is no
> >> way for the other projects to exclude them and avoid potential conflicts.
> >>
> >> Chao
> >>
> >> On Thu, Sep 16, 2021 at 8:03 AM Zoltan Haindrich <k...@rxd.hu> wrote:
> >>
> >>> Hey
> >>>
> >>> On 9/6/21 12:48 PM, Stamatis Zampetakis wrote:
> >>>> Indeed this may lead to binary incompatibility problems as the one you
> >>>> mentioned. If I understood correctly the problem you cite comes up if
> >>>> library B in this case is not relocated. If Hive systematically relocates
> >>>> shaded deps do you think there will still be binary incompatibility issues?
> >>>>
> >>>> If the relocating solution works, I would personally prefer going down this
> >>>> path instead of introducing an entirely new module just for the sake of
> >>>> dependency management. Most of the time when there are problems with
> >>>> shading the answer comes from relocating the problematic dependencies and
> >>>> people are more or less accustomed with this route.
> >>>
> >>> I totally agree with you Stamatis - with the addition that we should work
> >>> together with the owners of other projects to help them use the correct
> >>> artifact to gain access to
> >>> Hive's internal parts.
> >>> I've opened HIVE-25531 to remove the core classified artifact - and ensure
> >>> that we will be uncovering and fixing future issues with the hive-exec
> >>> artifact.
> >>>
> >>> cheers,
> >>> Zoltan
> >>>
> >>>>
> >>>> Best,
> >>>> Stamatis
> >>>>
> >>>> On Mon, Aug 30, 2021 at 9:49 PM Daniel Fritsi <fdan...@cloudera.com.invalid>
> >>>> wrote:
> >>>>
> >>>>> Dear Hive developers,
> >>>>>
> >>>>> I am Dan from the Oozie team and I would like to bring up the
> >>>>> hive-exec.jar vs. hive-exec-core.jar topic.
> >>>>> The reason for that is because as far as we understand the official
> >>>>> recommendation from the Hive team is to use the hive-exec.jar artifact.
> >>>>>
> >>>>> However in Oozie that can end-up in a binary incompatibility.
> >>>>>
> >>>>> The reason for that is:
> >>>>>
> >>>>>   * Let's say library A is included in the fat Jar.
> >>>>>
> >>>>>   * And library B which is using library A is also included in the fat Jar.
> >>>>>
> >>>>>   * Let's also say that library A's com.library.alib package is
> >>>>>     relocated to org.apache.hive.com.library.alib,
> >>>>>     meaning the com.library.alib.SomeClass becomes
> >>>>>     org.apache.hive.com.library.alib.SomeClass
> >>>>>
> >>>>>   * So if B has a method like public void
> >>>>>     someMethod(com.library.alib.SomeClass) then the signature of this
> >>>>>     method will be changed to:
> >>>>>     public void someMethod(org.apache.hive.com.library.alib.SomeClass)
> >>>>>
> >>>>>   * If Oozie is also using B directly meaning we'll have b.jar on our
> >>>>>     classpath, but with the unchanged signature,
> >>>>>     so when hive-exec tries to invoke someMethod then depending on
> >>>>>     whether b.jar coming from us will be loaded first or hive-exec will,
> >>>>>     we can end-up with a NoSuchMethodError if hive-exec tries to pass an
> >>>>>     org.apache.hive.com.library.alib.SomeClass instance to the
> >>>>>     someMethod which was loaded from the original b.jar.
> >>>>>
> >>>>> Hence in Oozie a long time ago (OOZIE-2621
> >>>>> <https://issues.apache.org/jira/browse/OOZIE-2621>) the decision was
> >>>>> made to use the hive-exec-core Jar.
> >>>>>
> >>>>> Now since the shading process actually removes those dependencies from
> >>>>> the hive-exec pom which are included in the fat Jar, we manually had to
> >>>>> add some dependencies to Oozie to compensate this.
> >>>>> However these dependencies are not used by Oozie directly and with the
> >>>>> growing features of hive-exec we had to repeat the same process
> >>>>> over-and-over which is a bit unmaintainable.
> >>>>>
> >>>>> Today I'm writing to you to propose a long-term solution where basically
> >>>>> nothing would change in the generated hive artifacts, poms and the same
> >>>>> time we wouldn't have to manually declare dependencies in Oozie which
> >>>>> are not explicitly used by us.
> >>>>>
> >>>>> The solution:
> >>>>>
> >>>>>  1. We would create a new module named hive-exec-dependencies which
> >>>>>     would be a pom-packaging module without any Java source files.
> >>>>>  2. All the dependencies declared in hive-exec would be moved to
> >>>>>     hive-exec-dependencies.
> >>>>>  3. We would make the hive-exec-dependencies module the parent of
> >>>>>     hive-exec and with this hive-exec would still have access to the
> >>>>>     same dependencies as before.
> >>>>>  4. The maven shade plugin would still strip the dependencies from the
> >>>>>     generated hive-exec pom which are included in the fat Jar.
> >>>>>  5. And with a small maven plugin we'd change hive-exec's parent back
> >>>>>     from hive-exec-dependencies to the root hive project in the
> >>>>>     generated hive-exec pom file.
> >>>>>
> >>>>> I have a change ready locally and it works as described above.
> >>>>>
> >>>>> With this on the Oozie side we could add a dependency on
> >>>>> hive-exec-dependencies and hence all the required libraries which are
> >>>>> included in the fat Jar would be pulled into Oozie.
> >>>>> The next time a new dependency would be added to hive-exec-dependencies,
> >>>>> the Oozie build would pull it in automatically without us having to
> >>>>> explicitly declare it.
> >>>>>
> >>>>> Please let me know what you think.
> >>>>>
> >>>>> Best,
> >>>>> Dan
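
For anyone following the thread, the signature mismatch Dan describes in his
original message can be sketched in a few lines of Java. This is only an
illustration under stated assumptions, not code from Hive or Oozie: the class
names below are hypothetical stand-ins (OriginalSomeClass for
com.library.alib.SomeClass, RelocatedSomeClass for
org.apache.hive.com.library.alib.SomeClass, LibraryB for the unshaded b.jar),
and a reflective lookup is used as an analogue of the JVM's link-time method
resolution. In a real deployment the failure would surface as a
NoSuchMethodError rather than the NoSuchMethodException caught here.

```java
public class RelocationMismatchDemo {

    // Hypothetical stand-in for com.library.alib.SomeClass (the unshaded type).
    public static class OriginalSomeClass {}

    // Hypothetical stand-in for org.apache.hive.com.library.alib.SomeClass
    // (the same class after relocation; to the JVM it is a different type).
    public static class RelocatedSomeClass {}

    // Hypothetical stand-in for library B as shipped in the original b.jar:
    // its method still takes the unrelocated parameter type.
    public static class LibraryB {
        public void someMethod(OriginalSomeClass arg) {}
    }

    // Returns true if 'owner' has a public method with exactly this name and
    // parameter type. Method resolution matches on the full signature, so a
    // relocated parameter type does not match the original one.
    public static boolean canResolve(Class<?> owner, String name, Class<?> paramType) {
        try {
            owner.getMethod(name, paramType);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A caller compiled against the original b.jar finds the method.
        System.out.println("original signature resolves: "
                + canResolve(LibraryB.class, "someMethod", OriginalSomeClass.class));

        // A caller rewritten by shading expects the relocated parameter type
        // and does not find it in the original b.jar.
        System.out.println("relocated signature resolves: "
                + canResolve(LibraryB.class, "someMethod", RelocatedSomeClass.class));
    }
}
```

Which of the two lookups a real application hits depends on which copy of
library B the classloader picks up first, which is the ordering problem
described in the original message.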