You will need to replace the return value of the callback with an Iterator.

On Wed, Mar 29, 2017, 12:19 Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
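The change described above — Spark 2.x's Java `FlatMapFunction.call()` returns an `Iterator` where Spark 1.6's returned an `Iterable` — can be sketched without a Spark dependency. The two interfaces below are simplified local stand-ins for `org.apache.spark.api.java.function.FlatMapFunction` in each version (the real interfaces also declare `throws Exception`), not the actual Spark classes:

```java
import java.util.Arrays;
import java.util.Iterator;

public class FlatMapSignatureChange {

    // Simplified stand-in for Spark 1.6's FlatMapFunction: call() returns Iterable.
    interface FlatMapFunctionSpark1<T, R> {
        Iterable<R> call(T t);
    }

    // Simplified stand-in for Spark 2.x's FlatMapFunction: call() returns Iterator.
    interface FlatMapFunctionSpark2<T, R> {
        Iterator<R> call(T t);
    }

    // Spark 1.6 style: return the Iterable directly.
    static final FlatMapFunctionSpark1<String, String> SPARK1_STYLE =
        line -> Arrays.asList(line.split(" "));

    // Spark 2.x style: the same logic must now hand back an Iterator.
    static final FlatMapFunctionSpark2<String, String> SPARK2_STYLE =
        line -> Arrays.asList(line.split(" ")).iterator();

    public static void main(String[] args) {
        System.out.println(SPARK1_STYLE.call("a b").iterator().next()); // prints "a"
        System.out.println(SPARK2_STYLE.call("a b").next());            // prints "a"
    }
}
```

The body of a user function is often unchanged; only the trailing `.iterator()` (or equivalent) is new, which is why a single source tree targeting both versions is awkward — the overriding method's return type must match the compiled-against Spark version.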
> I tested a workaround with reflection and it seems to work (at least it
> compiles ;)).
>
> I will share the PR asap.
>
> Regards
> JB
>
> On 03/29/2017 10:47 AM, Amit Sela wrote:
> > Just tried to replace the dependencies and see what happens:
> >
> > Most of the required changes are about the runner using deprecated Spark
> > APIs, and after fixing them the only real issue is with the Java API for
> > Pair/FlatMapFunction, which changed its return value to Iterator (in 1.6
> > it is Iterable).
> >
> > So I'm not sure that a profile that simply sets the dependency on
> > 1.6.3/2.1.0 is feasible.
> >
> > On Thu, Mar 23, 2017 at 10:22 AM Kobi Salant <kobi.sal...@gmail.com>
> > wrote:
> >
> >> So, if everything is in place in Spark 2.X and we use provided
> >> dependencies for Spark in Beam, then theoretically you can run the same
> >> code on 2.X without any need for a branch?
> >>
> >> 2017-03-23 9:47 GMT+02:00 Amit Sela <amitsel...@gmail.com>:
> >>
> >>> If StreamingContext is valid and we don't have to use SparkSession, and
> >>> Accumulators are valid as well and we don't need AccumulatorV2, I don't
> >>> see a reason this shouldn't work (which means there are still tons of
> >>> reasons this could break, but I can't think of them off the top of my
> >>> head right now).
> >>>
> >>> @JB simply add a profile for the Spark dependencies and run the tests -
> >>> you'll have a very definitive answer ;-).
> >>> If this passes, try on a cluster running Spark 2 as well.
> >>>
> >>> Let me know if I can assist.
> >>>
> >>> On Thu, Mar 23, 2017 at 6:55 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> >>> wrote:
> >>>
> >>>> Hi guys,
> >>>>
> >>>> Ismaël summarized well what I have in mind.
> >>>>
> >>>> I'm a bit late on the PoC around that (I started a branch already).
> >>>> I will move forward over the weekend.
> >>>>
> >>>> Regards
> >>>> JB
> >>>>
> >>>> On 03/22/2017 11:42 PM, Ismaël Mejía wrote:
> >>>>> Amit, I suppose JB is talking about the RDD-based version, so no need
> >>>>> to worry about SparkSession or different incompatible APIs.
> >>>>>
> >>>>> Remember, the idea we are discussing is to have in master both the
> >>>>> Spark 1 and Spark 2 runners using the RDD-based translation. At the
> >>>>> same time we can have a feature branch to evolve the Dataset-based
> >>>>> translator (this one will replace the RDD-based translator for
> >>>>> Spark 2 once it is mature).
> >>>>>
> >>>>> The advantages have already been discussed, as well as the possible
> >>>>> issues, so I think we have to see now if JB's idea is feasible and
> >>>>> how hard it would be to live with this while the Dataset version
> >>>>> evolves.
> >>>>>
> >>>>> I think what we are trying to avoid is to have a long-living branch
> >>>>> for a Spark 2 runner based on RDDs, because the maintenance burden
> >>>>> would be even worse. We would have to fight not only with the double
> >>>>> merge of fixes (in case the profile idea does not work), but also
> >>>>> with the continued evolution of Beam, and we would end up in the
> >>>>> long-living branch mess that other runners have dealt with (e.g. the
> >>>>> Apex runner):
> >>>>>
> >>>>> https://lists.apache.org/thread.html/12cc086f5ffe331cc70b89322ce5416c3112b87efc3393e3e16032a2@%3Cdev.beam.apache.org%3E
> >>>>>
> >>>>> What do you think about this, Amit? Would you be OK to go with it if
> >>>>> JB's profile idea proves to help with the maintenance issues?
> >>>>>
> >>>>> Ismaël
> >>>>>
> >>>>> On Wed, Mar 22, 2017 at 5:53 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >>>>>> The hbase-spark module doesn't use SparkSession.
> >>>>>> So the situation there is simpler :-)
> >>>>>>
> >>>>>> On Wed, Mar 22, 2017 at 5:35 AM, Amit Sela <amitsel...@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> I'm still wondering how we'll do this - it's not just different
> >>>>>>> implementations of the same class, but completely different
> >>>>>>> concepts, such as using SparkSession in Spark 2 instead of
> >>>>>>> SparkContext/StreamingContext in Spark 1.
> >>>>>>>
> >>>>>>> On Tue, Mar 21, 2017 at 7:25 PM Ted Yu <yuzhih...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> I have done some work over in HBASE-16179, where compatibility
> >>>>>>>> modules are created to isolate changes in the Spark 2.x API so
> >>>>>>>> that code in the hbase-spark module can be reused.
> >>>>>>>>
> >>>>>>>> FYI
> >>>>>>>
> >>>>
> >>>> --
> >>>> Jean-Baptiste Onofré
> >>>> jbono...@apache.org
> >>>> http://blog.nanthrax.net
> >>>> Talend - http://www.talend.com
> >>>>
> >>>
> >>
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
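The thread does not show the code behind JB's reflection-based workaround, so the following is only a plausible sketch of the general idea, with hypothetical names of my own choosing: a small adapter that inspects the runtime type of whatever the user callback returned and normalizes it to the `Iterator` that Spark 2.x expects. This is runtime type checking rather than full `java.lang.reflect` use, and it is not the actual code from JB's PR:

```java
import java.util.Arrays;
import java.util.Iterator;

public class ResultAdapter {

    // Accepts whatever a user callback returned - an Iterable under the
    // Spark 1.6 signature or an Iterator under the Spark 2.x one - and
    // normalizes it to the Iterator form.
    @SuppressWarnings("unchecked")
    static <T> Iterator<T> toIterator(Object callbackResult) {
        if (callbackResult instanceof Iterable) {
            // Spark 1.6-style result: unwrap the Iterable.
            return ((Iterable<T>) callbackResult).iterator();
        }
        if (callbackResult instanceof Iterator) {
            // Spark 2.x-style result: pass through unchanged.
            return (Iterator<T>) callbackResult;
        }
        throw new IllegalArgumentException(
            "Expected Iterable or Iterator, got: " + callbackResult.getClass());
    }

    public static void main(String[] args) {
        Iterator<String> fromIterable = toIterator(Arrays.asList("x", "y"));
        Iterator<String> fromIterator = toIterator(Arrays.asList("x", "y").iterator());
        System.out.println(fromIterable.next() + fromIterator.next()); // prints "xx"
    }
}
```

An adapter like this only helps on the runner's side, where results are consumed; the harder problem discussed in the thread remains that a user-visible `FlatMapFunction` implementation cannot compile against both return types at once, which is what motivates the per-version Maven profile idea.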