Hello,

I mentored Arnaud when he contributed the sketching extension to Beam, and
from a quick look at Alex's paper + implementation, I think this should
be an independent extension. Sketching is a collection of transforms
that rely on probabilistic data structures to give approximate results
and correspond clearly to the data sketching category.

Alex's work is clearly a different area; it is more about data
preprocessing and feature extraction, so I think it should be in a
different module.
Agree 100% that the best option is to do a rewrite in Java; this also
has the advantage of easier maintainability. It would be really nice
to have a new extension for this in Beam, so don't hesitate to ask on
the mailing list / Slack if you have questions.

Regards,
Ismaël

On Mon, Oct 29, 2018 at 10:38 AM Maximilian Michels <m...@apache.org> wrote:
>
> Hey Alex,
>
> No need to reimplement. Java is the best option, since we don't
> currently have a Scala API in Beam.
>
> Cheers,
> Max
>
> On 25.10.18 21:50, Alex wrote:
> > Great! Right now there is a lot in that code I do not understand; I hope
> > to get up to speed on it in the next few days.
> >
> > Should I reimplement my algorithms in Scala? Or could I create a wrapper
> > that interfaces with the sketching extension?
> >
> > Cheers.
> >
> > On Oct 24, 2018 15:00, Maximilian Michels <m...@apache.org> wrote:
> >>
> >> Welcome Alejandro! Interesting work. The sketching extension looks like
> >> a good place for your algorithms.
> >>
> >> -Max
> >>
> >> On 23.10.18 19:05, Lukasz Cwik wrote:
> >>> Arnaud Fournier (afourn...@talend.com <mailto:afourn...@talend.com>)
> >>> started by adding a library to support sketching
> >>> (https://github.com/apache/beam/tree/master/sdks/java/extensions/sketching),
> >>> I feel as though some of these could be added there or possibly within
> >>> another extension.
> >>>
> >>> On Tue, Oct 23, 2018 at 9:54 AM Austin Bennett
> >>> <whatwouldausti...@gmail.com <mailto:whatwouldausti...@gmail.com>> wrote:
> >>>
> >>>       Hi Beam Devs,
> >>>
> >>>       Alejandro, copied, is an enthusiastic developer, who recently coded up:
> >>>       https://github.com/elbaulp/DPASF (associated paper found:
> >>>       https://arxiv.org/abs/1810.06021).
> >>>
> >>>       He had been looking to contribute that code to FlinkML, at which
> >>>       point I found him and alerted him to Beam.  He has been learning a
> >>>       bit about Beam recently.  Would this data preprocessing be a welcome
> >>>       contribution to the project?  If yes, perhaps others better versed
> >>>       in internals (I'm not there yet -- though could follow along!) would
> >>>       be willing to provide feedback to shape this to be a suitable Beam
> >>>       contribution.
> >>>
> >>>       Cheers,
> >>>       Austin
> >>>
> >>>
