On Wed, Jul 21, 2021 at 10:37 AM Andrew Kettmann <akettm...@evolve24.com>
wrote:

> Worth noting that you never "lose" a PCollection. You can use the same
> PCollection in as many transforms as you like and every time you reference
> that PCollection<A> it will be in the same state it was when you first read
> it in.
>
> So if you have:
>
> PCollection<A> colA = ...;
> PCollection<RedisData> = colA.apply(ParDo.of(new ReadRedisDataDoFn());
>
> You have not consumed the colA PCollection and can reference/use it as
> many times as you want in further steps.
>
> My instinct for this is:
>
>
>    1. Read Source to get PCollection<A>
>    2. Pull the key to look up in Redis from Pcollection<A> into another
>    PCollection
>    3. Look up with a custom DoFn if the normal IO one doesn't meet your
>    needs
>    4. CoGroupByKey transform to group them together
>
>
I have done that, however, this doesn't really work for my use case in a
streaming pipeline.  Both of the PCollections need to have the same
windowing and under high load if I don't want to buffer a ton of data I
might get outputs with one side being empty.


>
>    1. Do Whatever else you need to do with the combined data.
>
>
> ------------------------------
> *From:* Vincent Marquez <vincent.marq...@gmail.com>
> *Sent:* Wednesday, July 21, 2021 12:14 PM
> *To:* user <user@beam.apache.org>
> *Subject:* Mapping *part* of a PCollection possible? (Lens optics for
> PCollection?)
>
> Let's say I have PCollection<A> and I want to use the 'readAll' pattern to
> enhance some data from an additional source such as redis (which has a
> readKeys PTransform<String, RedisData>).  However I don't want to 'lose'
> the original A.  There *are* a few ways to do this currently (side inputs,
> joining two streams with CoGroupByKey, using State) all of which have some
> problems.
>
> If I could map PCollection<A> into some type that has <A, String> for
> instance PCollection<KV<A, String>>, then use the redis readKeys to map to
> PCollection<KV<A, RedisData>> this solves all my problems. This is more or
> less a get/set lens optic if anyone is familiar with functional
> programming.
>
> Is something like this possible?  Could it be added?  I've run
> into wanting this pattern numerous times over the last year.
>
>
> *~Vincent*
>
> evolve24 Confidential & Proprietary Statement: This email and any
> attachments are confidential and may contain information that is
> privileged, confidential or exempt from disclosure under applicable law. It
> is intended for the use of the recipients. If you are not the intended
> recipient, or believe that you have received this communication in error,
> please do not read, print, copy, retransmit, disseminate, or otherwise use
> the information. Please delete this email and attachments, without reading,
> printing, copying, forwarding or saving them, and notify the Sender
> immediately by reply email. No confidentiality or privilege is waived or
> lost by any transmission in error.
>

Reply via email to