If a DoFn<> could return List<ABC>, we could use the code suggested in 
section 3.1.2 and in your reply below. But the result of a DoFn<> is always a 
PCollection<>, so I cannot get hold of the list of ABC objects that should 
then be fed as input for parallel computation. Is there any way to convert 
List<ABC> to PCollection<ABC> inside the DoFn<> itself? Or can a DoFn<> return 
List<ABC> objects?


Cheers,
S. Sahayaraj
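
For the question above, a minimal sketch of one common approach: a DoFn does 
not return a list, but it may call c.output(...) once per list item, and the 
ParDo that wraps it then yields a PCollection of those items. The class name 
ExpandFn, the helper buildList, and the use of String in place of ABC are all 
hypothetical placeholders; a custom ABC type would additionally need a coder 
that Beam can use (for example by implementing Serializable).

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.beam.sdk.transforms.DoFn;

// Hypothetical DoFn for illustration: String stands in for ABC, and the
// comma-splitting logic stands in for whatever builds the list at runtime.
class ExpandFn extends DoFn<String, String> {

  // Placeholder for the code that populates the in-memory list.
  static List<String> buildList(String input) {
    List<String> items = new ArrayList<>();
    for (String part : input.split(",")) {
      items.add(part.trim());
    }
    return items;
  }

  @ProcessElement
  public void processElement(ProcessContext c) {
    // Emit every list item individually; each one becomes its own element
    // of the output PCollection<String>.
    for (String item : buildList(c.element())) {
      c.output(item);
    }
  }
}
```

Applied as input.apply(ParDo.of(new ExpandFn())), where input is an existing 
PCollection<String>, this gives a PCollection<String> that a later ParDo can 
process element by element, without needing the Pipeline object inside the 
DoFn.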

From: Robert Bradshaw [mailto:rober...@google.com]
Sent: Wednesday, June 6, 2018 9:40 PM
To: user@beam.apache.org
Subject: [EXTERNAL] - Re: Create PCollection<ABC> from List<ABC> in DoFn<>

You can use the Create transform to do this, e.g.

  Pipeline p = ...
  List<ABC> inMemoryObjects = ...
  PCollection<ABC> pcollectionOfObject = p.apply(Create.of(inMemoryObjects));
  result = pcollectionOfObject.apply(ParDo.of(SomeDoFn...));

See section 3.1.2 at 
https://beam.apache.org/documentation/programming-guide/#pcollections

On Wed, Jun 6, 2018 at 8:34 AM S. Sahayaraj <ssahaya...@quark.com> wrote:
Hello,
I have created a Java class that extends DoFn<>. In 
processElement(ProcessContext c), a list of objects of type ABC (List<ABC>) is 
populated at runtime, and I would like to generate the corresponding 
PCollection<ABC> from that List<ABC> so that the subsequent transformation can 
process each ABC object in the PCollection<ABC> in parallel. How do we create 
a PCollection from an in-memory object created in a DoFn<>? Or how do we get 
the pipeline object from inside a DoFn<>? Or are there any SDK guidelines to 
refer to?


Thanks,
S. Sahayaraj
