Hi Luke,

Aren't the coders supposed to be serializable? The Javadoc on the Coder
interface includes the following comment, which seems to mean that they
should be, and most of the basic coders seem to be serializable.

" {@link Coder} instances are serialized during job creation and
deserialized before use. This will generally be performed by serializing
the object via Java Serialization."

However, "TimestampedValueCoder" does not have a default constructor, so it
cannot be deserialized. Adding a private no-arg constructor would solve
the problem, but this may not be the only class that has this issue. I can
work on a pull request to add private no-arg constructors to the coders
that are missing them, if this was not done intentionally. WDYT?
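
For reference, one way to find out which coders actually fail is a round
trip through plain Java Serialization. The sketch below is only an
illustration: "WrappingCoder" is a hypothetical stand-in and not a Beam
class, and "roundTrip" is a helper name made up for this example. Plain Java
Serialization does not itself need a no-arg constructor on a Serializable
class, so if a real coder fails, the cause may be the serializer in use
(Kryo's default instantiation, for example, does expect one) rather than the
coder.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SerializationRoundTrip {

  /** Hypothetical stand-in for a coder: Serializable, but with no no-arg constructor. */
  static final class WrappingCoder implements Serializable {
    private static final long serialVersionUID = 1L;
    final String innerCoderName;

    WrappingCoder(String innerCoderName) {
      this.innerCoderName = innerCoderName;
    }
  }

  /** Serializes obj with Java Serialization into memory and reads it back. */
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T obj) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(obj);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      // Wrap checked exceptions so callers can use this as a simple check.
      throw new IllegalStateException("Round trip failed for " + obj.getClass(), e);
    }
  }

  public static void main(String[] args) {
    WrappingCoder copy = roundTrip(new WrappingCoder("VarLongCoder"));
    System.out.println(copy.innerCoderName);
  }
}
```

Running the same kind of round trip against the real coders, with the
serializer the runner actually uses, should show which classes need a change.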

Best Regards,
Pulasthi

On Fri, Nov 15, 2019 at 12:05 AM Pulasthi Supun Wickramasinghe <
pulasthi...@gmail.com> wrote:

> Hi Luke,
>
> That is the approach I am currently taking to handle the functions. I
> might have to do the same for Coders as well, since some coders have the
> same issue of not having default constructors.
>
> I also initially considered converting the pipeline into a JSON format and
> sending that over to the workers. I will take a look at the option you
> mentioned, since we do plan to implement a portable pipeline runner for
> Twister2 as well. Thanks for the information.
>
> Best Regards,
> Pulasthi
>
>
> On Thu, Nov 14, 2019 at 2:30 PM Luke Cwik <lc...@google.com> wrote:
>
>> You should create placeholders inside of your Twister2/OpenMPI
>> implementation that represent these functions and then instantiate actual
>> instances of them on the workers if you want to write your own pipeline
>> representation and format for OpenMPI/Twister2.
>>
>> Or consider converting the pipeline to its proto representation and
>> building a portable pipeline runner. This way you could run Go and Python
>> pipelines as well. The best example of this is the current Flink
>> integration [1].
>>
>> 1:
>> https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkBatchPortablePipelineTranslator.java
>>
>> On Wed, Nov 13, 2019 at 7:44 PM Pulasthi Supun Wickramasinghe <
>> pulasthi...@gmail.com> wrote:
>>
>>> Hi devs,
>>>
>>> Currently, the Pipeline class in Beam is not Serializable. This is not a
>>> problem for the current runners, since the pipeline is translated and
>>> submitted through a centralized, driver-like model. However, for a runner
>>> with a decentralized model similar to OpenMPI (MPI), which is also the
>>> case with Twister2, for which I am currently developing a runner, it would
>>> have been better if the pipeline itself were Serializable.
>>>
>>> Currently, I am trying to transform the Pipeline into a Twister2 graph
>>> and then send it over to the workers. However, since some functions,
>>> such as "SystemReduceFn", are not serializable, this is also somewhat
>>> troublesome.
>>>
>>> Was the decision to make Pipelines not Serializable made for some
>>> specific reason, or because none of the current use cases presented a
>>> valid requirement to make them Serializable?
>>>
>>> Best Regards,
>>> Pulasthi
>>> --
>>> Pulasthi S. Wickramasinghe
>>> PhD Candidate  | Research Assistant
>>> School of Informatics and Computing | Digital Science Center
>>> Indiana University, Bloomington
>>> cell: 224-386-9035
>>>
>>
>
> --
> Pulasthi S. Wickramasinghe
> PhD Candidate  | Research Assistant
> School of Informatics and Computing | Digital Science Center
> Indiana University, Bloomington
> cell: 224-386-9035
>


-- 
Pulasthi S. Wickramasinghe
PhD Candidate  | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
cell: 224-386-9035
