Given that the 2.6.0 cut is supposed to be today (or next days), what
is the status on this, has it been identified / reverted ? or is there
any other plan ?
On Tue, Jul 10, 2018 at 2:50 PM Reuven Lax <re...@google.com> wrote:
>
> If we added something slow to the core library in order to better test
> DirectRunner, that does sound like an unfortunate bug.
>
> On Mon, Jul 9, 2018 at 11:21 PM Vojtech Janota <vojta.jan...@gmail.com> wrote:
>>
>> Hi guys,
>>
>> Thank you for all of your feedback. I have created relevant issue in JIRA:
>> https://issues.apache.org/jira/browse/BEAM-4750
>>
>> @Lukasz: me mentioning the DirectRunner was somewhat unfortunate - the
>> bottleneck was introduced into the core library and so Flink and Spark
>> runners would be impacted too
>>
>> Thanks,
>> Vojta
>>
>> On Mon, Jul 9, 2018 at 5:48 PM, Lukasz Cwik <lc...@google.com> wrote:
>>>
>>> Instead of reverting/working around specific checks/tests that the
>>> DirectRunner is doing, have you considered using one of the other runners
>>> like Flink or Spark with a local execution cluster. You won't hit the
>>> validation/verification bottlenecks that DirectRunner specifically imposes.
>>>
>>> On Mon, Jul 9, 2018 at 8:46 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>>> wrote:
>>>>
>>>> Thanks for the update Eugene.
>>>>
>>>> @Vojta: do you mind to create a Jira ? I will tackle a fix for that.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On 09/07/2018 17:33, Eugene Kirpichov wrote:
>>>> > Hi -
>>>> >
>>>> > If I remember correctly, the reason for this change was to ensure that
>>>> > the state is encodable at all. Prior to the change, there had been
>>>> > situations where the coder specified on a state cell is buggy, absent or
>>>> > set incorrectly (due to some issue in coder inference), but direct
>>>> > runner did not detect this because it never tried to encode the state
>>>> > cells - this would have blown up in any distributed runner.
>>>> >
>>>> > I think it should be possible to relax this and clone only values being
>>>> > added to the state, rather than cloning the whole state on copy(). I
>>>> > don't have time to work on this change myself, but I can review a PR if
>>>> > someone else does.
>>>> >
>>>> > On Mon, Jul 9, 2018 at 8:28 AM Jean-Baptiste Onofré <j...@nanthrax.net
>>>> > <mailto:j...@nanthrax.net>> wrote:
>>>> >
>>>> > Hi Vojta,
>>>> >
>>>> > I fully agree, that's why it makes sense to wait Eugene's feedback.
>>>> >
>>>> > I remember we had some performance regression on the direct runner
>>>> > identified thanks to Nexmark, but it has been addressed by reverting
>>>> > a
>>>> > change.
>>>> >
>>>> > Good catch anyway !
>>>> >
>>>> > Regards
>>>> > JB
>>>> >
>>>> > On 09/07/2018 17:20, Vojtech Janota wrote:
>>>> > > Hi Reuven,
>>>> > >
>>>> > > I'm not really complaining about DirectRunner. In fact it seems to
>>>> > me as
>>>> > > if what previously was considered as part of the "expensive extra
>>>> > > checks" done by the DirectRunner is now done within the
>>>> > > beam-runners-core-java library. Considering that all objects
>>>> > involved
>>>> > > are immutable (in our case at least) and simple assignment is
>>>> > > sufficient, the serialization-deserialization really seems as
>>>> > unwanted
>>>> > > and hugely expensive correctness check. If there was a problem with
>>>> > > identity copy, wasn't DirectRunner supposed to reveal it?
>>>> > >
>>>> > > Regards,
>>>> > > Vojta
>>>> > >
>>>> > > On Mon, Jul 9, 2018 at 4:46 PM, Reuven Lax <re...@google.com
>>>> > <mailto:re...@google.com>
>>>> > > <mailto:re...@google.com <mailto:re...@google.com>>> wrote:
>>>> > >
>>>> > > Hi Vojita,
>>>> > >
>>>> > > One problem is that the DirectRunner is designed for testing,
>>>> > not
>>>> > > for performance. The DirectRunner currently does many
>>>> > > purposely-inefficient things, the point of which is to better
>>>> > expose
>>>> > > potential bugs in tests. For example, the DirectRunner will
>>>> > randomly
>>>> > > shuffle the order of PCollections to ensure that your code
>>>> > does not
>>>> > > rely on ordering. All of this adds cost, because the current
>>>> > runner
>>>> > > is designed for testing. There have been requests in the past
>>>> > for an
>>>> > > "optimized" local runner, however we don't currently have such
>>>> > a thing.
>>>> > >
>>>> > > In this case, using coders to clone values is more correct. In
>>>> > a
>>>> > > distributed environment using encode/decode is the only way to
>>>> > copy
>>>> > > values, and the DirectRunner is trying to ensure that your
>>>> > code is
>>>> > > correct in a distributed environment.
>>>> > >
>>>> > > Reuven
>>>> > >
>>>> > > On Mon, Jul 9, 2018 at 7:22 AM Vojtech Janota
>>>> > > <vojta.jan...@gmail.com <mailto:vojta.jan...@gmail.com>
>>>> > <mailto:vojta.jan...@gmail.com <mailto:vojta.jan...@gmail.com>>>
>>>> > wrote:
>>>> > >
>>>> > > Hi,
>>>> > >
>>>> > > We are using Apache Beam in our project for some time now.
>>>> > Since
>>>> > > our datasets are of modest size, we have so far used
>>>> > > DirectRunner as the computation easily fits onto a single
>>>> > > machine. Recently we upgraded Beam from 2.2 to 2.4 and
>>>> > found out
>>>> > > that performance of our pipelines drastically deteriorated.
>>>> > > Pipelines that took ~3 minutes with 2.2 do not finish
>>>> > within
>>>> > > hours now. We tried to isolate the change that causes the
>>>> > > slowdown and came to the commits into the
>>>> > > "InMemoryStateInternals" class:
>>>> > >
>>>> > > * https://github.com/apache/beam/commit/32a427c
>>>> > > <https://github.com/apache/beam/commit/32a427c>
>>>> > > * https://github.com/apache/beam/commit/8151d82
>>>> > > <https://github.com/apache/beam/commit/8151d82>
>>>> > >
>>>> > > In a nutshell where previously the copy() method simply
>>>> > assigned:
>>>> > >
>>>> > > that.value = this.value
>>>> > >
>>>> > > There is now coder encode/decode combo hidden behind:
>>>> > >
>>>> > > that.value = uncheckedClone(coder, this.value)
>>>> > >
>>>> > > Can somebody explain the purpose of this change? Is it
>>>> > meant as
>>>> > > an additional "enforcement" point, similar to
>>>> > DirectRunner's
>>>> > > enforceImmutability and enforceEncodability? Or is it
>>>> > something
>>>> > > that is genuinely needed to provide correct behaviour of
>>>> > the
>>>> > > pipeline?
>>>> > >
>>>> > > Any hints or thoughts are appreciated.
>>>> > >
>>>> > > Regards,
>>>> > > Vojta
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> >
>>>> > --
>>>> > Jean-Baptiste Onofré
>>>> > jbono...@apache.org <mailto:jbono...@apache.org>
>>>> > http://blog.nanthrax.net
>>>> > Talend - http://www.talend.com
>>>> >
>>>>
>>>> --
>>>> Jean-Baptiste Onofré
>>>> jbono...@apache.org
>>>> http://blog.nanthrax.net
>>>> Talend - http://www.talend.com
>>
>>