>
> You can try scoping the string builder instance to processElement, instead
> of making it a member of your DoFn.
>
I also tried creating the StringBuilder inside the beamRow2CsvLine function,
but it has a similar issue. I create the StringBuilder in Setup so that the
same object is reused per bundle, which reduces object re-creation. AFAIK
setup is only called by a single thread. Based on my tests, reusing the
StringBuilder improves performance by 25%.
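
To make the reuse question concrete, here is a minimal JDK-only sketch (no
Beam involved; the class and method names are made up for illustration). It
grows a builder, resets it both ways discussed in this thread, and records
capacity() each time. As far as I can tell from the JDK, neither
setLength(0) nor delete(0, length()) shrinks the backing array, so the
capacity should be the same after either reset.

```java
// Hypothetical demo class, not taken from the job's code.
class BuilderReuseDemo {

    /** Returns {capacity after growth, after setLength(0), after delete(0, length())}. */
    static int[] capacitiesAcrossResets() {
        StringBuilder sb = new StringBuilder(16);

        // Grow the builder well past its initial capacity.
        for (int i = 0; i < 1000; i++) {
            sb.append("field").append(i).append(',');
        }
        int grown = sb.capacity();

        // Reset option 1: truncate the length to zero.
        sb.setLength(0);
        int afterSetLength = sb.capacity();

        // Reset option 2: delete the whole content.
        sb.append("another,row");
        sb.delete(0, sb.length());
        int afterDelete = sb.capacity();

        return new int[] {grown, afterSetLength, afterDelete};
    }

    public static void main(String[] args) {
        int[] c = capacitiesAcrossResets();
        System.out.println("grown=" + c[0]
            + " afterSetLength=" + c[1]
            + " afterDelete=" + c[2]);
    }
}
```

This only checks capacity; it says nothing about how many short-lived
Strings toString() creates per element, which may matter more for GC.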


My logic is actually simple: I need to convert a Beam Row to a CSV row string.
I can try Brian's suggestion.
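
For reference, the conversion is roughly the following. This is a simplified
sketch, not the actual code from the paste: a plain List stands in for the
values of a Beam Row, toCsvLine is a hypothetical name, and the quoting
rules (quote fields containing a comma, quote or line break; null becomes
an empty field) are my assumption of a reasonable CSV format.

```java
import java.util.List;

// Hypothetical helper; a plain List stands in for the Row's values.
class CsvLineSketch {

    /** Builds one CSV line from the values, reusing the caller's builder. */
    static String toCsvLine(List<?> values, StringBuilder sb) {
        sb.setLength(0); // reset the reused builder before each row
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            Object v = values.get(i);
            if (v == null) {
                continue; // assumption: null is written as an empty field
            }
            String s = v.toString();
            boolean needsQuotes = s.indexOf(',') >= 0
                || s.indexOf('"') >= 0
                || s.indexOf('\n') >= 0
                || s.indexOf('\r') >= 0;
            if (needsQuotes) {
                // Double any embedded quotes and wrap the field in quotes.
                sb.append('"').append(s.replace("\"", "\"\"")).append('"');
            } else {
                sb.append(s);
            }
        }
        return sb.toString();
    }
}
```

The builder is passed in so the caller decides its lifetime: one per DoFn
instance, or a fresh one per element.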



On Wed, Sep 2, 2020 at 3:11 PM Brian Hulette <bhule...@google.com> wrote:

> That error isn't exactly an OOM; it indicates the JVM is spending a
> significant amount of time in garbage collection.
>
> It looks like `writer.setLength(0)` may actually allocate a new buffer,
> and then the buffer may also need to be resized as the String grows, so you
> could be creating a lot of orphaned buffers very quickly. I'm not that
> familiar with StringBuilder, is there a way to reset it and re-use the
> existing capacity? Maybe `writer.delete(0, writer.length())` [1]?
>
> [1]
> https://stackoverflow.com/questions/242438/is-it-better-to-reuse-a-stringbuilder-in-a-loop
>
> On Wed, Sep 2, 2020 at 3:02 PM Talat Uyarer <tuya...@paloaltonetworks.com>
> wrote:
>
>> Sorry for the wrong import. As you can see in the code, I am using
>> StringBuilder.
>>
>> On Wed, Sep 2, 2020 at 2:55 PM Ning Kang <ni...@google.com> wrote:
>>
>>> Here is a question answered on StackOverflow:
>>> https://stackoverflow.com/questions/27221292/when-should-i-use-javas-stringwriter
>>>
>>> Could you try using StringBuilder instead since the usage is not
>>> appropriate for a StringWriter?
>>>
>>>
>>> On Wed, Sep 2, 2020 at 2:49 PM Talat Uyarer <
>>> tuya...@paloaltonetworks.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have an issue with string concatenation. You can see my code
>>>> below [1]. One step of my Dataflow job concatenates strings, and
>>>> when I use that step my job starts getting JVM restart errors:
>>>>
>>>>> Shutting down JVM after 8 consecutive periods of measured GC
>>>>> thrashing. Memory is used/total/max = 4112/5994/5994 MB, GC last/max =
>>>>> 97.36/97.36 %, #pushbacks=3, gc thrashing=true. Heap dump not written.
>>>>
>>>>
>>>> I also tried using Avro rather than String. With Avro, it
>>>> works fine without any issue. Do you have any suggestions?
>>>>
>>>> Thanks
>>>>
>>>> [1] https://dpaste.com/7RTV86WQC
>>>>
>>>>
>>>>
