I have opened a PR for BEAM-9008.  I wasn't sure whether I should trigger
any of the CI 'checks' on the PR, so please let me know if I need to, and
whether there are any other changes or issues to address.  Thanks.

On Fri, Dec 20, 2019 at 7:20 AM Ismaël Mejía <ieme...@gmail.com> wrote:

> For reference, this is the JIRA ticket:
> https://issues.apache.org/jira/browse/BEAM-9008
> The improvement makes total sense, and the change in the internal
> implementation from BoundedSource to ParDo has no backwards-compatibility
> consequences for end users, so it looks good. This connector does not
> support Dynamic Work Rebalancing, so there won't be any difference at
> runtime, and this refactor could be the base for an SDF-based implementation.
>
> I added you as a contributor in JIRA and assigned the ticket to you,
> Vincent. Great to see this one happening. Welcome to the project!
>
> Regards,
> Ismaël
>
> On Fri, Dec 20, 2019 at 5:48 AM Vincent Marquez <vincent.marq...@gmail.com>
> wrote:
>
>>
>>
>> On Thu, Dec 12, 2019 at 8:43 PM Kenneth Knowles <k...@apache.org> wrote:
>>
>>> On Thu, Dec 12, 2019 at 3:30 PM Vincent Marquez <
>>> vincent.marq...@gmail.com> wrote:
>>>
>>>> Hello, as I've mentioned in previous emails, I've found the CassandraIO
>>>> connector lacking some essential features for efficient batch processing in
>>>> real world scenarios.  We've developed a more fully featured connector and
>>>> had good results with it.
>>>>
>>>
>>> Fantastic!
>>>
>>>
>>>> Could I perhaps write up a JIRA proposal for some minor changes to the
>>>> current connector that might improve things?
>>>>
>>>
>>> Yes!
>>>
>>>
>>>> The main pain point is the absence of a 'readAll' method, as I
>>>> documented here:
>>>>
>>>> https://gist.github.com/vmarquez/204b8f44b1279fdbae97b40f8681bc25
>>>>
>>>> If I could write up a ticket, I don't mind submitting a small PR on GH
>>>> as well addressing this lack of functionality.  Thanks for your time.
>>>>
>>>
>>> This would be excellent. Since it seems you already have implemented and
>>> tested the functionality, a simple Jira with a title and description would
>>> be enough, and then open a PR linked to the Jira with a title like
>>> "[BEAM-1234567] Improve performance of CassandraIO"
>>>
>>
>> I should clarify a bit.  What has already been done and tested is a
>> custom connector that provides 'readAll' functionality for Cassandra; I did
>> not modify the existing Beam connector.  However, I spent some time over the
>> last couple of days looking over the details of the current CassandraIO
>> connector to verify that it would be doable for me to add something similar
>> while still maintaining all the current functionality.
>>
>> To share some code between both the 'read' and 'readAll' styles of
>> CassandraIO, I'd want to change the current 'Source'-based connector into
>> a 'ParDo'-based one, so there is a minor (in my opinion, relative to the
>> project) refactor involved.  I'm happy to explain in more detail in the
>> JIRA.
>>
>>> Thank you for writing to dev@ to share your experience and intentions.
>>> We are happy to help you with the Jira and PR, and find the best reviewers,
>>> if you will open them to get started.
>>>
>>> Kenn
>>>
>>
>> Thank you!
>>
>>
>>
>>>
>>>> *-Vincent*
>>>>
>>>
>>
>>

-- 
*-Vincent*
