[ https://issues.apache.org/jira/browse/BEAM-7018?focusedWorklogId=284304&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-284304 ]
ASF GitHub Bot logged work on BEAM-7018:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Jul/19 13:53
            Start Date: 29/Jul/19 13:53
    Worklog Time Spent: 10m
      Work Description: mszb commented on issue #8859: [BEAM-7018] Added Regex transform for PythonSDK
URL: https://github.com/apache/beam/pull/8859#issuecomment-516001151

   okay.

   > Someone using Regex.find_iter might expect match objects, just as in
   > Python, so I'd avoid that naming. You might need to use re.finditer to
   > implement Regex.find_all though, due to Regex.findall's variant signature.
   >
   > On Mon, Jul 29, 2019 at 3:40 PM Shoaib Zafar wrote:
   >
   > Thanks for the feedback @robertwb <https://github.com/robertwb>. This
   > approach seems good, I'll update the code!
   >
   > Only one question, though. I think we could create Regex.find_all, which
   > just maps re.findall, and put the whole approach you mentioned above in a
   > Regex.find_iter method. The reason is that re.findall returns a list of
   > the captured groups (re.findall("a(b*)", "abb ax abbb") returns
   > ['bb', '', 'bbb']), whereas in the above example we are going to return
   > a list of group(0) values: ["abb", "a", "abbb"].
   >
   > Your thoughts?
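
The difference discussed in the quoted thread can be reproduced with Python's standard re module alone. The sketch below is illustrative only: the helper names find_all_groups and find_all_full_matches are hypothetical and are not part of the Beam SDK or of PR #8859. It shows why the return type of re.findall varies with the number of capture groups in the pattern, and how re.finditer can be used to get the full match (group(0)) every time.

```python
import re

TEXT = "abb ax abbb"


def find_all_groups(pattern, text):
    """Hypothetical helper: plain re.findall.

    The shape of the result depends on the pattern:
    no groups -> full matches, one group -> that group,
    several groups -> tuples of groups.
    """
    return re.findall(pattern, text)


def find_all_full_matches(pattern, text):
    """Hypothetical helper: always return group(0) via re.finditer."""
    return [m.group(0) for m in re.finditer(pattern, text)]


# One capture group: findall returns only the captured group.
one_group = find_all_groups(r"a(b*)", TEXT)

# No capture group: findall returns the full matches.
no_group = find_all_groups(r"ab*", TEXT)

# Two capture groups: findall returns a tuple per match.
two_groups = find_all_groups(r"(a)(b*)", TEXT)

# finditer with group(0) is independent of how many groups the pattern has.
full_matches = find_all_full_matches(r"a(b*)", TEXT)

print(one_group)     # ['bb', '', 'bbb']
print(no_group)      # ['abb', 'a', 'abbb']
print(two_groups)    # [('a', 'bb'), ('a', ''), ('a', 'bbb')]
print(full_matches)  # ['abb', 'a', 'abbb']
```

Building on re.finditer keeps the output shape stable regardless of the pattern, which appears to be the point of the suggestion in the quoted reply.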

Issue Time Tracking
-------------------

    Worklog Id:     (was: 284304)
    Time Spent: 10h 10m  (was: 10h)

> Regex transform for Python SDK
> ------------------------------
>
>                 Key: BEAM-7018
>                 URL: https://issues.apache.org/jira/browse/BEAM-7018
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Rose Nguyen
>            Assignee: Shehzaad Nakhoda
>            Priority: Minor
>          Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> PTransforms that use regular expressions to process elements in a PCollection.
> It should offer the same API as its Java counterpart:
> [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Regex.java]

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)