robertwb opened a new pull request #13082: URL: https://github.com/apache/beam/pull/13082
When running on https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html:

Before:

```
250 total test cases:
  0 skipped (0.0%)
  4 won't implement (1.6%)
    3 order-sensitive (75.0%)
    1 Conversion to a non-deferred a numpy array. (25.0%)
  26 not implemented (yet) (10.4%)
    9 NameError following NotImplementedError (34.6%)
    5 'index' is not yet supported (BEAM-9547) (19.2%)
    5 GroupBy.agg currently only supports callable arguments (19.2%)
    1 [Grouper(level=1, axis=0, sort=False), 'A'] (3.8%)
    1 [Grouper(level='second', axis=0, sort=False), 'A'] (3.8%)
    1 ['second', 'A'] (3.8%)
    1 Traceback (most recent call last):\n File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/doctest.py", line 1329, in __run\n compileflags, 1), test.globs)\n File "<doctest /Users/robertwb/.apache_beam/cache/pandas-1.1.1/doc/source/user_guide/groupby.rst[127]>", line 1, in <module>\n grouped = data_df.groupby(key)\n File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/dataframe/frames.py", line 441, in groupby\n [self.set_index(by)._expr],\n File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/dataframe/frame_base.py", line 303, in wrapper\n return func(**kwargs)\n File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/dataframe/frame_base.py", line 334, in wrapper\n return func(**kwargs)\n File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/dataframe/frame_base.py", line 282, in wrapper\n result = func(self, **kwargs)\n File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/dataframe/frames.py", line 490, in set_index\n raise NotImplementedError(keys)\nNotImplementedError: ['US' ,,, 'UK']\n (3.8%)
    1 [TimeGrouper(key='Date', freq=<MonthEnd>, axis=0, sort=True, closed='right', label='right', how='mean', convention='e', origin='start_day'), 'Buyer'] (3.8%)
    1 [TimeGrouper(key='Date', freq=<6 * MonthEnds>, axis=0, sort=True, closed='right', label='right', how='mean', convention='e', origin='start_day'), 'Buyer'] (3.8%)
    1 [TimeGrouper(level='Date', freq=<6 * MonthEnds>, axis=0, sort=True, closed='right', label='right', how='mean', convention='e', origin='start_day'), 'Buyer'] (3.8%)
  104 failed (41.6%)
  116 passed (46.4%)
```

After:

```
250 total test cases:
  0 skipped (0.0%)
  15 won't implement (6.0%)
    9 NameError following apache_beam.dataframe.frame_base.WontImplementError (60.0%)
    3 non-deferred (20.0%)
    1 order sensitive (6.7%)
    1 Conversion to a non-deferred a numpy array. (6.7%)
    1 order-sensitive (6.7%)
  51 not implemented (yet) (20.4%)
    16 NameError following NotImplementedError (31.4%)
    14 'get_group' is not yet supported (BEAM-9547) (27.5%)
    6 'order sensitive' is not yet supported (BEAM-9547) (11.8%)
    5 GroupBy.agg currently only supports callable arguments (9.8%)
    3 groupby(as_index=False) (5.9%)
    1 [Grouper(level=1, axis=0, sort=False), 'A'] (2.0%)
    1 [Grouper(level='second', axis=0, sort=False), 'A'] (2.0%)
    1 'rolling' is not yet supported (BEAM-9547) (2.0%)
    1 [TimeGrouper(key='Date', freq=<MonthEnd>, axis=0, sort=True, closed='right', label='right', how='mean', convention='e', origin='start_day'), 'Buyer'] (2.0%)
    1 [TimeGrouper(key='Date', freq=<6 * MonthEnds>, axis=0, sort=True, closed='right', label='right', how='mean', convention='e', origin='start_day'), 'Buyer'] (2.0%)
    1 [TimeGrouper(level='Date', freq=<6 * MonthEnds>, axis=0, sort=True, closed='right', label='right', how='mean', convention='e', origin='start_day'), 'Buyer'] (2.0%)
    1 index.year (2.0%)
  49 failed (19.6%)
  135 passed (54.0%)
```

Most of what remains is agg for multiple aggregations, which will be a future PR.

------------------------

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

 - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
 - [ ] Update `CHANGES.md` with noteworthy changes.
 - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).

See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.

See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: [email protected]
