[
https://issues.apache.org/jira/browse/BEAM-13421?focusedWorklogId=697528&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-697528
]
ASF GitHub Bot logged work on BEAM-13421:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 16/Dec/21 21:57
Start Date: 16/Dec/21 21:57
Worklog Time Spent: 10m
Work Description: codecov[bot] edited a comment on pull request #16258:
URL: https://github.com/apache/beam/pull/16258#issuecomment-996201095
# [Codecov](https://codecov.io/gh/apache/beam/pull/16258?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging
[#16258](https://codecov.io/gh/apache/beam/pull/16258?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
(67818cd) into
[master](https://codecov.io/gh/apache/beam/commit/15048929495ad66963b528d5bd71eb7b4a844c96?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
(1504892) will **increase** coverage by `37.52%`.
> The diff coverage is `100.00%`.
```diff
@@             Coverage Diff             @@
##           master   #16258       +/-   ##
===========================================
+ Coverage   46.13%   83.66%   +37.52%
===========================================
  Files         197      447      +250
  Lines       19519    61705    +42186
===========================================
+ Hits         9006    51626    +42620
- Misses       9542    10079      +537
+ Partials      971        0      -971
```
| [Impacted Files](https://codecov.io/gh/apache/beam/pull/16258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [sdks/python/apache\_beam/dataframe/frames.py](https://codecov.io/gh/apache/beam/pull/16258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZGF0YWZyYW1lL2ZyYW1lcy5weQ==) | `94.90% <100.00%> (ø)` | |
| [sdks/go/pkg/beam/provision/provision.go](https://codecov.io/gh/apache/beam/pull/16258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9wcm92aXNpb24vcHJvdmlzaW9uLmdv) | | |
| [sdks/go/pkg/beam/core/graph/scope.go](https://codecov.io/gh/apache/beam/pull/16258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL3Njb3BlLmdv) | | |
| [sdks/go/pkg/beam/core/util/reflectx/structs.go](https://codecov.io/gh/apache/beam/pull/16258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3V0aWwvcmVmbGVjdHgvc3RydWN0cy5nbw==) | | |
| [...pkg/beam/runners/dataflow/dataflowlib/translate.go](https://codecov.io/gh/apache/beam/pull/16258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9ydW5uZXJzL2RhdGFmbG93L2RhdGFmbG93bGliL3RyYW5zbGF0ZS5nbw==) | | |
| [sdks/go/pkg/beam/core/runtime/graphx/dataflow.go](https://codecov.io/gh/apache/beam/pull/16258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZ3JhcGh4L2RhdGFmbG93Lmdv) | | |
| [sdks/go/pkg/beam/metrics.go](https://codecov.io/gh/apache/beam/pull/16258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9tZXRyaWNzLmdv) | | |
| [sdks/go/pkg/beam/core/runtime/metricsx/urns.go](https://codecov.io/gh/apache/beam/pull/16258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvbWV0cmljc3gvdXJucy5nbw==) | | |
| [sdks/go/pkg/beam/artifact/materialize.go](https://codecov.io/gh/apache/beam/pull/16258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9hcnRpZmFjdC9tYXRlcmlhbGl6ZS5nbw==) | | |
| [sdks/go/pkg/beam/testing/passert/equals.go](https://codecov.io/gh/apache/beam/pull/16258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS90ZXN0aW5nL3Bhc3NlcnQvZXF1YWxzLmdv) | | |
| ... and [635 more](https://codecov.io/gh/apache/beam/pull/16258/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at
Codecov](https://codecov.io/gh/apache/beam/pull/16258?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn
more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by
[Codecov](https://codecov.io/gh/apache/beam/pull/16258?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
Last update
[1504892...67818cd](https://codecov.io/gh/apache/beam/pull/16258?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
Read the [comment
docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 697528)
Time Spent: 0.5h (was: 20m)
> Python DeferredDataFrame.xs differs from Pandas
> -----------------------------------------------
>
> Key: BEAM-13421
> URL: https://issues.apache.org/jira/browse/BEAM-13421
> Project: Beam
> Issue Type: Bug
> Components: dsl-dataframe
> Affects Versions: 2.34.0
> Environment: Tested in Jupyter Notebook running in Docker.
> The docker file is produced by a modified version of
> https://github.com/fozziethebeat/gpu-jupyter/blob/master/.build/Dockerfile
> Reporter: Keith Stevens
> Assignee: Brian Hulette
> Priority: P2
> Fix For: 2.36.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> When testing the `xs` method on DeferredDataFrames I'm seeing a few
> inconsistent results. I have two minimal examples that showcase the errors.
>
> First inconsistency: Beam's `xs` requires at least one index field to be
> left over, while Pandas does not.
> {code:java}
> import numpy as np
> import pandas as pd
> import apache_beam as beam
> from apache_beam.dataframe.io import read_parquet
> from apache_beam.options.pipeline_options import PipelineOptions
>
> with beam.Pipeline(options=PipelineOptions()) as pipeline:
>     df = pd.DataFrame(
>         np.array([
>             ['state', 'day1', 12],
>             ['state', 'day1', 1],
>             ['state', 'day2', 14],
>             ['county', 'day1', 9],
>         ]),
>         columns=['provider', 'time', 'value'])
>     # Create just one index field
>     df = df.set_index(['provider'])
>     df.to_parquet('test.parquet')
>
>     # Should print out
>     #           time value
>     # provider
>     # state     day1    12
>     # state     day1     1
>     # state     day2    14
>     print(df.xs('state'))
>
>     # Should emit the same data to a csv but instead dies due to
>     # Cannot remove 1 levels from an index with 1 levels: at least one level must be left.
>     test_df = (pipeline | read_parquet('test.parquet'))
>     (
>         test_df.xs('state').to_csv('test.csv')
>     )
> {code}
> Second inconsistency: Beam dies for no clear reason
> {code:java}
> import numpy as np
> import pandas as pd
> import apache_beam as beam
> from apache_beam.dataframe.io import read_parquet
> from apache_beam.options.pipeline_options import PipelineOptions
>
> with beam.Pipeline(options=PipelineOptions()) as pipeline:
>     df = pd.DataFrame(
>         np.array([
>             ['state', 'day1', 12],
>             ['state', 'day1', 1],
>             ['state', 'day2', 14],
>             ['county', 'day1', 9],
>         ]),
>         columns=['provider', 'time', 'value'])
>     # Create two index fields to satisfy Beam
>     df = df.set_index(['provider', 'time'])
>     df.to_parquet('test.parquet')
>
>     # Should print out
>     #      value
>     # time
>     # day1    12
>     # day1     1
>     # day2    14
>     print(df.xs('state'))
>
>     # Dies with no clear error at
>     # /opt/conda/lib/python3.9/site-packages/apache_beam/dataframe/transforms.py in output_partitioning_in_stage(expr, stage)
>     #     305
>     #     306     # Anything that's not an input must have arguments
>     #     307     assert len(expr.args())
>     #     308
>     #     309     arg_partitionings = set(
>     test_df = (pipeline | read_parquet('test.parquet'))
>     (
>         test_df.xs('state').to_csv('test.csv')
>     )
> {code}
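
For reference, the pandas side of both examples above can be checked without Beam. The sketch below is an illustration only, not part of the original report: `build_frame` is a hypothetical helper, and it builds the frame from plain Python lists rather than `np.array`, so the `value` column stays integer instead of being coerced to strings. It shows the behavior that `DeferredDataFrame.xs` is expected to match.

```python
import pandas as pd


def build_frame(index_cols):
    """Hypothetical helper: the sample data from the report, indexed by index_cols."""
    df = pd.DataFrame(
        [
            ['state', 'day1', 12],
            ['state', 'day1', 1],
            ['state', 'day2', 14],
            ['county', 'day1', 9],
        ],
        columns=['provider', 'time', 'value'])
    return df.set_index(index_cols)


# Case 1: a single index level. pandas keeps the 'provider' index and
# returns the three matching rows; no level is dropped.
single = build_frame(['provider']).xs('state')

# Case 2: two index levels. pandas drops the matched 'provider' level
# and leaves 'time' as the index.
multi = build_frame(['provider', 'time']).xs('state')

print(single)
print(multi)
```

In both cases pandas returns the three `state` rows; only the second case drops an index level.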
--
This message was sent by Atlassian Jira
(v8.20.1#820001)