" of items and hence produce more
> than one shard. See also
> https://beam.apache.org/documentation/runtime/model/
>
> On Thu, May 13, 2021 at 3:58 PM Wenbing Bai
> wrote:
>
>> Hi team,
>>
>> I have another question when using Beam Dataframe IO connect
> Kenn
>
> On Mon, May 10, 2021 at 5:26 PM Wenbing Bai
> wrote:
>
>> Hi Robert and Brian,
>>
>> I don't know why I didn't catch your replies. But thank you so much for
>> looking at this.
>>
>> My parquet files will be consumed by downstre
>> partition_cols should work, I filed BEAM-12201 [1] for this. That alone
>> won't be enough as our implementation will likely reshuffle the dataset to
>> enforce the partitioning, removing any sorting that you've applied, so we'd
>> also need to think about how to opti
https://github.com/apache/beam/blob/a8cd05932bed9b2480316fb8518409636cb2733b/sdks/python/apache_beam/dataframe/io.py#L525
>
> On Wed, Apr 7, 2021 at 2:22 PM Wenbing Bai
> wrote:
>
>> Hi Robert and Brian,
>>
>> I tried groupby in my case. Here is my pipeline code. I do see
{}.parquet'.format(str(uuid.uuid4())[:8]), engine='pyarrow', index=False)
On Fri, Apr 2, 2021 at 10:00 AM Wenbing Bai
wrote:
> Thank you, Robert and Brian.
>
> I'd like to try this out. I am trying to distribute my dataset to nodes,
> sort each partition by some key and then store each
s are by definition unordered, so
>> unless you sort a partition and immediately do something with it that
>> ordering may not be preserved. If you could let us know what you're trying
>> to do with this ordering that would be helpful.
>>
>> - Robert
>>
>>
>> O
will be distributed to different nodes. I also tried df.sort_values,
but it will sort my whole dataset, which is not what I need.
Can someone shed some light on this?
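
For reference, the intent described above (sort within each partition rather than across the whole dataset) can be sketched in plain Python. This is only an illustration of the desired result, not Beam API code; the function name `sort_within_partitions` and the sample field names are hypothetical.

```python
from collections import defaultdict
from operator import itemgetter


def sort_within_partitions(rows, partition_key, sort_key):
    """Group rows by partition_key, then sort each group by sort_key.

    Rows in different partitions are never compared with each other,
    unlike a global sort of the whole dataset.
    """
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)
    return {
        key: sorted(group, key=itemgetter(sort_key))
        for key, group in partitions.items()
    }


# Hypothetical sample data; the field names are made up for illustration.
rows = [
    {"region": "west", "ts": 3},
    {"region": "east", "ts": 2},
    {"region": "west", "ts": 1},
    {"region": "east", "ts": 1},
]
result = sort_within_partitions(rows, "region", "ts")
```

In a pipeline, the grouping step would presumably correspond to a keyed grouping, with the sort applied to each group immediately before it is written, since ordering is not otherwise guaranteed to be preserved.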
Wenbing Bai
Senior Software Engineer
Data Infrastructure, Cruise
Pronouns: She/Her
--
*Confidentiality Note:* We care
(WriteRecordsToFile)/ParDo(WriteRecordsToFile)/ParDo(WriteRecordsToFile)']
Has anyone seen this before? Can I get any hints on where the Dataflow
worker writes data to Avro?
--
Wenbing Bai
Senior Software Engineer, MLP
Cruise
Pronouns: She/Her