Assigning a unique row ID

Everett Anderson Fri, 07 Apr 2017 15:57:01 -0700

Hi,

What's the best way to assign a truly unique row ID (rather than a hash) to
a DataFrame/Dataset?


I originally thought that functions.monotonically_increasing_id would do
this, but it seems to have a rather unfortunate property that if you add it
as a column to table A and then derive tables X, Y, Z and save those, the
row ID values in X, Y, and Z may end up different. I assume this is because
the column is evaluated lazily, so the IDs are recomputed separately each
time one of those tables is computed.
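
To illustrate the suspected cause, here is a minimal sketch in plain Python
(no Spark required). It assumes the ID layout described in the Spark
documentation for monotonically_increasing_id: the partition index in the
upper bits and the record position within the partition in the lower 33
bits. The function name assign_ids and the sample rows are hypothetical,
and this is only a model of the layout, not Spark's actual implementation.

```python
from typing import Dict, List


def assign_ids(partitions: List[List[str]]) -> Dict[str, int]:
    """Model the documented layout: partition index in the upper bits,
    position within the partition in the lower 33 bits."""
    ids: Dict[str, int] = {}
    for part_idx, rows in enumerate(partitions):
        for pos, row in enumerate(rows):
            ids[row] = (part_idx << 33) + pos
    return ids


# The same four rows, evaluated twice with different partition layouts,
# as could happen if the column is recomputed for each derived table.
first_run = assign_ids([["a", "b"], ["c", "d"]])
second_run = assign_ids([["a"], ["b", "c", "d"]])

print(first_run)
print(second_run)
```

In this model the IDs are unique within each run, but row "b" gets a
different value in the two runs because it lands in a different partition.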
