Hi Mich,
Thank you for your input.
Does monotonically_increasing_id() guard against race conditions, or can it
produce duplicate IDs at some point with multiple threads, multiple instances, ...?

Can even System.currentTimeMillis() still produce duplicates?
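
To make the concern concrete, here is a minimal sketch of what I mean. The class and method names are made up for illustration, and it only covers a single JVM, so it says nothing about how Spark itself assigns IDs. It counts how many distinct values come back from repeated System.currentTimeMillis() calls, and how many distinct IDs several threads get from one shared AtomicLong.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper class, for illustration only.
class IdDuplicationSketch {

    // Calls System.currentTimeMillis() `calls` times in a tight loop and
    // returns how many distinct values were observed.
    static int distinctTimestamps(int calls) {
        Set<Long> seen = new HashSet<>();
        for (int i = 0; i < calls; i++) {
            seen.add(System.currentTimeMillis());
        }
        return seen.size();
    }

    // Starts `threads` threads that each draw `perThread` ids from one shared
    // AtomicLong, and returns how many distinct ids were handed out.
    static int distinctCounterIds(int threads, int perThread) {
        AtomicLong counter = new AtomicLong(0);
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    seen.add(counter.incrementAndGet());
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int calls = 100_000;
        System.out.println("timestamp calls: " + calls
                + ", distinct values: " + distinctTimestamps(calls));
        System.out.println("counter ids requested: " + (4 * 10_000)
                + ", distinct ids: " + distinctCounterIds(4, 10_000));
    }
}
```

Within one JVM the shared counter never repeats, but two separate instances that each start their counter at 0 would hand out the same values, which is why I am asking about the multi-instance case.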

Cheers,
Kevin.

On Mon, Sep 5, 2016 at 12:30 AM, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> You can create a monotonically incrementing ID column on your table
>
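> scala> import org.apache.spark.sql.functions.monotonically_increasing_id
> scala> val ll_18740868 = spark.table("accounts.ll_18740868")
> scala> val startval = 1
> scala> // add the ID column first, then display it (show returns Unit)
> scala> val df = ll_18740868.withColumn("id", monotonically_increasing_id() + startval)
> scala> df.show(2)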
> +---------------+---------------+---------+-------------+----------------------+-----------+------------+-------+---+
> |transactiondate|transactiontype| sortcode|accountnumber|transactiondescription|debitamount|creditamount|balance| id|
> +---------------+---------------+---------+-------------+----------------------+-----------+------------+-------+---+
> |     2011-12-30|            DEB|'30-64-72|     18740868|  WWW.GFT.COM CD 4628 |       50.0|        null| 304.89|  1|
> |     2011-12-30|            DEB|'30-64-72|     18740868|  TDA.CONFECC.D.FRE...|      19.01|        null| 354.89|  2|
> +---------------+---------------+---------+-------------+----------------------+-----------+------------+-------+---+
>
>
> Now you have a new ID column
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 4 September 2016 at 12:43, Kevin Tran <kevin...@gmail.com> wrote:
>
>> Hi everyone,
>> Please give me your opinions on what is the best ID Generator for ID
>> field in parquet ?
>>
>> UUID.randomUUID();
>> AtomicReference<Long> currentTime = new AtomicReference<>(System.currentTimeMillis());
>> AtomicLong counter = new AtomicLong(0);
>> ....
>>
>> Thanks,
>> Kevin.
>>
>>
>> ----
>> https://issues.apache.org/jira/browse/SPARK-8406 (Race condition when
>> writing Parquet files)
>> https://github.com/apache/spark/pull/6864/files
>>
>
>
