How about running this -

select * from
  (select *, count(*) over (partition by id) c from filteredDS) f
where f.c < 7500
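For reference, the same window-count filter can be sketched directly in the Java Dataset API, which avoids round-tripping through a SQL string. This is a sketch, not code from the thread: the method name `keepSmallGroups` is invented here, `filteredDS` is the Dataset discussed below, and the threshold is passed as a parameter (the SQL above uses 7500).

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.expressions.Window;
import static org.apache.spark.sql.functions.*;

public class WindowCountFilter {
    // Attach a per-id row count via a window function, keep only rows
    // whose id occurs fewer than `threshold` times, then drop the helper
    // column. No separate aggregate-and-join step is needed.
    static Dataset<Row> keepSmallGroups(Dataset<Row> filteredDS, long threshold) {
        return filteredDS
            .withColumn("c", count(lit(1)).over(Window.partitionBy(col("id"))))
            .filter(col("c").lt(threshold))
            .drop("c");
    }
}
```

Note the window spec only partitions by `id`; adding an `order by` would turn the count into a running count within each partition.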
On Sun, Mar 5, 2017 at 12:05 PM, Ankur Srivastava <
ankur.srivast...@gmail.com> wrote:
Yes, every time I run this code with production-scale data it fails. A test case
with a small dataset of 50 records on a local box runs fine.
Thanks
Ankur
Sent from my iPhone
On Mar 4, 2017, at 12:09 PM, ayan guha wrote:

Just to be sure, can you reproduce the error using the SQL API?
On Sat, 4 Mar 2017 at 2:32 pm, Ankur Srivastava wrote:
Adding DEV.
Or is there any other way to do subtractByKey using Dataset APIs?
Thanks
Ankur
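For what it's worth, the usual Dataset-API replacement for the RDD `subtractByKey` operation is a `left_anti` join on the key column. A minimal sketch, assuming the left Dataset carries an `id` column and the right one the `bid` alias used in the code later in this thread; the name `subtractByKey` here is just illustrative:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

public class SubtractByKeyExample {
    // Returns the rows of `ds` whose id does NOT appear in `badIds` --
    // the Dataset analogue of RDD.subtractByKey. A left_anti join emits
    // only left-side rows with no match on the right.
    static Dataset<Row> subtractByKey(Dataset<Row> ds, Dataset<Row> badIds) {
        return ds.join(badIds, ds.col("id").equalTo(badIds.col("bid")), "left_anti");
    }
}
```

Unlike a `left_outer` join followed by a null filter, `left_anti` never materializes the right-side columns in the output, which also sidesteps null-handling in the joined schema.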
On Wed, Mar 1, 2017 at 1:28 PM, Ankur Srivastava wrote:
Hi Users,
We are facing an issue with a left_outer join using the Spark Dataset API in 2.0
(Java API). Below is the code we have:
Dataset<Row> badIds = filteredDS.groupBy(col("id").alias("bid")).count()
    .filter((FilterFunction<Row>) row -> (Long) row.getAs("count") > 75000);
_logger.info("Id count with