I believe take on an empty dataset will return an empty Array rather than
throw an exception.

df.take(1).isEmpty should work
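
To make that concrete, here is a minimal sketch of the helper from the original question rewritten around take(1). The object name EmptyCheck, the local SparkSession, and the sample data are my own assumptions for illustration only; I am also assuming Spark 2.x on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object EmptyCheck {
  // True when there is no DataFrame at all, or when it has no rows.
  // take(1) returns an Array[Row], which is simply empty for an empty
  // DataFrame; Option.forall yields true for None.
  def isEmpty(df: Option[DataFrame]): Boolean =
    df.forall(_.take(1).isEmpty)

  def main(args: Array[String]): Unit = {
    // Local session purely for illustration (hypothetical setup).
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("empty-check")
      .getOrCreate()
    import spark.implicits._

    val nonEmpty = Seq(1, 2, 3).toDF("n")
    val empty    = nonEmpty.filter($"n" > 10) // no row satisfies this

    assert(!isEmpty(Some(nonEmpty)))
    assert(isEmpty(Some(empty)))
    assert(isEmpty(None))

    spark.stop()
  }
}
```

As far as I know it is first() and the no-argument head() that throw java.util.NoSuchElementException on an empty dataset, not take(n). Note also that take(1) still has to run enough of the upstream plan to produce one row, so if the third-party library builds the dataframe with expensive transformations, this check may not be cheap either.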

On Tue, 7 Mar 2017 at 07:42, Deepak Sharma <deepakmc...@gmail.com> wrote:

> If the df is empty, .take would throw
> java.util.NoSuchElementException.
> This can be done as below:
> df.rdd.isEmpty
>
>
> On Tue, Mar 7, 2017 at 9:33 AM, <jasbir.s...@accenture.com> wrote:
>
> DataFrame.take(1) is faster.
>
>
>
> *From:* ashaita...@nz.imshealth.com [mailto:ashaita...@nz.imshealth.com]
> *Sent:* Tuesday, March 07, 2017 9:22 AM
> *To:* user@spark.apache.org
> *Subject:* Check if dataframe is empty
>
>
>
> Hello!
>
>
>
> I am pretty sure that I am asking something which has already been asked
> many times. However, I cannot find the question in the mailing list
> archive.
>
>
>
> The question is: I need to check whether a dataframe is empty or not. I
> receive a dataframe from a 3rd-party library, and this dataframe can
> potentially be empty, but it can also be really huge (millions of rows).
> Thus, I want to skip some logic when the dataframe is empty. How can I
> check this efficiently?
>
>
>
> Right now I am doing it in the following way:
>
>
>
> private def isEmpty(df: Option[DataFrame]): Boolean = {
>   df.isEmpty || (df.isDefined && df.get.limit(1).rdd.isEmpty())
> }
>
>
>
> But the performance is really slow for big dataframes. I would be grateful
> for any suggestions.
>
>
>
> Thank you in advance.
>
>
>
>
> Best regards,
>
>
>
> Artem
>
>
>
> --
> Thanks
> Deepak
> www.bigdatabig.com
> www.keosha.net
>
