Please note that limit reduces the result to a single partition.
If it is only 100 records it should fit in one executor, so
limit followed by a write is fine.
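A minimal sketch of that suggestion — the table stand-in and output path are assumptions, not from the thread, and on a real cluster the path would be an hdfs:// URI. The block is guarded so it is a no-op where pyspark is not installed:

```python
# Sketch of "limit then write": cap the DataFrame at 100 rows and write
# that small result out. Names and paths below are hypothetical.
try:
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .master("local[1]")            # local mode, just for the sketch
             .appName("limit-then-write")
             .getOrCreate())
    df = spark.range(1000).toDF("id")       # stand-in for the real DataFrame
    out = "/tmp/limit_write_out"            # hypothetical path
    # limit(100) caps the result at 100 rows, so the write produces a
    # single small output file rather than one file per partition.
    df.limit(100).write.mode("overwrite").csv(out)
    written = spark.read.csv(out).count()
    spark.stop()
except ImportError:
    written = None                          # pyspark unavailable here
```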
From: Brandon Geise
Sent: Sunday, April 14, 2019 9:54 AM
To: Chetan Khatri
Cc: Nuthan Reddy ; user
Subject: Re: How to
Use .limit on the DataFrame followed by .write.
On Apr 14, 2019, at 5:10 AM, Chetan Khatri wrote:
Nuthan,
Thank you for the reply. The solution proposed will give everything; for me it
is like one DataFrame show(100) buried in 3000 lines of Scala Spark code.
However, yarn logs --applicationId <application id> > 1.log also gives all
stdout and stderr.
Thanks
Thanks
On Sun, Apr 14, 2019 at 10:30 AM Nuthan Reddy wrote:
Hi Chetan,
You can use:
spark-submit showDF.py | hadoop fs -put - showDF.txt
showDF.py:
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("Write stdout").getOrCreate()
spark.sparkContext.setLogLevel("OFF")
spark.table("").show(100, truncate=False)
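If you would rather capture the printed output inside the script instead of piping through the shell, Python's contextlib.redirect_stdout can collect whatever show() prints into a buffer. A minimal sketch of just the redirection (a plain-Python stand-in for show(), no Spark required; the file path is hypothetical):

```python
import io
from contextlib import redirect_stdout

def show():
    # Stand-in for df.show(100), which prints its table to stdout.
    print("+---+\n| id|\n+---+\n|  1|\n+---+")

buf = io.StringIO()
with redirect_stdout(buf):       # everything printed inside lands in buf
    show()
captured = buf.getvalue()

# captured can now be written to a local file (or streamed to HDFS with
# a command such as `hadoop fs -put -`, as in the pipe shown earlier).
with open("/tmp/showDF.txt", "w") as f:  # hypothetical local path
    f.write(captured)
```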
But is there any
Hello Users,
In Spark, when I have a DataFrame and do .show(100), I want to save the
printed output exactly as it appears to a txt file in HDFS.
How can I do this?
Thanks