Artur Sukhenko created SPARK-20244:
--------------------------------------

             Summary: Incorrect input size in UI with pyspark
                 Key: SPARK-20244
                 URL: https://issues.apache.org/jira/browse/SPARK-20244
             Project: Spark
          Issue Type: Bug
          Components: Web UI
    Affects Versions: 2.1.0, 2.0.0
            Reporter: Artur Sukhenko
            Priority: Minor


In the Spark UI (Details for Stage), Input Size is reported as 64.0 KB when 
running in PySparkShell. 
The input size shown in the Tasks table (size / records) is also incorrect in pyspark:
64.0 KB / 132120575 in pyspark
252.0 MB / 132120575 in spark-shell

I will attach screenshots.

Steps to reproduce:
Generate a large file (press Ctrl+C after 5-6 seconds), copy it to HDFS, and start pyspark:
$ yes > /tmp/yes.txt
$ hadoop fs -copyFromLocal /tmp/yes.txt /tmp/
$ ./bin/pyspark
{code}
Python 2.7.5 (default, Nov  6 2016, 00:28:07) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.1.0
      /_/

Using Python version 2.7.5 (default, Nov  6 2016 00:28:07)
SparkSession available as 'spark'.
>>> a = sc.textFile("/tmp/yes.txt")
>>> a.count()
{code}


Open the Spark UI and check the Input Size for Stage 0.
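
As a rough sanity check on the two figures, the expected input size can be derived from the record count. This is only a sketch under two assumptions that are not stated above: that /tmp/yes.txt contains nothing but the default output of yes (the two-byte line "y\n"), and that the UI formats sizes in 1024-based units. The variable names are illustrative.

```python
# Sanity check: derive the expected input size from the record count.
# Assumptions (not confirmed by the report): every record is the 2-byte
# line "y\n" written by `yes`, and the UI uses 1024-based units.
RECORDS = 132120575             # record count shown in the Tasks table
BYTES_PER_RECORD = len(b"y\n")  # 2 bytes per line

expected_bytes = RECORDS * BYTES_PER_RECORD
expected_mb = expected_bytes / (1024.0 * 1024.0)

print("expected bytes: %d" % expected_bytes)
print("expected size:  %.1f MB" % expected_mb)
```

Under these assumptions the result comes out at about 252.0 MB, which agrees with the spark-shell figure, so the 64.0 KB shown for pyspark appears to be the incorrect value.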


