Hi,

In the Spark source code, HadoopRDD.scala (under the rdd package) updates the total bytes read after every 1000 records. When I print the bytes read alongside that update call, it shows 65536. Even if I change the code to update the bytes read after every record, it still prints 65536 repeatedly until 1000 or more records have been read. Why is this? Is it because 65536 bytes is the minimum read size (the maximum IP packet size is also about 64 KB)? If not, can I change the maximum size a record can hold?
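To make what I am seeing concrete, here is a small self-contained sketch of my guess about the behavior, written outside Spark. All the names (CountingInputStream, BytesReadDemo, the 64 KB buffer size) are my own for illustration, not Spark's or Hadoop's; the assumption is that the underlying reads go through a ~64 KB buffer, so the byte counter jumps in 65536-byte steps no matter how often I sample it per record:

import java.io.{BufferedInputStream, ByteArrayInputStream, InputStream}
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for the filesystem statistics that report bytes read:
// it counts only the bytes actually pulled from the underlying source.
class CountingInputStream(in: InputStream, counter: AtomicLong) extends InputStream {
  override def read(): Int = {
    val b = in.read()
    if (b >= 0) counter.incrementAndGet()
    b
  }
  override def read(buf: Array[Byte], off: Int, len: Int): Int = {
    val n = in.read(buf, off, len)
    if (n > 0) counter.addAndGet(n)
    n
  }
}

object BytesReadDemo {
  def main(args: Array[String]): Unit = {
    val bytesRead = new AtomicLong(0)
    // 1 MB of data, read as 10-byte "records" through a 64 KB buffer (assumed size).
    val data = Array.fill[Byte](1024 * 1024)('x'.toByte)
    val counted = new CountingInputStream(new ByteArrayInputStream(data), bytesRead)
    val in = new BufferedInputStream(counted, 65536)

    val record = new Array[Byte](10)
    var records = 0L
    while (in.read(record) != -1) {
      records += 1
      // Sampling the counter per record (like my modified HadoopRDD) still shows
      // the same value many times in a row: the buffered stream only touches the
      // underlying source one buffer-sized chunk (65536 bytes) at a time.
      if (records % 1000 == 0) println(s"records=$records bytesRead=${bytesRead.get}")
    }
    in.close()
  }
}

If this picture is right, the repeated 65536 would just reflect the read buffer size rather than anything about record size, but I would like confirmation of where that number actually comes from in Spark/Hadoop.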
Thanks