Hi,

Broadcast variables are definitely stored in the storage memory region sized by spark.memory.storageFraction.

If we go into the code of TorrentBroadcast.scala and follow the
writeBlocks method through BlockManager to MemoryStore, deserialization
of the variables occurs in unroll memory, which is then transferred to
storage memory:

    memoryManager.synchronized {
      releaseUnrollMemoryForThisTask(MemoryMode.ON_HEAP, amount)
      val success = memoryManager.acquireStorageMemory(blockId, amount, MemoryMode.ON_HEAP)
      ...
    }

So broadcast variables are definitely stored in the storage memory
controlled by spark.memory.storageFraction.

Can you explain how you are seeing a smaller amount of memory used on a
given executor for broadcast variables in the UI?

Regards,
Pralabh Kumar

On Thu, Jun 22, 2017 at 4:39 AM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:

> Satish,
>
> I agree - that was my impression too. However I am seeing a smaller set of
> storage memory used on a given executor than the amount of memory required
> for my broadcast variables. I am wondering if the statistics in the UI are
> incorrect or if the broadcasts are simply not a part of that storage memory
> fraction.
>
> Bryan Jeffrey
>
> On Wed, Jun 21, 2017 at 6:48 PM -0400, "satish lalam" <satish.la...@gmail.com> wrote:
>
>> My understanding is that it comes from storageFraction. Here cached
>> blocks are immune to eviction, so both persisted RDDs and broadcast
>> variables sit here. Ref
>> <https://image.slidesharecdn.com/sparkinternalsworkshoplatest-160303190243/95/apache-spark-in-depth-core-concepts-architecture-internals-20-638.jpg?cb=1457597704>
>>
>> On Wed, Jun 21, 2017 at 1:43 PM, Bryan Jeffrey <bryan.jeff...@gmail.com>
>> wrote:
>>
>>> Hello.
>>>
>>> Question: Do broadcast variables stored on executors count as part of
>>> 'storage memory' or other memory?
>>>
>>> A little bit more detail:
>>>
>>> I understand that we have two knobs to control memory allocation:
>>> - spark.memory.fraction
>>> - spark.memory.storageFraction
>>>
>>> My understanding is that spark.memory.storageFraction controls the
>>> amount of memory allocated for cached RDDs. spark.memory.fraction controls
>>> how much memory is allocated to Spark operations (task serialization,
>>> operations, etc.), w/ the remainder reserved for user data structures,
>>> Spark internal metadata, etc. This includes the storage memory for cached
>>> RDDs.
>>>
>>> You end up with executor memory that looks like the following:
>>> All memory: 0-100
>>> Spark memory: 0-75
>>> RDD Storage: 0-37
>>> Other Spark: 38-75
>>> Other Reserved: 76-100
>>>
>>> Where do broadcast variables fall into the mix?
>>>
>>> Regards,
>>>
>>> Bryan Jeffrey
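
For reference, the split described in the quoted question can be sketched as plain arithmetic. This is only an illustration, not Spark code: the function name memory_regions and its parameter names are hypothetical, the 300 MiB reserve is an assumption about the unified memory manager, and the 0.75 / 0.5 values are chosen to match the 0-75 / 0-37 ranges in the question (actual defaults depend on the Spark version). The storage figure is also a soft boundary rather than a hard cap, since storage and execution can borrow from each other.

```python
# Rough sizing of executor memory regions under the unified memory model.
# All names here are hypothetical; nothing below calls into Spark.

RESERVED_BYTES = 300 * 1024 * 1024  # assumed fixed reserve taken off the heap


def memory_regions(heap_bytes, memory_fraction=0.75, storage_fraction=0.5):
    """Return approximate region sizes in bytes for a given executor heap.

    memory_fraction  stands in for spark.memory.fraction
    storage_fraction stands in for spark.memory.storageFraction
    """
    usable = heap_bytes - RESERVED_BYTES
    if usable <= 0:
        raise ValueError("heap is smaller than the assumed reserve")

    spark = int(usable * memory_fraction)    # execution + storage together
    storage = int(spark * storage_fraction)  # cached RDDs and broadcast blocks
    return {
        "spark": spark,
        "storage": storage,
        "execution": spark - storage,
        "user": usable - spark,              # user data structures, metadata
    }


if __name__ == "__main__":
    mib = 1024 * 1024
    regions = memory_regions(4096 * mib)     # e.g. a 4 GiB executor heap
    for name, size in regions.items():
        print(f"{name:>9}: {size / mib:8.1f} MiB")
```

Comparing the "storage" figure from a calculation like this against the storage memory shown in the UI for one executor is one way to check whether the broadcast blocks are being counted there.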