Hi,

I could not find a useful formula or documentation that would help me decide how large a dataset can be used in a broadcast join for a given cluster size.
Please let me know whether there is a rule of thumb for this. For example:

Cluster size: 20 nodes, 32 GB and 8 cores per node.
executor-memory = 8 GB, executor-cores = 4

Memory: of the 8 GB, a fraction of 0.4 goes to internal use, which leaves 4.8 GB for actual computation and storage. Let's assume I have not persisted anything, so in this case I could use the full 4.8 GB per executor.

Is it possible for me to use a 400 MB file for a broadcast join?

--
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
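
For reference, the arithmetic behind the numbers above can be sketched as follows. This is only a back-of-the-envelope calculation, not a Spark sizing rule: the 0.4 internal fraction and the 400 MB file size are simply the values from this question, the helper names are made up for illustration, and the size of the data once loaded in memory may differ from its size on disk.

```python
# Back-of-the-envelope check using only the numbers from the question.
# All constants are assumptions taken from the post, not Spark defaults.
EXECUTOR_MEMORY_GB = 8.0   # executor-memory
INTERNAL_FRACTION = 0.4    # share assumed to go to internal use
BROADCAST_FILE_MB = 400.0  # on-disk size of the table to broadcast


def usable_memory_gb(executor_memory_gb: float, internal_fraction: float) -> float:
    """Memory left per executor for computation and storage."""
    return executor_memory_gb * (1.0 - internal_fraction)


def broadcast_share(file_mb: float, usable_gb: float) -> float:
    """Fraction of the usable executor memory one broadcast copy would occupy."""
    return file_mb / (usable_gb * 1024.0)


usable = usable_memory_gb(EXECUTOR_MEMORY_GB, INTERNAL_FRACTION)
share = broadcast_share(BROADCAST_FILE_MB, usable)

print(f"Usable memory per executor: {usable:.1f} GB")
print(f"Broadcast file as share of usable memory: {share:.1%}")
```

Every executor holds its own copy of a broadcast table, so the share is computed per executor rather than across the whole cluster.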