Great replies, everyone.
Confining ourselves to the subject at hand, Spark and its CPU
allocation, we have these spark-submit parameters:
Local mode
${SPARK_HOME}/bin/spark-submit \
--num-executors 1 \
--master local[2] \ ## two cores
And that --master local[k] on
Agreed it’s a worthwhile discussion (and interesting IMO)
This is a section from your original post:
> It is about the terminology or interpretation of that in Spark doc.
>
> This is my understanding of cores and threads.
>
> Cores are physical cores. Threads are virtual cores.
At least as
I mean only that hardware-level threads and the processor's scheduling of
those threads is only one segment of the total space of threads and thread
scheduling, and that saying things like cores have threads or only the core
schedules threads can be more confusing than helpful.
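
To illustrate the point that software threads are not tied one-to-one to hardware threads, here is a minimal Java sketch. It starts four times as many threads as the JVM reports processors, and all of them still run to completion because the OS scheduler multiplexes them onto whatever hardware is there. The class name MoreThreadsThanCores, the helper runThreads and the factor of 4 are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class MoreThreadsThanCores {

    // Starts n software threads and returns how many ran to completion.
    // (Hypothetical helper, for illustration only.)
    static int runThreads(int n) {
        AtomicInteger completed = new AtomicInteger();
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            // Each thread does a trivial piece of work: bump the counter.
            threads[i] = new Thread(completed::incrementAndGet);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait until the scheduler has run this thread to the end
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        int processors = Runtime.getRuntime().availableProcessors();
        int n = processors * 4; // deliberately more threads than processors
        System.out.println("processors reported by the JVM: " + processors);
        System.out.println("threads started: " + n + ", completed: " + runThreads(n));
    }
}
```

The number of threads a program creates is a software decision; how many of them execute at the same instant is bounded by the hardware.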
On Thu, Jun 16,
Well LOL
Given a set of parameters one can argue from any angle.
It is not obvious what you are trying to state here. "It is not strictly
true" yeah OK
Dr Mich Talebzadeh
LinkedIn *
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> In addition, it is the core (not the OS) that determines when the thread
> is executed.
That's also not strictly true. "Thread" is a concept that can exist at
multiple levels -- even concurrently at multiple levels for a single
running program. Different entities will be responsible for
>
> Actually, threads are a hardware implementation - hence the whole notion
> of “multi-threaded cores”.
No, a multi-threaded core is a core that supports multiple concurrent
threads of execution, not a core that has multiple threads. The
terminology and marketing around multi-core processors,
Just wondering: if threads were purely a hardware implementation, then if
my application in Java had one thread and it was run on a multicore machine,
that thread in Java could be split up into small parts and run on
different cores simultaneously. However, this would raise synchronization
Thanks all.
I think we are diverging but IMO it is a worthwhile discussion
Actually, threads are a hardware implementation - hence the whole notion of
“multi-threaded cores”. What happens is that the cores often have
duplicate registers, etc. for holding execution state. While it is
correct
Mich
>> A core may have one or more threads
It would be more accurate to say that a core could run one or more threads
scheduled for execution. Threads are a software/OS concept that represents
executable code that is scheduled to run by the OS; a CPU, core or virtual
core/virtual processor
I think it is slightly more than that.
These days software is licensed by core (generally speaking). That is
the physical processor. *A core may have one or more threads - or
logical processors*. Virtualization adds some fun to the mix. Generally
what they present is ‘virtual processors’.
I don't know what documentation you were referring to, but this is clearly
an erroneous statement: "Threads are virtual cores." At best it is
terminology abuse by a hardware manufacturer. Regardless, Spark can't get
too concerned about how any particular hardware vendor wants to refer to
the
Spark is a software product. In software a "core" is something that a
process can run on. So it's a "virtual core". (Do not call these "threads".
A "thread" is not something a process can run on.)
local[*] uses java.lang.Runtime.availableProcessors()
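
For anyone who wants to see what that call returns on their own machine, here is a minimal Java sketch. The class name LocalCores and the helper availableCores are made up for illustration; the only real API used is Runtime.availableProcessors(). On hyper-threaded hardware the JVM typically reports logical processors rather than physical cores, and the value can be lower if the process is restricted to a subset of CPUs.

```java
public class LocalCores {

    // Number of processors the JVM reports as available to this process.
    // (Hypothetical helper, for illustration only.)
    static int availableCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("JVM reports " + availableCores() + " available processors");
    }
}
```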
Hi,
I was writing some docs on Spark P and came across this.
It is about the terminology or interpretation of that in Spark doc.
This is my understanding of cores and threads.
Cores are physical cores. Threads are virtual cores. A core with 2 threads
is called hyper-threading technology, so 2