> Actually, threads are a hardware implementation - hence the whole notion of “multi-threaded cores”.
No, a multi-threaded core is a core that supports multiple concurrent threads of execution, not a core that has multiple threads. The terminology and marketing around multi-core processors, hyper-threading and virtualization are confusing enough without taking the further step of misapplying software-specific terms to hardware components.

On Thu, Jun 16, 2016 at 7:45 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Thanks all.
>
> I think we are diverging but IMO it is a worthwhile discussion
>
> Actually, threads are a hardware implementation - hence the whole notion of “multi-threaded cores”. What happens is that the cores often have duplicate registers, etc. for holding execution state. While it is correct that only a single process is executing at a time, a single core will have execution states of multiple processes preserved in these registers. In addition, it is the core (not the OS) that determines when the thread is executed. The approach often varies according to the CPU manufacturer, but the simplest approach is that when one thread of execution executes a multi-cycle operation (e.g. a fetch from main memory), the core simply stops processing that thread, saves the execution state to a set of registers, loads instructions from the other set of registers and goes on. On the Oracle SPARC chips, it will actually check the next thread to see if the reason it was ‘parked’ has completed and, if not, skip it for the subsequent thread. The OS is only aware of what are cores and what are logical processors - and dispatches accordingly. *Execution is up to the cores*.
>
> Cheers
>
> Dr Mich Talebzadeh
>
> LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
> http://talebzadehmich.wordpress.com
>
> On 16 June 2016 at 13:02, Robin East <robin.e...@xense.co.uk> wrote:
>
>> Mich
>>
>> >> A core may have one or more threads
>> It would be more accurate to say that a core could *run* one or more threads scheduled for execution. Threads are a software/OS concept that represents executable code that is scheduled to run by the OS; a CPU, core or virtual core/virtual processor executes that code. Threads are not CPUs or cores, whether physical or logical - any Spark documentation that implies this is mistaken. I’ve looked at the documentation you mention and I don’t read it to mean that threads are logical processors.
>>
>> To go back to your original question, if you set local[6] and you have 12 logical processors then you are likely to have half your CPU resources unused by Spark.
>>
>> On 15 Jun 2016, at 23:08, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>> I think it is slightly more than that.
>>
>> These days software is licensed by core (generally speaking). That is the physical processor. *A core may have one or more threads - or logical processors*. Virtualization adds some fun to the mix. Generally what they present is ‘virtual processors’. What that equates to depends on the virtualization layer itself. In some simpler VMs, it is virtual=logical. In others, virtual=logical but they are constrained to be from the same cores - e.g. if you get 6 virtual processors, it really is 3 full cores with 2 threads each. The rationale is the way OS dispatching works on ‘logical’ processors vs. cores and POSIX threaded applications.
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>> On 13 June 2016 at 18:17, Mark Hamstra <m...@clearstorydata.com> wrote:
>>
>>> I don't know what documentation you were referring to, but this is clearly an erroneous statement: "Threads are virtual cores." At best it is terminology abuse by a hardware manufacturer. Regardless, Spark can't get too concerned about how any particular hardware vendor wants to refer to the specific components of their CPU architecture. For us, a core is a logical execution unit, something on which a thread of execution can run. That can map in different ways to different physical or virtual hardware.
>>>
>>> On Mon, Jun 13, 2016 at 12:02 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> It is not the issue of testing anything. I was referring to documentation that clearly uses the term "threads". As I said and showed before, one line is using the term "thread" and the next one "logical cores".
>>>>
>>>> HTH
>>>>
>>>> Dr Mich Talebzadeh
>>>>
>>>> On 12 June 2016 at 23:57, Daniel Darabos <daniel.dara...@lynxanalytics.com> wrote:
>>>>
>>>>> Spark is a software product. In software a "core" is something that a process can run on. So it's a "virtual core". (Do not call these "threads". A "thread" is not something a process can run on.)
>>>>>
>>>>> local[*] uses java.lang.Runtime.availableProcessors() <https://github.com/apache/spark/blob/v1.6.1/core/src/main/scala/org/apache/spark/SparkContext.scala#L2608>. Since Java is software, this also returns the number of virtual cores. (You can test this easily.)
>>>>>
>>>>> On Sun, Jun 12, 2016 at 9:23 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I was writing some docs on Spark P&T and came across this.
>>>>>>
>>>>>> It is about the terminology or interpretation of that in Spark doc.
>>>>>>
>>>>>> This is my understanding of cores and threads.
>>>>>>
>>>>>> Cores are physical cores. Threads are virtual cores. A core with 2 threads is called hyper-threading technology, so 2 threads per core makes the core work on two loads at the same time. In other words, every thread takes care of one load.
>>>>>>
>>>>>> A core has its own memory. So if you have a dual core with hyper-threading, each core works with 2 loads at the same time because of the 2 threads per core, but these 2 threads will share memory in that core.
>>>>>>
>>>>>> Some vendors, as I am sure most of you are aware, charge licensing per core.
>>>>>>
>>>>>> For example, on the same host where I have Spark, I have a SAP product that checks the licensing and shuts the application down if the license does not agree with the cores specified.
>>>>>>
>>>>>> This is what it says
>>>>>>
>>>>>> ./cpuinfo
>>>>>> License hostid: 00e04c69159a 0050b60fd1e7
>>>>>> Detected 12 logical processor(s), 6 core(s), in 1 chip(s)
>>>>>>
>>>>>> So here I have 12 logical processors and 6 cores and 1 chip. I call logical processors threads, so I have 12 threads?
>>>>>>
>>>>>> Now if I go and start the worker process with ${SPARK_HOME}/sbin/start-slaves.sh, I see this in the GUI page
>>>>>>
>>>>>> <image.png>
>>>>>>
>>>>>> It says 12 cores but I gather it is threads?
>>>>>>
>>>>>> Spark document <http://spark.apache.org/docs/latest/submitting-applications.html> states and I quote
>>>>>>
>>>>>> <image.png>
>>>>>>
>>>>>> OK the line local[k] adds .. *set this to the number of cores on your machine*
>>>>>>
>>>>>> But I know that it means threads. Because if I went and set that to 6, it would be only 6 threads as opposed to 12 threads.
>>>>>>
>>>>>> The next line, local[*], seems to indicate it correctly, as it refers to "logical cores", which in my understanding are threads.
>>>>>>
>>>>>> I trust that I am not nitpicking here!
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Dr Mich Talebzadeh
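
The point made in the thread, that local[*] follows the processor count the JVM reports while local[N] asks for exactly N worker threads, can be checked with a small Java sketch. The class LocalMasterThreads and its threadsFor helper are hypothetical names made up for illustration and are not Spark's own code; only Runtime.getRuntime().availableProcessors() is a real JDK call. The sketch assumes the plain forms "local", "local[N]" and "local[*]" and ignores every other master URL.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LocalMasterThreads {
    // Matches "local[N]" or "local[*]" only; other master URL forms are out of scope here.
    private static final Pattern LOCAL_N = Pattern.compile("local\\[(\\*|[0-9]+)\\]");

    /**
     * Hypothetical helper (not Spark's implementation): the number of worker
     * threads implied by a local master string, given the logical processor count.
     */
    public static int threadsFor(String master, int logicalProcessors) {
        if (master.equals("local")) {
            return 1; // plain "local" means a single worker thread
        }
        Matcher m = LOCAL_N.matcher(master);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported master string: " + master);
        }
        String n = m.group(1);
        // "*" follows the processor count; a number is taken literally.
        return n.equals("*") ? logicalProcessors : Integer.parseInt(n);
    }

    public static void main(String[] args) {
        // The number of processors available to the JVM (logical processors, not physical cores).
        int logical = Runtime.getRuntime().availableProcessors();
        System.out.println("Logical processors seen by the JVM: " + logical);
        System.out.println("local[*] -> " + threadsFor("local[*]", logical) + " threads");
        System.out.println("local[6] -> " + threadsFor("local[6]", logical) + " threads");
    }
}
```

On the host described in the original message (12 logical processors, 6 cores), the first line would be expected to report 12, so local[6] would use only half of what local[*] uses, which matches Robin's remark above.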