Understanding Linux /proc/cpuinfo

A hyperthreaded processor has the same number of function units as an older, non-hyperthreaded processor. It just has two execution contexts, so it may achieve better function unit utilization by letting more than one program execute concurrently. On the other hand, if you're running two programs which compete for the same function units, there is no advantage at all to having both running "concurrently." When one is running, the other is necessarily waiting on the same function units.

A dual core processor literally has two times as many function units as a single-core processor, and can really run two programs concurrently, with no competition for function units.

A dual core processor is typically built so that both cores share the same level 2 cache. A dual processor (separate physical cpus) system differs in that each cpu has its own level 2 cache. This may sound like an advantage, and in some situations it can be, but in many cases research and testing show that the shared cache can be faster when the cpus are working on the same or very similar tasks.

In general, Hyperthreading is considered older technology and is no longer supported in newer cpus. Hyperthreading can provide a marginal (10%) improvement for some server workloads like MySQL, but dual core technology has essentially replaced hyperthreading in newer systems.

A dual core cpu running at 3.0 GHz should be faster than a dual cpu (separate core) system running at 3.0 GHz due to the ability to share the cache at higher bus speeds.

The examples below detail how we determine what kind of cpu(s) are present.

The kernel data Linux exposes in /proc/cpuinfo will show each logical cpu with a unique processor number. A logical cpu can be a hyperthreading sibling, a shared core in a dual or quad core, or a separate physical cpu. We must look at the siblings, cpu cores and core id to tell the difference.
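As a rough illustration of the "one entry per logical cpu" layout, here is a minimal Python sketch that splits cpuinfo text into one dictionary per processor. The function name parse_cpuinfo and the SAMPLE text are made up for this example; the field names are the ones discussed in this article. It starts a new record at every "processor" line instead of relying on blank lines, so it should also cope with output that has been filtered down to a few fields.

```python
def parse_cpuinfo(text):
    """Split cpuinfo text into one dict per logical cpu.

    A new record starts at every 'processor' line, so the raw file and
    filtered output without blank lines are handled the same way.
    """
    cpus = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without a "key : value" pair
        key, value = key.strip(), value.strip()
        if key == "processor":
            cpus.append({})
        if cpus:
            cpus[-1][key] = value
    return cpus


# Hypothetical sample: two logical cpus on one physical package.
SAMPLE = """\
processor\t: 0
physical id\t: 0
siblings\t: 2
core id\t\t: 0
cpu cores\t: 1
processor\t: 1
physical id\t: 0
siblings\t: 2
core id\t\t: 0
cpu cores\t: 1
"""

if __name__ == "__main__":
    for cpu in parse_cpuinfo(SAMPLE):
        print(cpu["processor"], cpu["physical id"], cpu["core id"])
```

On a Linux machine the same function could be fed the contents of /proc/cpuinfo instead of SAMPLE. All values come back as strings, so convert them with int() before comparing counts.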

If the number of cores = the number of siblings for a given physical processor, then hyperthreading is OFF.
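The rule above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function name hyperthreading_by_package is invented here, the records are hand-copied from the kind of output shown in this article, and a cpu that reports no topology fields at all is treated as one core with one sibling.

```python
def hyperthreading_by_package(cpus):
    """Map each physical id to True when siblings exceed cpu cores."""
    status = {}
    for cpu in cpus:
        # Older single-core cpus may not report these fields at all;
        # defaulting to 1 is an assumption made for this sketch.
        package = cpu.get("physical id", "0")
        siblings = int(cpu.get("siblings", 1))
        cores = int(cpu.get("cpu cores", 1))
        status[package] = siblings > cores
    return status


# One package, 1 core, 2 siblings: hyperthreading is on.
pentium4 = [
    {"processor": "0", "physical id": "0", "siblings": "2", "cpu cores": "1"},
    {"processor": "1", "physical id": "0", "siblings": "2", "cpu cores": "1"},
]

# One package, 4 cores, 4 siblings: hyperthreading is off.
quad_core = [
    {"processor": str(n), "physical id": "0", "siblings": "4", "cpu cores": "4"}
    for n in range(4)
]

if __name__ == "__main__":
    print(hyperthreading_by_package(pentium4))
    print(hyperthreading_by_package(quad_core))
```

The result is keyed by physical id because the comparison only makes sense within a single physical processor.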

grep -E 'processor|model name|cache size|core|sibling|physical' /proc/cpuinfo

Example 1. Single processor, 1 core, no Hyperthreading

processor	: 0
model name	: AMD Duron(tm) processor
cache size	: 64 KB

Example 2. Single processor, 1 core, Hyperthreading enabled

Notice how we have 2 siblings, but only 1 core. The physical cpu id is the same for both: 0.

processor	: 0
model name	: Intel(R) Pentium(R) 4 CPU 2.80GHz
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
processor	: 1
model name	: Intel(R) Pentium(R) 4 CPU 2.80GHz
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1

Example 3. Single socket Quad Core

Notice how each processor has its own core id. The number of siblings matches the number of cores, so there are no Hyperthreading siblings. Also notice the large L2 cache that is reported: 6 MB. That makes sense when you consider that the cache is shared between cores.

processor	: 0
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
processor	: 1
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 4
processor	: 2
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 2
cpu cores	: 4
processor	: 3
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
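One way to check the "each processor has its own core id" observation is to group the processor numbers by the pair (physical id, core id). The Python sketch below is illustrative only; the function name and the tuple layout are invented for this example, and the numbers are copied by hand from Example 2 and Example 3.

```python
from collections import defaultdict


def logical_cpus_per_core(cpus):
    """Group processor numbers by (physical id, core id)."""
    cores = defaultdict(list)
    for processor, physical_id, core_id in cpus:
        cores[(physical_id, core_id)].append(processor)
    return dict(cores)


# Tuples are (processor, physical id, core id).
quad_core = [(0, 0, 0), (1, 0, 1), (2, 0, 2), (3, 0, 3)]   # Example 3
pentium4_ht = [(0, 0, 0), (1, 0, 0)]                       # Example 2

if __name__ == "__main__":
    print(logical_cpus_per_core(quad_core))
    print(logical_cpus_per_core(pentium4_ht))
```

In the quad core case every group holds exactly one processor. In the Pentium 4 case both processors land in the same group, which is what hyperthreading siblings look like.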

Example 3a. Single socket Dual Core

Again, each processor has its own core id, so this is a dual core system.

processor	: 0
model name	: Intel(R) Pentium(R) D CPU 3.00GHz
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
processor	: 1
model name	: Intel(R) Pentium(R) D CPU 3.00GHz
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2

Example 4. Dual Single core CPU, Hyperthreading ENABLED

This example shows that processors 0 and 2 share one physical cpu (physical id 0) and processors 1 and 3 share the other (physical id 3). The number of siblings is twice the number of cores, which is another clue that this is a system with hyperthreading enabled.

processor	: 0
model name	: Intel(R) Xeon(TM) CPU 3.60GHz
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
processor	: 1
model name	: Intel(R) Xeon(TM) CPU 3.60GHz
cache size	: 1024 KB
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 1
processor	: 2
model name	: Intel(R) Xeon(TM) CPU 3.60GHz
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
processor	: 3
model name	: Intel(R) Xeon(TM) CPU 3.60GHz
cache size	: 1024 KB
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 1
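The grouping in this example can be reproduced with a short Python sketch. The function name describe_packages and the record layout are assumptions made for illustration; the numbers are copied from the output above. Dividing siblings by cpu cores gives the number of logical cpus per core.

```python
def describe_packages(cpus):
    """List the processors in each physical package and its threads per core."""
    packages = {}
    for cpu in cpus:
        info = packages.setdefault(cpu["physical id"], {
            "processors": [],
            "threads per core": cpu["siblings"] // cpu["cpu cores"],
        })
        info["processors"].append(cpu["processor"])
    return packages


# Values from Example 4, already converted to integers.
dual_xeon_ht = [
    {"processor": 0, "physical id": 0, "siblings": 2, "cpu cores": 1},
    {"processor": 1, "physical id": 3, "siblings": 2, "cpu cores": 1},
    {"processor": 2, "physical id": 0, "siblings": 2, "cpu cores": 1},
    {"processor": 3, "physical id": 3, "siblings": 2, "cpu cores": 1},
]

if __name__ == "__main__":
    for physical_id, info in describe_packages(dual_xeon_ht).items():
        print(physical_id, info)
```

Note that the physical ids are 0 and 3, not 0 and 1, so the sketch uses them only as dictionary keys and never as a count.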

Example 5. Dual CPU Dual Core No hyperthreading

Of the examples here, this should be the most capable system processor-wise. There are a total of 4 cores: 2 cores in each of 2 separate socketed physical cpus. Each core shares the 4 MB cache with its sibling core. The higher clock rate (3.0 GHz vs 2.33 GHz) should offer slightly better performance than example 3.

processor	: 0
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
processor	: 1
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
processor	: 2
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cache size	: 4096 KB
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 2
processor	: 3
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cache size	: 4096 KB
physical id	: 3
siblings	: 2
core id		: 1
cpu cores	: 2
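To compare systems like this one with Example 3, it can help to reduce the output to three numbers: sockets, cores and logical cpus. The Python sketch below does that by counting distinct ids rather than assuming they run 0, 1, 2 and so on. The function name summarize and the tuple layout are invented for this illustration.

```python
def summarize(cpus):
    """Count sockets, cores and logical cpus from (processor, physical id, core id)."""
    sockets = {physical_id for _, physical_id, _ in cpus}
    cores = {(physical_id, core_id) for _, physical_id, core_id in cpus}
    return {
        "sockets": len(sockets),
        "cores": len(cores),
        "logical cpus": len(cpus),
    }


# Tuples are (processor, physical id, core id).
example_5 = [(0, 0, 0), (1, 0, 1), (2, 3, 0), (3, 3, 1)]  # dual cpu, dual core
example_3 = [(0, 0, 0), (1, 0, 1), (2, 0, 2), (3, 0, 3)]  # single quad core

if __name__ == "__main__":
    print(summarize(example_5))
    print(summarize(example_3))
```

Both systems come out at 4 cores and 4 logical cpus; the difference is 2 sockets versus 1. When logical cpus exceed cores, the extra ones are hyperthreading siblings.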

Comments

Duplicate entries

Thanks for the article.

Could you shed light on why I would have duplicate entries?
Both are Processor 0

$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU L5420 @ 2.50GHz
stepping : 10
cpu MHz : 2493.774
cache size : 6144 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu de tsc msr pae cx8 apic sep cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good pni ssse3 cx16 sse4_1 lahf_lm
bogomips : 4989.74
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU L5420 @ 2.50GHz
stepping : 10
cpu MHz : 2493.774
cache size : 6144 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu de tsc msr pae cx8 apic sep cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good pni ssse3 cx16 sse4_1 lahf_lm
bogomips : 4989.74
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

Core IDs are not linear in count

RichWeb, excellent information. I do have an odd one for you, and I'd appreciate your interpretation.

I see two CPUs (Physical IDs: 0 and 3) each with two cores (0 and 6, 1 and 7) and each core has two processors (Siblings 2 on each core), no hyperthreading (since Sibling count = core count on each CPU).

Is this how you read it?

What I find odd is how the system assigns Core IDs: 0, 6, 1, 7, and not 0, 1, 2, 3. This is a Linux server running as a VM on a DL380 G5 with 2.6.9-55.ELsmp GNU/Linux.

Thanks for your insights,

Chuck

$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Intel(R) Xeon(R) CPU L5240 @ 3.00GHz
stepping : 6
cpu MHz : 3000.011
cache size : 6144 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm pni monitor ds_cpl est tm2 cx16 xtpr
bogomips : 6004.01
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Intel(R) Xeon(R) CPU L5240 @ 3.00GHz
stepping : 6
cpu MHz : 3000.011
cache size : 6144 KB
physical id : 3
siblings : 2
core id : 6
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm pni monitor ds_cpl est tm2 cx16 xtpr
bogomips : 6000.05
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Intel(R) Xeon(R) CPU L5240 @ 3.00GHz
stepping : 6
cpu MHz : 3000.011
cache size : 6144 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm pni monitor ds_cpl est tm2 cx16 xtpr
bogomips : 5999.97
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Intel(R) Xeon(R) CPU L5240 @ 3.00GHz
stepping : 6
cpu MHz : 3000.011
cache size : 6144 KB
physical id : 3
siblings : 2
core id : 7
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm pni monitor ds_cpl est tm2 cx16 xtpr
bogomips : 6000.02
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

Up to a point

"there is no advantage at all to having both running "concurrently." When one is running, the other is necessarily waiting on the same function units."

That is only literally true if both contexts are repeatedly executing the same instruction.

As long as there is a difference between the instructions being executed in each context, there will never be a continuous ping pong between units, and hence there is some benefit, especially when there is a mix of e.g. integer and floating point instructions.

Also, how many caches do the HT units share - if they are running different instances of the same computation then the cache hit ratio could be good.

MikeW

Hyperthreading is alive and well

Your explanation of hyperthreading suggests that multiple cores are a superior replacement for hyperthreading. Not so--they're not exclusive. At this time all the latest desktop and server processors, except AMD's, have both multiple cores and multiple threads per core.

Can you enhance this discussion for cloud computing + other arch?

I am looking at counting chips (physical sockets), cores (per chip), threads (smt per core) on multiple systems using x86, x64, ia64 and PowerPC.

My Itanium questions:
Does the Itanium entry for "cpu number" correspond to "physical id"?
Why is siblings non inclusive here and inclusive on x86?

processor : 0
vendor : GenuineIntel
arch : IA-64
family : Itanium 2
model : 1
revision : 5
archrev : 0
features : branchlong
cpu number : 0
cpu regs : 4
cpu MHz : 1400.000000
itc MHz : 1400.000000
BogoMIPS : 2097.15
siblings : 1

processor : 1
vendor : GenuineIntel
arch : IA-64
family : Itanium 2
model : 1
revision : 5
archrev : 0
features : branchlong
cpu number : 0
cpu regs : 4
cpu MHz : 1400.000000
itc MHz : 1400.000000
BogoMIPS : 2092.95
siblings : 1

PowerPC in a LPAR.
processor : 0
cpu : POWER5 (gs)
clock : 1498MHz
revision : 3.1

processor : 1
cpu : POWER5 (gs)
clock : 1498MHz
revision : 3.1

timebase : 187545000
machine : CHRP IBM,9116-561

processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 3
model name : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping : 4
cpu MHz : 2793.182
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 3
wp : yes
flags : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc pni monitor ds_cpl cid xtpr
bogomips : 6986.90

x64 VMWare
processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 3
model name : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping : 4
cpu MHz : 2793.182
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 3
wp : yes
flags : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc up pni monitor ds_cpl cid xtpr
bogomips : 6986.90

I am confused by all the possibilities.