On current CPU architectures, visibility is mainly achieved by inserting memory barriers around reads and writes of variables declared "volatile".
invalidQueue, storeBuffer, and visibility
At the hardware level, multi-core CPUs keep their caches coherent mainly through the MESI protocol (specific details can be found in relevant content). So that a CPU does not stall while coherence messages are in flight, two data structures are added around MESI: the storeBuffer and the invalidQueue.
Only shared variables, that is, variables held in the caches of more than one CPU (say CPU0 and CPU1), can have visibility issues. Assume CPU0 and CPU1 both hold variable x in their caches at a certain moment. When CPU0 modifies x, it must notify CPU1 to invalidate the corresponding cache line. Instead of waiting for that to complete, CPU0 places the new value in its storeBuffer and sends the invalidation message. CPU1 places the message in its invalidQueue and acknowledges it immediately, before it has actually invalidated the line. After receiving the acknowledgement, CPU0 moves the new value from the storeBuffer into its cache.
This is the part of MESI that matters here, and it exposes a problem. While the new value sits in CPU0's storeBuffer, no other CPU can see it. And until CPU1 processes the message waiting in its invalidQueue, it keeps reading the stale value from its own cache; only after the line has been invalidated does a read on CPU1 miss and fetch the current value, from main memory or, on many implementations, directly from CPU0's cache.
The barriers behind "volatile" close both gaps. On a write to x, the barrier forces CPU0 to drain its storeBuffer so that the new value is actually written out (in the usual simplified description, all the way to main memory). On a read of x, the barrier forces CPU1 to process every pending message in its invalidQueue first. Together these ensure visibility between the cores.
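The effect of "volatile" can be observed at the language level with a minimal Java sketch. It assumes Java 9 or later (for `Thread.onSpinWait`), and the class, method, and field names (`VolatileVisibilityDemo`, `runOnce`, `data`, `ready`) are hypothetical, chosen only for illustration. The sketch shows the guarantee the Java memory model gives for a volatile flag; how a particular JVM and CPU map it onto the storeBuffer and invalidQueue is implementation-specific and is not visible from the code.

```java
class VolatileVisibilityDemo {
    // Hypothetical shared state, named for illustration only.
    private static int data = 0;                    // plain (non-volatile) field
    private static volatile boolean ready = false;  // volatile flag guarding data

    // Runs one writer/reader handoff and returns the value of data
    // that the reader thread observed after it saw ready == true.
    static int runOnce() {
        data = 0;
        ready = false;
        final int[] observed = new int[1];

        Thread reader = new Thread(() -> {
            // Volatile read: each iteration must observe the latest write to
            // ready, so the loop cannot spin forever on a stale cached value.
            while (!ready) {
                Thread.onSpinWait();
            }
            // The write to data happens-before the volatile write to ready,
            // so the new value of data is guaranteed to be visible here.
            observed[0] = data;
        });
        reader.start();

        data = 42;     // plain write, published by the volatile write below
        ready = true;  // volatile write

        try {
            reader.join();  // join makes observed[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for reader", e);
        }
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println("reader observed data = " + runOnce());
    }
}
```

If "volatile" is removed from `ready`, the Java memory model no longer guarantees that the reader ever sees the update, and the loop may spin indefinitely on some JVMs. This is the language-level counterpart of CPU1 continuing to read a stale cache line.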