Designing On-chip Memory Systems for Throughput Architectures
Jeffrey Robert Diamond
Like GPUs, these modern chips share finite on-chip resources among threads. This sharing gives rise to novel performance and optimization issues at every granularity of parallelism, from cell phones to GPUs.