High-End AI Server Demand from North America’s Top Four CSPs Expected to Exceed 60% in 2024, Says TrendForce
TrendForce’s newest projections spotlight a 2024 landscape in which demand for high-end AI servers (those powered by NVIDIA, AMD, or other top-tier ASIC chips) will be heavily influenced by North America’s cloud service powerhouses. Microsoft (20.2%), Google (16.6%), AWS (16%), and Meta (10.8%) are projected to collectively command over 60% of global demand (a combined 63.6%), with NVIDIA GPU-based servers leading the charge.
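As a quick check on that claim, the minimal Python sketch below sums the four projected shares quoted above; the percentages are TrendForce’s projections, and the arithmetic simply confirms they total 63.6%, clearing the 60% mark.

```python
# Sum the four North American CSPs' projected 2024 shares of high-end
# AI server demand (percentages as quoted by TrendForce).
projected_shares = {
    "Microsoft": 20.2,
    "Google": 16.6,
    "AWS": 16.0,
    "Meta": 10.8,
}

combined = sum(projected_shares.values())
print(f"Combined share: {combined:.1f}%")  # Combined share: 63.6%
print(f"Exceeds 60%: {combined > 60}")     # Exceeds 60%: True
```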
NVIDIA faces ongoing hurdles in development as it contends with US restrictions
Despite NVIDIA’s stronghold in the data center sector, where its GPU servers capture up to 70% of the AI market, challenges continue to loom. Three major challenges are set to limit the company’s future growth. First, US restrictions on advanced technology exports have spurred China toward self-reliance in AI chips, with Huawei emerging as a noteworthy competitor. NVIDIA’s China-specific solutions, such as the H20 series, may not match the cost-effectiveness of its flagship models, potentially eroding its market dominance.
Second, the trend toward proprietary ASIC development among US cloud giants, including Google, AWS, Microsoft, and Meta, is expanding year over year, driven by scale and cost considerations. Finally, AMD is applying competitive pressure with a cost-focused strategy, offering products at just 60–70% of the prices of comparable NVIDIA models, which lets it court flagship customers more aggressively. Microsoft is expected to be the most enthusiastic adopter of AMD’s high-end MI300 GPU solutions in 2024.
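To give a rough sense of what that 60–70% price ratio means at cluster scale, here is an illustrative sketch; the NVIDIA unit price and the order size are hypothetical placeholders chosen only to make the arithmetic concrete, not figures from the article.

```python
# Illustrative only: what a 60-70% price ratio implies at cluster scale.
# The unit price and order size below are hypothetical placeholders.
nvidia_unit_price = 30_000        # hypothetical USD per comparable NVIDIA GPU
amd_price_ratios = (0.60, 0.70)   # AMD priced at 60-70% of NVIDIA (per article)
order_size = 10_000               # hypothetical GPU count for a large deployment

for ratio in amd_price_ratios:
    amd_unit_price = nvidia_unit_price * ratio
    savings = (nvidia_unit_price - amd_unit_price) * order_size
    print(f"At {ratio:.0%} of NVIDIA's price: "
          f"${amd_unit_price:,.0f}/GPU, ${savings:,.0f} saved on the order")
```

Even at the upper end of the range, the implied savings reach tens of millions of dollars for an order of this (hypothetical) size, which illustrates why large CSPs are motivated to qualify a second GPU supplier.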
NVIDIA aims to maintain leadership by diversifying its product portfolio
NVIDIA is actively shaping its future by evolving its product portfolio, shifting from the A100 series toward the more advanced and higher-priced H100 series starting in 2024. This shift will be complemented by the introduction of the H200 series, which boasts superior HBM3e specifications, with limited shipments beginning in the second quarter of this year.
To balance cost and performance, NVIDIA intends to price the H200 aggressively, aligning it closely with the H100’s initial price point to appeal to CSP clients. Additionally, NVIDIA is set to broaden its market reach by entering negotiations with major players such as Meta, Google, AWS, and OpenAI under the NRE (non-recurring engineering) model, targeting expansion into the telecommunications, automotive, and gaming industries.
NVIDIA plans to unveil its next-generation B100 products in late 2024, which are anticipated to surpass the H series in efficiency. These will feature 35–40% more HBM memory capacity than the H200, catering to HPC demands and accelerating LLM training. The L40S, aimed at enterprise customers, is designed for smaller-scale AI model training or inference at the edge. Simultaneously, the L4 is set to replace the current T4, targeting cloud and edge AI inference applications and broadening options in the mid-range and entry-level segments.
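The 35–40% figure can be turned into a rough capacity estimate. The sketch below assumes the H200’s announced 141 GB of HBM3e as the baseline; that baseline comes from NVIDIA’s H200 launch rather than from this article, so the resulting range is an inference, not a confirmed B100 specification.

```python
# Rough estimate of B100 HBM capacity from the article's 35-40% uplift claim.
# Baseline of 141 GB HBM3e is the H200's announced capacity (not from this
# article); the resulting range is an inference, not a confirmed spec.
h200_hbm_gb = 141
uplift_range = (0.35, 0.40)   # 35-40% more capacity than the H200 (per article)

low = h200_hbm_gb * (1 + uplift_range[0])
high = h200_hbm_gb * (1 + uplift_range[1])
print(f"Implied B100 HBM capacity: ~{low:.0f}-{high:.0f} GB")
# Implied B100 HBM capacity: ~190-197 GB
```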
NVIDIA is tackling the 2023 GPU supply shortage head-on by playing a key role in expanding CoWoS and HBM production capacity. This is expected to halve the current average delivery time of 40 weeks (to roughly 20 weeks) by the second quarter as new capacity comes online, easing the supply chain bottlenecks that have constrained AI server availability due to GPU shortages.