Reports come weeks after US blocked Nvidia from selling two high-end AI chips and a top gaming chip to Chinese firms
Nvidia is planning to release three new chips for China, according to local media reports, weeks after the US blocked it from selling two high-end artificial intelligence (AI) chips and one of its top gaming chips to Chinese firms.
Nvidia could announce the chips – the HGX H20, L20 PCIe and L2 PCIe – as soon as 16 November, the Star Market Daily news outlet reported, citing people familiar with the matter.
OpenAI launched a slew of new APIs during its first-ever developer day. DALL-E 3, OpenAI’s text-to-image model, is now available via an API after first coming to ChatGPT and Bing Chat. Similar to the previous version of DALL-E, the API incorporates built-in moderation to help protect against misuse, OpenAI says.
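A minimal sketch of calling the DALL-E 3 image API through the OpenAI Python client; the prompt, size, and output handling here are illustrative assumptions, not part of the announcement.

```python
from openai import OpenAI

# Assumes OPENAI_API_KEY is set in the environment
client = OpenAI()

# Generate one image with the DALL-E 3 model
response = client.images.generate(
    model="dall-e-3",
    prompt="a watercolor painting of a lighthouse at dawn",  # example prompt
    size="1024x1024",
    n=1,
)

# The API returns a URL to the generated image
print(response.data[0].url)
```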
Enabling LLM-powered agents to cooperate
Fine-tuning GPT-3.5 Turbo on 140k Slack messages
Ross Lazerowitz spent $83.20 creating a fine-tuned GPT-3.5 Turbo model from 140,000 of his Slack messages (10,399,747 tokens), massaged into a JSONL file suitable for use with the OpenAI fine-tuning API.
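For context, a minimal sketch of the two OpenAI fine-tuning API steps involved (upload a JSONL file, then start a job); the filename is a placeholder, and this is not Lazerowitz's actual script.

```python
from openai import OpenAI

client = OpenAI()

# 1. Upload the JSONL file of chat-formatted training examples
#    ("slack_messages.jsonl" is a hypothetical filename)
training_file = client.files.create(
    file=open("slack_messages.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Launch a fine-tuning job on gpt-3.5-turbo using that file
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)
```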
In the midst of an AI chip shortage, Microsoft wants to give a privileged few startups free access to “supercomputing” resources from its Azure cloud for developing AI models. Microsoft today announced it’s updating its startup program, Microsoft for Startups Founders Hub, to include a no-cost Azure AI infrastructure option for “high-end,” Nvidia-based GPU virtual […]
Efficient Transformer Inference with Statically Structured Sparse Attention
Self-attention matrices of Transformers are often highly sparse because the relevant context of each token is typically limited to just a few other tokens in the sequence. Compared to a dense baseline, the authors report a 56.6% reduction in energy consumption and a 58.9% performance improvement, with <1% accuracy loss and 2.6% area overhead.
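To illustrate the general idea of restricting attention to a fixed sparsity pattern (not the paper's specific hardware method), here is a generic NumPy sketch of masked attention with a static local-window pattern; the function name and window size are assumptions for illustration.

```python
import numpy as np

def static_sparse_attention(Q, K, V, mask):
    """Attention restricted to a fixed (static) sparsity pattern.

    Q, K, V: (seq_len, d) arrays; mask: (seq_len, seq_len) boolean array,
    True where a query position is allowed to attend to a key position.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)  # disallowed positions get zero weight
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Example static pattern: each token attends only to itself and its immediate neighbors
seq_len, d = 8, 16
idx = np.arange(seq_len)
mask = np.abs(idx[:, None] - idx[None, :]) <= 1

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((seq_len, d)) for _ in range(3))
out = static_sparse_attention(Q, K, V, mask)
print(out.shape)  # (8, 16)
```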