DeepSeek: Standing on the Shoulders of Giants
During the Chinese New Year, a homegrown AI model named DeepSeek captured widespread attention and became the center of conversations around technological advancements. The holiday season, typically characterized by family gatherings and celebrations, unexpectedly turned into a hotbed for discussions surrounding this innovative development in artificial intelligence.
On January 20, DeepSeek officially launched its reasoning model, DeepSeek-R1, which created a significant stir within the tech community. Just a week later, on January 27, the accompanying application skyrocketed to the top of Apple’s App Store download charts in both China and the United States. By January 31, it reached a landmark moment when tech giants Nvidia, Amazon, and Microsoft announced their integration of DeepSeek-R1 on the same day.
The rise and success of DeepSeek reflect a breakthrough for AI technologies in China, transcending traditional boundaries. The fervor surrounding DeepSeek can be attributed to capabilities and reach that many believe have surpassed certain international benchmarks. The model's significance is underscored by its growing visibility and utility as it enters the daily routines of ordinary people and eases their work.
Demonstrating the prowess of domestically developed AI models, DeepSeek has not only made strides towards closing the gap with global leaders but may even begin to surpass them. What further sets DeepSeek apart is the remarkable reduction in training costs: the training cost for its R1 model was merely $5.576 million, a stark contrast to OpenAI's expense, which soared to $100 million.
Many have expressed admiration for the DeepSeek model.
In light of the overwhelming positive feedback, Liang Wenfeng, the founder of DeepSeek, humbly stated, "We are merely standing on the shoulders of giants within the open-source community, tightening a few screws to ensure the structure of our homegrown models is sound."
The "giants" Liang refers to encompass the vast open-source community, which allows users to modify and learn from available source codesThis community serves as a bedrock for collaborative innovation that fuels advancements across various sectors, particularly in artificial intelligence.
While promoting a vision of global unity, it's important to acknowledge that the open-source movement is not without its own economic interests. By tapping into open-source strategies, companies can attract developers and researchers from around the world, creating a vibrant ecosystem built on agility and collaboration. This produces a positive feedback loop of shared technology and innovative opportunities, giving those on the lower end of the competitive spectrum a fighting chance. Competition within the tech industry, by contrast, often plays out as closed, proprietary ecosystems pitted against open-source ones.
The open-source movement has played a crucial role in propelling the rapid evolution of AI technologies. Yann LeCun, Meta's Chief AI Scientist and Turing Award laureate, remarked that rather than viewing this as China surpassing the United States in AI, it is more accurate to say that open-source initiatives are overtaking proprietary models. DeepSeek has been able to build on open research and codebases such as Meta's PyTorch and LLaMA.
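To make concrete what building on an open codebase can look like, here is a minimal, purely illustrative PyTorch sketch of a transformer-style block assembled from off-the-shelf components. The layer sizes are arbitrary assumptions, and nothing here is drawn from DeepSeek's actual code.

```python
import torch
import torch.nn as nn

# Illustrative only: a tiny attention block built from PyTorch's open, reusable
# components. DeepSeek's real architecture is far larger and is not reproduced
# here; the dimensions below are arbitrary assumptions.
class TinyBlock(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.attn(x, x, x)   # self-attention over the sequence
        x = self.norm(x + attn_out)        # residual connection + normalization
        return x + self.ff(x)              # feed-forward residual

x = torch.randn(2, 16, 256)               # (batch, sequence length, embedding dim)
print(TinyBlock()(x).shape)                # torch.Size([2, 16, 256])
```

The point of the sketch is simply that an openly licensed framework lets anyone compose, inspect, and extend these building blocks, which is the collaborative dynamic the article describes.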
Indeed, Meta has been one of the foremost drivers of open-source AI models.
Its projects span a remarkable array of applications, from the powerful LLaMA model to impressive image segmentation tools like Segment Anything. LLaMA is recognized as one of the strongest open-source AI models available. In July 2023, with LLaMA 2, Meta moved the license from research-only to one permitting commercial use, leading to a proliferation of models built upon its framework. This has reshaped a large-model competitive landscape long dominated by OpenAI, with DeepSeek now at the forefront of that shift.
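As a hedged illustration of what "building upon" the openly released weights typically involves, the snippet below loads a LLaMA 2 checkpoint through the Hugging Face transformers library. The library choice, the model identifier, and the generation settings are assumptions made for illustration (access to the checkpoint requires accepting Meta's license on Hugging Face); this is not a description of any particular derivative model's pipeline.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumptions for illustration: the Hugging Face `transformers` library and the
# gated "meta-llama/Llama-2-7b-hf" checkpoint (requires accepting Meta's license).
# Derivative models are commonly built by fine-tuning weights loaded like this.
model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "Open-source models matter because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```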
Through extensive experimentation, DeepSeek has provided compelling evidence that open-source models can match the performance of closed-source counterparts. The ramifications of this realization are substantial, encouraging AI giants like Meta to reassess their open-source strategies and double down on their investments in open AI. Recently, Meta has rolled out multiple innovative AI projects, such as the SAM 2.1 image segmentation model, which greatly enhances accuracy and efficiency in identifying distinct elements within images. Its multimodal language model Spirit LM, which combines text and speech data, broadens AI's abilities in cross-domain understanding and interaction. And its self-learning evaluator allows models to assess their own outputs and improve autonomously.
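For a sense of what prompt-based segmentation looks like in code, below is a rough sketch using Meta's original open-source segment-anything package. Note that SAM 2.1 is distributed as a separate `sam2` package with a different interface, and the checkpoint file and point prompt here are placeholders; this is not canonical SAM 2.1 usage.

```python
import numpy as np
from PIL import Image
from segment_anything import SamPredictor, sam_model_registry

# Illustrative sketch based on Meta's original segment-anything release; the
# checkpoint file and the point prompt are placeholders. SAM 2.1 ships as a
# separate `sam2` package whose interface differs from this one.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

image = np.array(Image.open("photo.jpg").convert("RGB"))
predictor.set_image(image)

# Request masks around a single foreground point at pixel (x, y).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),
    point_labels=np.array([1]),   # 1 marks a foreground point
    multimask_output=True,        # return several candidate masks with scores
)
print(masks.shape, scores)        # e.g. (3, H, W) boolean masks and their confidences
```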
The recent push towards open-source protocols has remarkably accelerated the progression of AI technologies. Yet it is crucial to recognize that the essence of open source lies in efficient collaboration and accessibility. In our complex and rapidly evolving international landscape, efficiency is often overshadowed by other pressing concerns. The AI ecosystem is composed of both hardware and software. Currently, the hardware side is trending towards a more closed model, raising apprehensions about whether software might follow suit and retreat into proprietary norms.
Such a transition could stifle the dynamism and innovation that characterize AI development. This necessitates a proactive approach from China's AI sector: bolstering innovation through internal advancements while simultaneously embracing global collaboration. The open-source model adopted by DeepSeek is a prime example of how to contribute to the industry's progress and help pioneer a movement towards open innovation.