Computer Vision / Video Analytics – NVIDIA Technical Blog
News and tutorials for developers, data scientists, and IT admins
2025-03-03T23:31:15Z https://developer.nvidia.com/blog/feed/

Elias Wolfberg <![CDATA[AI Model Offers Conservationists New Tools to Protect Fisheries, Wildlife at Scale]]> https://developer.nvidia.com/blog/?p=96671 2025-03-03T23:31:15Z 2025-03-03T17:48:01Z In an effort to rein in illicit fishing, researchers have unveiled a new open-source AI model that can accurately identify what virtually all of the world’s...]]>

In an effort to rein in illicit fishing, researchers have unveiled a new open-source AI model that can accurately identify what virtually all of the world’s seafaring vessels are doing, including whether a boat is potentially fishing illegally. Seattle-based Ai2 (the Allen Institute for AI) recently released a lightweight model named Atlantes to analyze more than five billion GPS signals a…

Anu Srivastava <![CDATA[Latest Multimodal Addition to Microsoft Phi SLMs Trained on NVIDIA GPUs]]> https://developer.nvidia.com/blog/?p=96519 2025-02-28T17:13:38Z 2025-02-26T22:05:00Z Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size they are not practical...]]>

Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size, they are not practical for the resource constraints that many companies currently face. The rise of small language models (SLMs) bridges quality and cost by creating models with a smaller resource footprint. SLMs are a subset of language models that tend to…

Shubham Agrawal <![CDATA[Vision Language Model Prompt Engineering Guide for Image and Video Understanding]]> https://developer.nvidia.com/blog/?p=96229 2025-02-26T16:25:37Z 2025-02-26T16:25:34Z Vision language models (VLMs) are evolving at a breakneck speed. In 2020, the first VLMs revolutionized the generative AI landscape by bringing visual...]]>

Vision language models (VLMs) are evolving at a breakneck speed. In 2020, the first VLMs revolutionized the generative AI landscape by bringing visual understanding to large language models (LLMs) through the use of a vision encoder. These initial VLMs were limited in their abilities, only able to understand text and single image inputs. Fast-forward a few years and VLMs are now capable of…

Vishesh Lokras <![CDATA[NVIDIA Video Codec SDK 13.0 Powered by NVIDIA Blackwell]]> https://developer.nvidia.com/blog/?p=96377 2025-02-27T23:18:47Z 2025-02-24T22:55:30Z The release of NVIDIA Video Codec SDK 13.0 marks a significant upgrade, adding support for the latest-generation NVIDIA Blackwell GPUs. This version brings a...]]>

The release of NVIDIA Video Codec SDK 13.0 marks a significant upgrade, adding support for the latest-generation NVIDIA Blackwell GPUs. This version brings a wealth of improvements aimed at elevating both video encoding and decoding capabilities. From enhanced compression efficiency to better throughput and encoding quality, SDK 13.0 addresses the ever-evolving demands of the video ecosystem.

Ravi Chaudhary <![CDATA[Enabling Stereoscopic and 3D Views Using MV-HEVC in NVIDIA Video Codec SDK 13.0]]> https://developer.nvidia.com/blog/?p=96366 2025-02-24T22:32:37Z 2025-02-24T22:32:34Z NVIDIA announces the implementation of Multi-View High Efficiency Video Coding (MV-HEVC) encoder in the latest NVIDIA Video Codec SDK release, version 13.0....]]>

NVIDIA announces the implementation of Multi-View High Efficiency Video Coding (MV-HEVC) encoder in the latest NVIDIA Video Codec SDK release, version 13.0. This significant update marks a major leap forward in hardware-accelerated, multi-view video compression. It offers enhanced compression efficiency and quality for stereoscopic and 3D video applications as compared to simulcast encoding.

Michelle Horton <![CDATA[AI for Climate, Energy, and Ecosystem Resilience at NVIDIA GTC 2025]]> https://developer.nvidia.com/blog/?p=95520 2025-02-20T15:50:14Z 2025-02-20T17:44:00Z From mitigating climate change to improving disaster response and environmental monitoring, AI is reshaping how we tackle critical global challenges....]]>

From mitigating climate change to improving disaster response and environmental monitoring, AI is reshaping how we tackle critical global challenges. Advancements in fast, high-resolution climate forecasting, real-time monitoring, and digital twins are equipping scientists, policy-makers, and industry leaders with data-driven tools to understand, plan for, and respond to a warming planet.

Joanne Chang <![CDATA[Featured Computer Vision and Video Analytics Sessions at NVIDIA GTC 2025]]> https://developer.nvidia.com/blog/?p=96193 2025-02-20T15:50:53Z 2025-02-20T17:00:00Z Explore visually perceptive AI agents, the latest vision AI technologies, hands-on training, and inspiring deployments.]]>

Explore visually perceptive AI agents, the latest vision AI technologies, hands-on training, and inspiring deployments.

Joanne Chang <![CDATA[Upcoming Webinar: Unlocking Video Analytics With AI Agents]]> https://developer.nvidia.com/blog/?p=96135 2025-02-20T15:52:55Z 2025-02-13T22:05:57Z Master prompt engineering, fine-tuning, and customization to build video analytics AI agents.]]>

Master prompt engineering, fine-tuning, and customization to build video analytics AI agents.

Pranav Marathe <![CDATA[Just Released: Tripy, a Python Programming Model For TensorRT]]> https://developer.nvidia.com/blog/?p=95947 2025-02-10T17:08:43Z 2025-02-10T17:08:40Z Experience high-performance inference, usability, intuitive APIs, easy debugging with eager mode, clear error messages, and more.]]>

Experience high-performance inference, usability, intuitive APIs, easy debugging with eager mode, clear error messages, and more.

Brad Nemire <![CDATA[Featured Researcher and Educator Sessions at NVIDIA GTC 2025]]> https://developer.nvidia.com/blog/?p=95817 2025-02-06T19:33:45Z 2025-02-05T23:03:06Z Explore the latest advancements in academia, including advanced research, innovative teaching methods, and the future of learning and technology.]]>

Explore the latest advancements in academia, including advanced research, innovative teaching methods, and the future of learning and technology.

Elias Wolfberg <![CDATA[New AI Model Offers Cellular-Level View of Cancerous Tumors]]> https://developer.nvidia.com/blog/?p=95758 2025-02-06T19:33:48Z 2025-02-04T22:33:00Z Researchers studying cancer unveiled a new AI model that provides cellular-level mapping and visualizations of cancer cells, which scientists hope can shed...]]>

Researchers studying cancer unveiled a new AI model that provides cellular-level mapping and visualizations of cancer cells, which scientists hope can shed light on how, and why, certain inter-cellular relationships trigger cancers to grow. BioTuring, a San Diego-based startup, announced an AI model that can quickly create detailed visualizations of cancerous tumors at single-cell resolution.

Michelle Horton <![CDATA[AI Foundation Model Enhances Cancer Diagnosis and Tailors Treatment]]> https://developer.nvidia.com/blog/?p=95722 2025-02-06T19:33:49Z 2025-02-04T17:16:54Z A new study and AI model from researchers at Stanford University is streamlining cancer diagnostics, treatment planning, and prognosis prediction. Named MUSK...]]>

A new study and AI model from researchers at Stanford University are streamlining cancer diagnostics, treatment planning, and prognosis prediction. Named MUSK (Multimodal transformer with Unified maSKed modeling), the research aims to advance precision oncology, tailoring treatment plans to each patient based on their unique medical data. “Multimodal foundation models are a new frontier in…

Michelle Horton <![CDATA[Advancing Rare Disease Detection with AI-Powered Cellular Profiling]]> https://developer.nvidia.com/blog/?p=95498 2025-02-06T19:33:59Z 2025-01-29T20:45:46Z Rare diseases are difficult to diagnose due to limitations in traditional genomic sequencing. Wolfgang Pernice, assistant professor at Columbia University, is...]]>

Rare diseases are difficult to diagnose due to limitations in traditional genomic sequencing. Wolfgang Pernice, assistant professor at Columbia University, is using AI-powered cellular profiling to bridge these gaps and advance personalized medicine. At NVIDIA GTC 2024, Pernice shared insights from his lab’s work with diseases like Charcot-Marie-Tooth (CMT) and mitochondrial disorders.

Michelle Horton <![CDATA[Spinal Health Diagnostics Gets Deep Learning Automation]]> https://developer.nvidia.com/blog/?p=95243 2025-02-06T19:37:43Z 2025-01-22T17:09:42Z An advanced deep-learning model that automates X-ray analysis for faster and more accurate assessments could transform spinal health diagnostics. Capable of...]]>

An advanced deep-learning model that automates X-ray analysis for faster and more accurate assessments could transform spinal health diagnostics. Capable of handling even complex cases, the research promises to help doctors save time, reduce diagnostic errors, and improve treatment plans for patients with spinal conditions like scoliosis and kyphosis. “Although spinopelvic alignment analysis…

Elias Wolfberg <![CDATA[AI Uncovers Potentially Hazardous, Forgotten Oil and Gas Wells]]> https://developer.nvidia.com/blog/?p=95106 2025-01-23T19:54:20Z 2025-01-16T19:09:15Z With as many as 800,000 forgotten oil and gas wells scattered across the US, researchers from Lawrence Berkeley National Laboratory (LBNL), have developed an AI...]]>

With as many as 800,000 forgotten oil and gas wells scattered across the US, researchers from Lawrence Berkeley National Laboratory (LBNL) have developed an AI model capable of accurately locating, at scale, wells that may be leaking toxic chemicals and greenhouse gases, like methane, into the environment. The model is designed to identify many of the roughly 3.7M oil and gas wells dug in…

Samuel Ochoa <![CDATA[Build a Video Search and Summarization Agent with NVIDIA AI Blueprint]]> https://developer.nvidia.com/blog/?p=86011 2025-02-13T20:44:57Z 2025-01-07T04:20:00Z This post was originally published July 29, 2024 but has been extensively revised with NVIDIA AI Blueprint information. Traditional video analytics applications...]]>

This post was originally published July 29, 2024 but has been extensively revised with NVIDIA AI Blueprint information. Traditional video analytics applications and their development workflow are typically built on fixed-function, limited models that are designed to detect and identify only a select set of predefined objects. With generative AI, NVIDIA NIM microservices…

Elias Wolfberg <![CDATA[AI Vision Helps Green Recycling Plants]]> https://developer.nvidia.com/blog/?p=94421 2025-01-07T20:18:07Z 2024-12-19T20:20:23Z Each year, the world recycles only around 13% of its two billion-plus tons of municipal waste. By 2050, the world's annual municipal waste will reach 3.88B...]]>

Each year, the world recycles only around 13% of its two billion-plus tons of municipal waste. By 2050, the world’s annual municipal waste will reach 3.88B tons. But the global recycling industry is far from efficient. Annually, as much as $120B of potentially recoverable plastic—let alone paper or metals—ends up in landfills rather than within new products made with recycled materials.

Michelle Horton <![CDATA[Time-Lapse AI Model Enhances IVF Embryo Selection]]> https://developer.nvidia.com/blog/?p=93767 2024-12-18T16:38:55Z 2024-12-12T17:29:22Z Researchers from Weill Cornell Medicine have developed an AI-powered model that could help couples undergoing in vitro fertilization (IVF) and guide...]]>

Researchers from Weill Cornell Medicine have developed an AI-powered model that could help couples undergoing in vitro fertilization (IVF) and guide embryologists in selecting healthy embryos for implantation. Recently published in Nature Communications, the study presents the Blastocyst Evaluation Learning Algorithm (BELA). This state-of-the-art deep learning model evaluates embryo quality and…

Joanne Chang <![CDATA[Just Released: NVIDIA VILA VLM]]> https://developer.nvidia.com/blog/?p=93512 2024-12-12T19:35:17Z 2024-12-09T17:09:10Z Now available in preview, NVIDIA VILA is an advanced multimodal VLM that provides visual understanding of multiple images and video.]]>

Now available in preview, NVIDIA VILA is an advanced multimodal VLM that provides visual understanding of multiple images and video.

Michael Zephyr <![CDATA[Celebrating Open Science and Enterprise AI Innovation on MONAI’s 5th Anniversary]]> https://developer.nvidia.com/blog/?p=92886 2024-12-20T18:35:40Z 2024-12-05T22:13:17Z As MONAI celebrates its fifth anniversary, we're witnessing the convergence of our vision for open medical AI with production-ready enterprise solutions. ...]]>

As MONAI celebrates its fifth anniversary, we’re witnessing the convergence of our vision for open medical AI with production-ready enterprise solutions. This announcement brings two exciting developments: the release of MONAI Core v1.4, expanding open-source capabilities, and the general availability of VISTA-3D and MAISI as NVIDIA NIM microservices. This dual release reflects our…

Monika Jhuria <![CDATA[Scaling Action Recognition Models with Synthetic Data]]> https://developer.nvidia.com/blog/?p=91593 2024-12-12T19:35:22Z 2024-12-03T18:36:55Z Action recognition models such as PoseClassificationNet have been around for some time, helping systems identify and classify human actions like walking,...]]>

Action recognition models such as PoseClassificationNet have been around for some time, helping systems identify and classify human actions like walking, waving, or picking up objects. While the concept is well-established, the challenge lies in building a robust computer vision model that can accurately recognize the range of actions across different scenarios that are domain- or use case…

Shubham Agrawal <![CDATA[Build an Agentic Video Workflow with Video Search and Summarization]]> https://developer.nvidia.com/blog/?p=92834 2025-01-07T05:45:50Z 2024-12-03T18:30:00Z Building a question-answering chatbot with large language models (LLMs) is now a common workflow for text-based interactions. What about creating an AI system...]]>

Building a question-answering chatbot with large language models (LLMs) is now a common workflow for text-based interactions. What about creating an AI system that can answer questions about video and image content? This presents a far more complex task. Traditional video analytics tools struggle due to their limited functionality and a narrow focus on predefined objects.

Joanne Chang <![CDATA[Just Released: NVIDIA DeepStream 7.1]]> https://developer.nvidia.com/blog/?p=92695 2024-12-12T19:46:55Z 2024-11-25T16:40:22Z The new release introduces Python support in Service Maker to accelerate real-time multimedia and AI inference applications with a powerful GStreamer...]]>

The new release introduces Python support in Service Maker to accelerate real-time multimedia and AI inference applications with a powerful GStreamer abstraction layer.

Shashank Maheshwari <![CDATA[NVIDIA JetPack 6.1 Boosts Performance and Security through Camera Stack Optimizations and Introduction of Firmware TPM]]> https://developer.nvidia.com/blog/?p=91283 2024-12-12T19:47:55Z 2024-11-21T22:01:16Z NVIDIA JetPack has continuously evolved to offer cutting-edge software tailored to the growing needs of edge AI and robotic developers. With each release,...]]>

NVIDIA JetPack has continuously evolved to offer cutting-edge software tailored to the growing needs of edge AI and robotic developers. With each release, JetPack has enhanced its performance, introduced new features, and optimized existing tools to deliver increased value to its users. This means that your existing Jetson Orin-based products experience performance optimizations by upgrading to…

Michelle Horton <![CDATA[AI Unlocks Early Clues to Alzheimer’s Through Retinal Scans]]> https://developer.nvidia.com/blog/?p=92565 2024-12-12T19:38:44Z 2024-11-21T16:40:39Z Your eyes could hold the key to unlocking early detection of Alzheimer’s and dementia, with a groundbreaking AI study. Called Eye-AD, the deep learning...]]>

Your eyes could hold the key to unlocking early detection of Alzheimer’s and dementia, with a groundbreaking AI study. Called Eye-AD, the deep learning framework analyzes high-resolution retinal images, identifying small changes in vascular layers linked to dementia that are often too subtle for human detection. The approach offers a rapid, non-invasive screening for cognitive decline…

Michelle Horton <![CDATA[Deep Learning AI Model Identifies Breast Cancer Spread without Surgery]]> https://developer.nvidia.com/blog/?p=91133 2024-12-20T18:48:46Z 2024-10-31T16:06:07Z A new deep learning model could reduce the need for surgery when diagnosing whether cancer cells are spreading, including to nearby lymph nodes—also known as...]]>

A new deep learning model could reduce the need for surgery when diagnosing whether cancer cells are spreading, including to nearby lymph nodes—also known as metastasis. Developed by researchers from the University of Texas Southwestern Medical Center, the AI tool analyzes time-series MRIs and clinical data to identify metastasis, providing crucial, noninvasive support for doctors in treatment…

Elias Wolfberg <![CDATA[AI-Powered Devices Track Howls to Save Wolves]]> https://developer.nvidia.com/blog/?p=91077 2024-10-31T16:21:07Z 2024-10-29T17:56:55Z A new cell-phone-sized device—which can be deployed in vast, remote areas—is using AI to identify and geolocate wildlife to help conservationists track...]]>

A new cell-phone-sized device—which can be deployed in vast, remote areas—is using AI to identify and geolocate wildlife to help conservationists track endangered species, including wolves around Yellowstone National Park. The battery-powered devices—dubbed GrizCams—are designed by a small Montana startup, Grizzly Systems. Together with biologists, they’re deploying a constellation of the…

Hanson Xu <![CDATA[Federated Learning in Autonomous Vehicles Using Cross-Border Training]]> https://developer.nvidia.com/blog/?p=90443 2025-02-05T20:08:58Z 2024-10-24T16:00:00Z Federated learning is revolutionizing the development of autonomous vehicles (AVs), particularly in cross-country scenarios where diverse data sources and...]]>

Federated learning is revolutionizing the development of autonomous vehicles (AVs), particularly in cross-country scenarios where diverse data sources and conditions are crucial. Unlike traditional machine learning methods that require centralized data storage, federated learning enables AVs to collaboratively train algorithms using locally collected data while keeping the data decentralized.

Bret Li <![CDATA[Optimizing the CV Pipeline in Automotive Vehicle Development Using the PVA Engine]]> https://developer.nvidia.com/blog/?p=90646 2024-10-31T16:21:21Z 2024-10-23T13:00:00Z In the field of automotive vehicle software development, more large-scale AI models are being integrated into autonomous vehicles. The models range from vision...]]>

In the field of automotive vehicle software development, more large-scale AI models are being integrated into autonomous vehicles. The models range from vision AI models to end-to-end AI models for autonomous driving. Now the demand for computing power is sharply increasing, leading to higher system loads that can have a negative impact on system stability and latency.

Paul Logan <![CDATA[Accelerating Reality Capture Workflows with AI and NVIDIA RTX GPUs]]> https://developer.nvidia.com/blog/?p=89719 2024-10-17T18:19:11Z 2024-10-07T23:03:48Z Reality capture creates highly accurate, detailed, and immersive digital representations of environments. Innovations in site scanning and accelerated data...]]>

Reality capture creates highly accurate, detailed, and immersive digital representations of environments. Innovations in site scanning and accelerated data processing, along with emerging technologies like neural radiance fields (NeRFs) and Gaussian splatting, are significantly enhancing the capabilities of reality capture. These technologies are revolutionizing interactions with and analyses of the…

William Raveane <![CDATA[Optimizing Microsoft Bing Visual Search with NVIDIA Accelerated Libraries]]> https://developer.nvidia.com/blog/?p=89831 2024-11-14T16:23:01Z 2024-10-07T21:11:06Z Microsoft Bing Visual Search enables people around the world to find content using photographs as queries. The heart of this capability is Microsoft's TuringMM...]]>

Microsoft Bing Visual Search enables people around the world to find content using photographs as queries. The heart of this capability is Microsoft’s TuringMM visual embedding model that maps images and text into a shared high-dimensional space. Operating on billions of images across the web, performance is critical. This post details efforts to optimize the TuringMM pipeline using NVIDIA…

Tanya Lenz <![CDATA[Generate Image and Text Embeddings with NV-CLIP]]> https://developer.nvidia.com/blog/?p=89773 2024-10-17T18:19:13Z 2024-10-07T20:00:00Z NV-CLIP, a cutting-edge multimodal embeddings model for image and text, is now generally available.]]>

NV-CLIP, a cutting-edge multimodal embeddings model for image and text, is now generally available.

Alexander Ladikos <![CDATA[Real-Time Surgical Guidance by Fusing Multi-Modal Imaging with NVIDIA Holoscan]]> https://developer.nvidia.com/blog/?p=89703 2024-10-17T19:06:57Z 2024-10-07T12:00:00Z Developers in the fields of image-guided surgery and surgical vision face unique challenges in creating systems and applications that can significantly improve...]]>

Developers in the fields of image-guided surgery and surgical vision face unique challenges in creating systems and applications that can significantly improve surgical workflows. One such challenge is efficiently combining multi-modal imaging data, such as preoperative 3D patient images with intra-operative video. This is key to providing surgeons with real-time…

Elias Wolfberg <![CDATA[AI Chatbot Delivers Multilingual Support to African Farmers]]> https://developer.nvidia.com/blog/?p=89513 2024-10-17T19:07:10Z 2024-09-27T18:10:11Z Some of Africa’s most resource-constrained farmers are gaining access to on-demand, AI-powered advice through a multimodal chatbot that gives detailed...]]>

Some of Africa’s most resource-constrained farmers are gaining access to on-demand, AI-powered advice through a multimodal chatbot that gives detailed recommendations about how to increase yields or fight common pests and crop diseases. Since February, farmers in the East African nation of Malawi have had access to the chatbot, named UlangiziAI, through WhatsApp on mobile phones.

Michelle Horton <![CDATA[How AI and Robotics are Driving Agricultural Productivity and Sustainability]]> https://developer.nvidia.com/blog/?p=89454 2024-10-17T19:07:15Z 2024-09-25T15:53:36Z By 2030, John Deere aims for fully autonomous farming, addressing global challenges like labor shortages, sustainability, and food security. Their AI and...]]>

By 2030, John Deere aims for fully autonomous farming, addressing global challenges like labor shortages, sustainability, and food security. Their AI and robotics solutions make farming more efficient and profitable, reduce environmental impact, lower carbon footprints, and promote biodiversity. In this session, Chris Padwick, director of Machine Learning and Computer Vision at John Deere…

Michał Szołucha <![CDATA[Improved Data Loading with Threads]]> https://developer.nvidia.com/blog/?p=88657 2024-09-19T19:30:59Z 2024-09-13T16:00:00Z Data loading is a critical aspect of deep learning workflows, whether you're focused on training or inference. However, it often presents a paradox: the need...]]>

Data loading is a critical aspect of deep learning workflows, whether you’re focused on training or inference. However, it often presents a paradox: the need for a highly convenient solution that is simultaneously customizable. These two goals are notoriously difficult to reconcile. One of the traditional solutions to this problem is to scale out the processing and parallelize the user…

Ricardo Monteiro <![CDATA[Enabling Customizable GPU-Accelerated Video Transcoding Pipelines]]> https://developer.nvidia.com/blog/?p=88870 2024-09-19T19:31:10Z 2024-09-11T23:01:24Z Today, over 80% of internet traffic is video. This content is generated by and consumed across various devices, including IoT gadgets, smartphones, computers,...]]>

Today, over 80% of internet traffic is video. This content is generated by and consumed across various devices, including IoT gadgets, smartphones, computers, and TVs. As pixel density and the number of connected devices grow, continued investment in fast, efficient, high-quality video encoding and decoding is essential. The latest NVIDIA data center GPUs, such as the NVIDIA L40S and NVIDIA…

Elias Wolfberg <![CDATA[AI Tool Helps Farmers Combat Crop Loss and Climate Change]]> https://developer.nvidia.com/blog/?p=88957 2025-01-07T20:27:37Z 2024-09-11T16:28:27Z Machine Learning algorithms are beginning to revolutionize modern agriculture. Enabling farmers to combat pests and diseases in real time, the technology is...]]>

Machine learning algorithms are beginning to revolutionize modern agriculture. Enabling farmers to combat pests and diseases in real time, the technology is improving crop production and profits, while reducing waste, greenhouse gas emissions, and pesticide use. Around 6% of the world’s CO2 emissions come from farming. And every year, up to 40% of crops are lost due to pests and disease.

Michelle Horton <![CDATA[High-Tech AI Framework Transforms Global Marine Pollution Tracking]]> https://developer.nvidia.com/blog/?p=88586 2024-10-21T16:26:32Z 2024-09-09T15:08:15Z An AI-powered remote sensing study offers a dynamic new tool for global ocean cleanup efforts. Detailed in the ISPRS Journal of Photogrammetry and Remote...]]>

An AI-powered remote sensing study offers a dynamic new tool for global ocean cleanup efforts. Detailed in the ISPRS Journal of Photogrammetry and Remote Sensing, the breakthrough unveils MariNeXt, a deep-learning framework that detects and identifies marine pollution using high-resolution Sentinel-2 imagery. MariNeXt could revolutionize how resource managers and agencies globally monitor and…

Michelle Horton <![CDATA[AI-Powered Platform Advances Personalized Cancer Diagnostics and Treatments]]> https://developer.nvidia.com/blog/?p=88574 2024-10-21T16:26:32Z 2024-09-05T17:27:27Z A recent study introduced a cutting-edge AI-powered pathology platform that can help doctors diagnose and evaluate lung cancer in patients quickly and...]]>

A recent study introduced a cutting-edge AI-powered pathology platform that can help doctors diagnose and evaluate lung cancer in patients quickly and accurately. Developed by a team of researchers at the University of Cologne’s Faculty of Medicine and University Hospital Cologne, the tool provides fully automated and in-depth analysis of benign and cancerous tissues, for faster and more…

Dvir Samuel <![CDATA[Fast Inversion for Real-Time Image Editing with Text]]> https://developer.nvidia.com/blog/?p=85619 2024-09-05T17:57:10Z 2024-08-30T16:00:04Z Text-to-image diffusion models can generate diverse, high-fidelity images based on user-provided text prompts. They operate by mapping a random sample from a...]]>

Text-to-image diffusion models can generate diverse, high-fidelity images based on user-provided text prompts. They operate by mapping a random sample from a high-dimensional space, conditioned on a user-provided text prompt, through a series of denoising steps. This results in a representation of the corresponding image. These models can also be used for more complex tasks such as image…

Monika Jhuria <![CDATA[New Foundational Models and Training Capabilities with NVIDIA TAO 5.5]]> https://developer.nvidia.com/blog/?p=87263 2024-09-09T19:37:08Z 2024-08-28T16:00:00Z NVIDIA TAO is a framework designed to simplify and accelerate the development and deployment of AI models. It enables you to use pretrained models, fine-tune...]]>

NVIDIA TAO is a framework designed to simplify and accelerate the development and deployment of AI models. It enables you to use pretrained models, fine-tune them with your own data, and optimize the models for specific use cases without needing deep AI expertise. TAO integrates seamlessly with the NVIDIA hardware and software ecosystem, providing tools for efficient AI model training…

Shuo Wang <![CDATA[Simplifying Camera Calibration to Enhance AI-Powered Multi-Camera Tracking]]> https://developer.nvidia.com/blog/?p=87901 2024-09-05T17:57:21Z 2024-08-27T18:30:00Z This post is the third in a series on building multi-camera tracking vision AI applications. We introduce the overall end-to-end workflow and fine-tuning...]]>

This post is the third in a series on building multi-camera tracking vision AI applications. The first and second parts introduced the overall end-to-end workflow and the fine-tuning process used to enhance system accuracy. NVIDIA Metropolis is an application framework and set of developer tools that leverages AI for visual data analysis across industries. Its multi-camera tracking reference…

Joanne Chang <![CDATA[Webinar: Build Visual AI Agents With Generative AI and NVIDIA NIM]]> https://developer.nvidia.com/blog/?p=87551 2024-08-22T18:24:51Z 2024-08-19T15:00:00Z Learn how to build high-performance solutions with NVIDIA visual AI agents that help streamline operations across a range of industries.]]>

Learn how to build high-performance solutions with NVIDIA visual AI agents that help streamline operations across a range of industries.

Michelle Horton <![CDATA[Interactive AI Tool Delivers Immersive Video Content to Blind and Low-Vision Viewers]]> https://developer.nvidia.com/blog/?p=86936 2025-02-04T19:44:34Z 2024-08-12T15:54:26Z New research aims to revolutionize video accessibility for blind or low-vision (BLV) viewers with an AI-powered system that gives users the ability to explore...]]>

New research aims to revolutionize video accessibility for blind or low-vision (BLV) viewers with an AI-powered system that gives users the ability to explore content interactively. The innovative system, detailed in a recent paper, addresses significant gaps in conventional audio descriptions (AD), offering an enriched and immersive video viewing experience. “Although videos have become an…

Michelle Horton <![CDATA[Real-Time AI Shark Detection is Boosting Beach Safety]]> https://developer.nvidia.com/blog/?p=86892 2024-09-05T18:59:24Z 2024-08-06T19:01:54Z California beaches are becoming safer with a new AI-powered shark detection system. Known as SharkEye, the technology identifies sharks near shorelines in real...]]>

California beaches are becoming safer with a new AI-powered shark detection system. Known as SharkEye, the technology identifies sharks near shorelines in real time and sends text alerts to public safety officials, lifeguards, and the community. This innovative AI-driven system, developed by the Benioff Ocean Science Laboratory (BOSL) at the University of California, Santa Barbara…

Source

]]>
Ahmed Harouni <![CDATA[Computed Tomography Organ and Disease Segmentation Using the NVIDIA VISTA-3D NIM Microservice]]> https://developer.nvidia.com/blog/?p=85863 2024-08-08T18:48:29Z 2024-07-26T18:41:23Z Over 300M computed tomography (CT) scans are performed globally, 85M in the US alone. Radiologists are looking for ways to speed up their workflow and generate...]]>

Over 300M computed tomography (CT) scans are performed globally, 85M in the US alone. Radiologists are looking for ways to speed up their workflow and generate accurate reports, so having a foundation model to segment all organs and diseases would be helpful. Ideally, you’d have an optimized way to run this model in production at scale. NVIDIA Research has created a new foundation model to…

Source

]]>
Samuel Ochoa <![CDATA[Develop Generative AI-Powered Visual AI Agents for the Edge]]> https://developer.nvidia.com/blog/?p=85444 2024-11-07T05:08:55Z 2024-07-17T15:00:00Z An exciting breakthrough in AI technology—Vision Language Models (VLMs)—offers a more dynamic and flexible method for video analysis. VLMs enable users to...]]>

An exciting breakthrough in AI technology—Vision Language Models (VLMs)—offers a more dynamic and flexible method for video analysis. VLMs enable users to interact with image and video input using natural language, making the technology more accessible and adaptable. These models can run on the NVIDIA Jetson Orin edge AI platform or discrete GPUs through NIMs. This blog post explores how to build…

Source

]]>
Sameer Satish Pusegaonkar <![CDATA[Enhance Multi-Camera Tracking Accuracy by Fine-Tuning AI Models with Synthetic Data]]> https://developer.nvidia.com/blog/?p=84692 2024-07-25T18:19:09Z 2024-07-10T16:00:00Z Large-scale, use-case-specific synthetic data has become increasingly important in real-world computer vision and AI workflows. That’s because digital twins...]]>

Large-scale, use-case-specific synthetic data has become increasingly important in real-world computer vision and AI workflows. That’s because digital twins are a powerful way to create physics-based virtual replicas of factories, retail spaces, and other assets, enabling precise simulations of real-world environments. NVIDIA Isaac Sim, built on NVIDIA Omniverse, is a fully extensible…

Source

]]>
Min-Hung Chen https://minhungchen.netlify.app/ <![CDATA[Introducing DoRA, a High-Performing Alternative to LoRA for Fine-Tuning]]> https://developer.nvidia.com/blog/?p=84454 2024-11-07T05:09:12Z 2024-06-28T15:00:00Z Full fine-tuning (FT) is commonly employed to tailor general pretrained models for specific downstream tasks. To reduce the training cost, parameter-efficient...]]>

Full fine-tuning (FT) is commonly employed to tailor general pretrained models for specific downstream tasks. To reduce the training cost, parameter-efficient fine-tuning (PEFT) methods have been introduced to fine-tune pretrained models with a minimal number of parameters. Among these, Low-Rank Adaptation (LoRA) and its variants have gained considerable popularity because they avoid additional…

Source

]]>
Abhijit Patait <![CDATA[Improving Video Quality with the NVIDIA Video Codec SDK 12.2 for HEVC]]> https://developer.nvidia.com/blog/?p=82930 2024-09-22T15:09:03Z 2024-06-26T19:30:00Z NVIDIA Video Codec SDK provides a comprehensive set of APIs for hardware-accelerated video encode and decode on Windows and Linux. The 12.2 release improves...]]>

NVIDIA Video Codec SDK provides a comprehensive set of APIs for hardware-accelerated video encode and decode on Windows and Linux. The 12.2 release improves video quality for high-efficiency video coding (HEVC). It offers a significant reduction in bit rates, particularly for natural video content. This post details the following new features: The lookahead level can help analyze…

Source

]]>
Nate Bradford <![CDATA[Transforming Microsoft XLS and PPT Files into a Factory Digital Twin with OpenUSD]]> https://developer.nvidia.com/blog/?p=84422 2024-07-10T15:28:34Z 2024-06-26T16:00:00Z SyncTwin GmbH, a company that builds software to optimize production, intralogistics, and assembly, is on a mission to unlock industrial digital twins for small...]]>

SyncTwin GmbH, a company that builds software to optimize production, intralogistics, and assembly, is on a mission to unlock industrial digital twins for small and medium-sized businesses (SMBs). While SyncTwin has helped major global companies like BMW minimize costs and downtime in their factories with digital twins, they are now shifting their focus to enable manufacturing businesses…

Source

]]>
Elias Wolfberg <![CDATA[AI-Enhanced Navigation Charts Safer Waters for Massive Ships]]> https://developer.nvidia.com/blog/?p=84076 2025-02-04T19:49:56Z 2024-06-25T16:00:00Z Maritime startup Orca AI is pioneering safety at sea with its AI-powered navigation system, which provides real-time video processing to help crews make...]]>

Maritime startup Orca AI is pioneering safety at sea with its AI-powered navigation system, which provides real-time video processing to help crews make data-driven decisions in congested waters and low-visibility conditions. Every year, thousands of massive 100-million-pound vessels, ferrying $14T worth of goods, cross the world’s oceans and waterways, fighting to keep to tight deadlines.

Source

]]>
Pengfei Guo <![CDATA[Addressing Medical Imaging Limitations with Synthetic Data Generation]]> https://developer.nvidia.com/blog/?p=83468 2025-02-04T19:51:06Z 2024-06-24T17:50:59Z Synthetic data in medical imaging offers numerous benefits, including the ability to augment datasets with diverse and realistic images where real data is...]]>

Synthetic data in medical imaging offers numerous benefits, including the ability to augment datasets with diverse and realistic images where real data is limited. This reduces the costs and labor associated with annotating real images. Synthetic data also provides an ethical alternative to using sensitive patient data, which helps with education and training without compromising patient privacy.

Source

]]>
Monika Jhuria <![CDATA[Real-Time Vision AI From Digital Twins to Cloud-Native Deployment with NVIDIA Metropolis Microservices and NVIDIA Isaac Sim]]> https://developer.nvidia.com/blog/?p=83470 2024-07-30T22:15:36Z 2024-06-24T17:00:00Z As vision AI complexity increases, streamlined deployment solutions are crucial to optimizing spaces and processes. NVIDIA accelerates development, turning...]]>

As vision AI complexity increases, streamlined deployment solutions are crucial to optimizing spaces and processes. NVIDIA accelerates development, turning ideas into reality in weeks rather than months with NVIDIA Metropolis AI workflows and microservices. In this post, we explore Metropolis microservices features: Managing and automating infrastructure with AI is…

Source

]]>
Alvin Clark <![CDATA[Generate Traffic Insights Using YOLOv8 and NVIDIA JetPack 6.0]]> https://developer.nvidia.com/blog/?p=84266 2024-06-27T18:17:55Z 2024-06-18T19:53:22Z Intelligent Transportation Systems (ITS) applications are becoming increasingly valuable and prevalent in modern urban environments. The benefits of using ITS...]]>

Intelligent Transportation Systems (ITS) applications are becoming increasingly valuable and prevalent in modern urban environments. The benefits of using ITS applications include: Importantly, these systems need to process information at the edge for reliable bandwidth, privacy, real-time analytics, and more. This post explains how to use the new Jetson Platform Services from…

Source

]]>
Akhil Docca <![CDATA[Supercharge Robotics Workflows with AI and Simulation Using NVIDIA Isaac Sim 4.0 and NVIDIA Isaac Lab]]> https://developer.nvidia.com/blog/?p=84120 2024-06-27T18:32:20Z 2024-06-17T13:00:00Z The era of AI robots powered by physical AI has arrived. Physical AI models understand their environments and autonomously complete complex tasks in the...]]>

The era of AI robots powered by physical AI has arrived. Physical AI models understand their environments and autonomously complete complex tasks in the physical world. Many of the complex tasks—like dexterous manipulation and humanoid locomotion across rough terrain—are too difficult to program and rely on generative physical AI models trained using reinforcement learning (RL) in simulation.

Source

]]>
Joanne Chang <![CDATA[MediaTek Integrates NVIDIA TAO Toolkit for IoT Edge AI Development]]> https://developer.nvidia.com/blog/?p=83754 2024-06-13T19:08:55Z 2024-06-06T18:16:31Z MediaTek is teaming with NVIDIA to integrate NVIDIA TAO training and pretrained models into its development workflow, bringing advanced AI and visual perception...]]>

MediaTek is teaming with NVIDIA to integrate NVIDIA TAO training and pretrained models into its development workflow, bringing advanced AI and visual perception to billions of IoT edge devices.

Source

]]>
Meiran Peng <![CDATA[Build a Zero-Copy AI Sensor Processing Pipeline with OpenCV in NVIDIA Holoscan SDK]]> https://developer.nvidia.com/blog/?p=83116 2024-06-13T19:06:01Z 2024-06-05T14:00:00Z NVIDIA Holoscan is the NVIDIA domain-agnostic multimodal real-time AI sensor processing platform that delivers the foundation for developers to build their...]]>

NVIDIA Holoscan is the NVIDIA domain-agnostic multimodal real-time AI sensor processing platform that delivers the foundation for developers to build their end-to-end sensor processing pipeline. NVIDIA Holoscan SDK features include: Holoscan SDK can be used to build streaming AI pipelines for a range of industries and use cases, including medical devices, high-performance computing at…

Source

]]>
Chintan Shah <![CDATA[Power Cloud-Native Microservices at the Edge with NVIDIA JetPack 6.0, Now GA]]> https://developer.nvidia.com/blog/?p=83182 2024-11-07T05:09:27Z 2024-06-04T20:24:32Z NVIDIA JetPack SDK powers NVIDIA Jetson modules, offering a comprehensive solution for building end-to-end accelerated AI applications. JetPack 6 expands the...]]>

NVIDIA JetPack SDK powers NVIDIA Jetson modules, offering a comprehensive solution for building end-to-end accelerated AI applications. JetPack 6 expands the Jetson platform’s flexibility and scalability with microservices and a host of new features. It’s the most downloaded version of JetPack in 2024. With the JetPack 6.0 production release now generally available…

Source

]]>
Monika Jhuria <![CDATA[Optimize Processes for Large Spaces with the Multi-Camera Tracking Workflow]]> https://developer.nvidia.com/blog/?p=83108 2024-07-17T17:00:28Z 2024-06-02T12:30:00Z This post is the first in a series on building multi-camera tracking vision AI applications. In this part, we introduce the overall end-to-end workflow,...]]>

This post is the first in a series on building multi-camera tracking vision AI applications. In this part, we introduce the overall end-to-end workflow, focusing on building and deploying the multi-camera tracking system. The second part will cover fine-tuning AI models with synthetic data to enhance system accuracy. Large areas like warehouses, factories, stadiums, and airports are typically…

Source

]]>
Jenny Plunkett <![CDATA[How to Train an Object Detection Model for Visual Inspection with Synthetic Data]]> https://developer.nvidia.com/blog/?p=70820 2024-06-17T16:44:02Z 2024-05-31T22:30:00Z AI is rapidly changing industrial visual inspection. In a factory setting, visual inspection is used for many issues, including detecting defects and missing or...]]>

AI is rapidly changing industrial visual inspection. In a factory setting, visual inspection is used for many issues, including detecting defects and missing or incorrect parts during assembly. Computer vision can help identify problems with products early on, reducing the chances of them being delivered to customers. However, developing accurate and versatile object detection models remains…

Source

]]>
Amr Elmeleegy <![CDATA[Enhancing the Apparel Shopping Experience with AI, Emoji-Aware OCR, and Snapchat’s Screenshop]]> https://developer.nvidia.com/blog/?p=82250 2024-05-30T19:55:51Z 2024-05-17T17:33:20Z Ever spotted someone in a photo wearing a cool shirt or some unique apparel and wondered where they got it? How much did it cost? Maybe you've even thought...]]>

Ever spotted someone in a photo wearing a cool shirt or some unique apparel and wondered where they got it? How much did it cost? Maybe you’ve even thought about buying one for yourself. This challenge inspired Snap’s ML engineering team to introduce Screenshop, a service within Snapchat’s app that uses AI to locate and recommend fashion items online that match the style seen in an image.

Source

]]>
Carlos Garcia-Sierra <![CDATA[NVIDIA DeepStream 7.0 Milestone Release for Next-Gen Vision AI Development]]> https://developer.nvidia.com/blog/?p=82050 2024-09-04T22:00:17Z 2024-05-14T22:50:34Z NVIDIA DeepStream is a powerful SDK that unlocks GPU-accelerated building blocks to build end-to-end vision AI pipelines. With more than 40 plugins available...]]>

NVIDIA DeepStream is a powerful SDK that unlocks GPU-accelerated building blocks to build end-to-end vision AI pipelines. With more than 40 plugins available off-the-shelf, you can deploy fully optimized pipelines with cutting-edge AI inference, object tracking, and seamless integration with popular IoT message brokers such as Redis, Kafka, and MQTT. DeepStream offers intuitive REST APIs to…

Source

]]>
Paul Shin <![CDATA[Mitigating Occlusions in Visual Perception Using Single-View 3D Tracking in NVIDIA DeepStream]]> https://developer.nvidia.com/blog/?p=81786 2024-05-15T17:15:51Z 2024-05-08T16:00:00Z When it comes to perception for Intelligent Video Analytics (IVA) applications such as traffic monitoring, warehouse safety, and retail shopper analytics, one...]]>

When it comes to perception for Intelligent Video Analytics (IVA) applications such as traffic monitoring, warehouse safety, and retail shopper analytics, one of the biggest challenges is occlusions. People may move behind structural obstacles, retail shoppers may not be fully visible due to shelving units, and cars may be hidden behind large trucks, for example. This post explains how the…

Source

]]>
Yao (Jason) Lu <![CDATA[Visual Language Intelligence and Edge AI 2.0 with NVIDIA Cosmos Nemotron]]> https://developer.nvidia.com/blog/?p=81534 2025-01-09T03:29:25Z 2024-05-03T15:00:00Z Note: As of January 6, 2025, VILA is now part of the Cosmos Nemotron VLM family. NVIDIA is proud to announce the release of NVIDIA Cosmos Nemotron, a family of...]]>

Note: As of January 6, 2025, VILA is now part of the Cosmos Nemotron VLM family. NVIDIA is proud to announce the release of NVIDIA Cosmos Nemotron, a family of state-of-the-art vision language models (VLMs) designed to query and summarize images and videos from physical or virtual environments. Cosmos Nemotron builds upon NVIDIA’s groundbreaking visual understanding research including VILA…

Source

]]>
Yao (Jason) Lu <![CDATA[Visual Language Models on NVIDIA Hardware with VILA]]> https://developer.nvidia.com/blog/?p=81571 2025-01-07T04:01:29Z 2024-05-03T15:00:00Z Note: As of January 6, 2025, VILA is now part of the new Cosmos Nemotron vision language models. Visual language models have evolved significantly recently....]]>

Note: As of January 6, 2025, VILA is now part of the new Cosmos Nemotron vision language models. Visual language models have evolved significantly in recent years. However, existing models typically support only a single image. They cannot reason across multiple images, support in-context learning, or understand videos. They are also not optimized for inference speed. We developed VILA…

Source

]]>
Tian Cao <![CDATA[Perception Model Training for Autonomous Vehicles with Tensor Parallelism]]> https://developer.nvidia.com/blog/?p=81464 2024-05-02T19:01:07Z 2024-04-27T05:00:00Z Due to the adoption of multicamera inputs and deep convolutional backbone networks, the GPU memory footprint for training autonomous driving perception models...]]>

Due to the adoption of multicamera inputs and deep convolutional backbone networks, the GPU memory footprint for training autonomous driving perception models is large. Existing methods for reducing memory usage often result in additional computational overheads or imbalanced workloads. This post describes joint research between NVIDIA and NIO, a developer of smart electric vehicles.

Source

]]>
Vishwesh Nath <![CDATA[Advancing Cell Segmentation and Morphology Analysis with NVIDIA AI Foundation Model VISTA-2D]]> https://developer.nvidia.com/blog/?p=81250 2024-05-07T16:54:01Z 2024-04-22T18:30:00Z Genomics researchers use different sequencing techniques to better understand biological systems, including single-cell and spatial omics. Unlike single-cell,...]]>

Genomics researchers use different sequencing techniques to better understand biological systems, including single-cell and spatial omics. Unlike single-cell omics, which looks at data at the cellular level, spatial omics considers where that data is located and takes the spatial context into account for analysis. As genomics researchers look to model biological systems across multiple omics at…

Source

]]>
Mahesh Khadatare <![CDATA[Advancing Medical Image Decoding with GPU-Accelerated nvImageCodec]]> https://developer.nvidia.com/blog/?p=81155 2024-04-18T20:14:59Z 2024-04-17T20:30:00Z This post delves into the capabilities of decoding DICOM medical images within AWS HealthImaging using the nvJPEG2000 library. We'll guide you through the...]]>

This post delves into the capabilities of decoding DICOM medical images within AWS HealthImaging using the nvJPEG2000 library. We’ll guide you through the intricacies of image decoding, introduce you to AWS HealthImaging, and explore the advancements enabled by GPU-accelerated decoding solutions. Embarking on a journey to enhance throughput and reduce costs in deciphering medical images…

Source

]]>
Tiffany Yeung <![CDATA[Explainer: What Is a Convolutional Neural Network?]]> https://developer.nvidia.com/blog/?p=75991 2024-06-05T22:20:53Z 2024-04-12T19:00:00Z A convolutional neural network is a type of deep learning network used primarily to identify and classify images and to recognize objects within images.]]>

A convolutional neural network is a type of deep learning network used primarily to identify and classify images and to recognize objects within images.

Source

]]>
Michelle Horton <![CDATA[Explainer: What Is Computer Vision?]]> https://developer.nvidia.com/blog/?p=75988 2024-06-05T22:19:45Z 2024-03-22T19:00:00Z Computer vision defines the field that enables devices to acquire, process, understand, and analyze digital images and videos and extract useful...]]>

Computer vision defines the field that enables devices to acquire, process, understand, and analyze digital images and videos and extract useful information.

Source

]]>
Mostafa Toloui <![CDATA[Developing Production-Ready AI Sensor Processing Applications with NVIDIA Holoscan 1.0]]> https://developer.nvidia.com/blog/?p=79788 2024-04-09T23:45:13Z 2024-03-20T17:00:00Z Edge AI developers are building AI applications and products for safety-critical and regulated use cases. With NVIDIA Holoscan 1.0, these applications can...]]>

Edge AI developers are building AI applications and products for safety-critical and regulated use cases. With NVIDIA Holoscan 1.0, these applications can incorporate real-time insights and processing in milliseconds. With the recent release of NVIDIA Holoscan 1.0, developers can more easily build production-ready applications for multimodal, real-time sensor processing.

Source

]]>
Michael Zephyr <![CDATA[Breaking Barriers in Healthcare with New Models for Generative AI and Cellular Imaging]]> https://developer.nvidia.com/blog/?p=79523 2024-04-09T23:45:17Z 2024-03-19T15:00:00Z Driving the future of healthcare imaging, NVIDIA MONAI microservices are creating unique state-of-the-art models and expanded modalities to meet the demands of...]]>

Driving the future of healthcare imaging, NVIDIA MONAI microservices are creating unique state-of-the-art models and expanded modalities to meet the demands of the healthcare and biopharma industry. The latest update introduces a suite of new features designed to further enhance the capabilities and efficiency of medical imaging workflows. This post explores the following new features…

Source

]]>
Cem Moluluo <![CDATA[Calculating Video Quality Using NVIDIA GPUs and VMAF-CUDA]]> https://developer.nvidia.com/blog/?p=77541 2024-04-09T23:45:26Z 2024-03-12T16:57:38Z Video quality metrics are used to evaluate the fidelity of video content. They provide a consistent quantitative measurement to assess the performance of the...]]>

Video quality metrics are used to evaluate the fidelity of video content. They provide a consistent quantitative measurement to assess the performance of the encoder. VMAF combines human vision modeling with machine learning techniques that are continuously evolving, enabling it to adapt to new content. VMAF excels in aligning with human visual perception by combining detailed analysis…

Source

]]>
Paul Springer <![CDATA[cuTENSOR 2.0: Applications and Performance]]> https://developer.nvidia.com/blog/?p=77915 2024-04-09T23:45:28Z 2024-03-09T03:20:47Z While part 1 focused on the usage of the new NVIDIA cuTENSOR 2.0 CUDA math library, this post introduces a variety of usage modes beyond that, specifically...]]>

While part 1 focused on the usage of the new NVIDIA cuTENSOR 2.0 CUDA math library, this post introduces a variety of usage modes beyond that, specifically usage from Python and Julia. We also demonstrate the performance of cuTENSOR based on benchmarks in a number of application domains. This post explores applications and performance benchmarks for cuTENSOR 2.0. For more information…

Source

]]>
Paul Springer <![CDATA[cuTENSOR 2.0: A Comprehensive Guide for Accelerating Tensor Computations]]> https://developer.nvidia.com/blog/?p=77913 2024-04-09T23:45:29Z 2024-03-09T03:20:45Z NVIDIA cuTENSOR is a CUDA math library that provides optimized implementations of tensor operations where tensors are dense, multi-dimensional arrays or array...]]>

NVIDIA cuTENSOR is a CUDA math library that provides optimized implementations of tensor operations where tensors are dense, multi-dimensional arrays or array slices. The release of cuTENSOR 2.0 represents a major update—in both functionality and performance—over its predecessor. This version reimagines its APIs to be more expressive, including advanced just-in-time compilation capabilities all…

Source

]]>
Amr Elmeleegy <![CDATA[Generate Stunning Images with Stable Diffusion XL on the NVIDIA AI Inference Platform]]> https://developer.nvidia.com/blog/?p=78388 2024-05-07T16:51:08Z 2024-03-07T19:05:46Z Diffusion models are transforming creative workflows across industries. These models generate stunning images based on simple text or image inputs by...]]>

Diffusion models are transforming creative workflows across industries. These models generate stunning images based on simple text or image inputs by iteratively shaping random noise into AI-generated art through denoising diffusion techniques. This can be applied to many enterprise use cases such as creating personalized content for marketing, generating imaginative backgrounds for objects in…

Source

]]>
Tanya Lenz <![CDATA[Featured Smart Spaces Sessions at NVIDIA GTC 2024]]> https://developer.nvidia.com/blog/?p=78162 2024-04-09T23:45:34Z 2024-03-07T00:19:10Z From cities and airports to Olympic Stadiums, AI is transforming public spaces into safer, smarter, and more sustainable environments.]]>

From cities and airports to Olympic Stadiums, AI is transforming public spaces into safer, smarter, and more sustainable environments.

Source

]]>
Jeffrey Renfro <![CDATA[Spotlight: Honeywell Accelerates Industrial Process Simulation with NVIDIA cuDSS]]> https://developer.nvidia.com/blog/?p=78496 2024-04-09T23:45:36Z 2024-03-05T19:00:00Z For over a decade, traditional industrial process modeling and simulation approaches have struggled to fully leverage multicore CPUs or acceleration devices to...]]>

For over a decade, traditional industrial process modeling and simulation approaches have struggled to fully leverage multicore CPUs or acceleration devices to run simulation and optimization calculations in parallel. Multicore linear solvers used in process modeling and simulation have not achieved expected improvements, and in certain cases have underperformed optimized single-core solvers.

Source

]]>
Nate Bradford <![CDATA[Top Synthetic Data Generation Sessions at NVIDIA GTC 2024]]> https://developer.nvidia.com/blog/?p=78671 2024-03-07T19:18:48Z 2024-02-29T23:31:18Z Learn how synthetic data is supercharging 3D simulation and computer vision workflows, from visual inspection to autonomous machines.]]>

Learn how synthetic data is supercharging 3D simulation and computer vision workflows, from visual inspection to autonomous machines.

Source

]]>
Umair Iqbal <![CDATA[Detecting Real-Time Waste Contamination Using Edge Computing and Video Analytics]]> https://developer.nvidia.com/blog/?p=76482 2024-03-07T19:33:06Z 2024-02-26T21:00:00Z The past few decades have witnessed a surge in rates of waste generation, closely linked to economic development and urbanization. This escalation in waste...]]>

The past few decades have witnessed a surge in rates of waste generation, closely linked to economic development and urbanization. This escalation in waste production poses substantial challenges for governments worldwide in terms of efficient processing and management. Despite the implementation of waste classification systems in developed countries, a significant portion of waste still ends up…

Source

]]>
Michelle Horton <![CDATA[Top Computer Vision/Video Analytics Sessions at NVIDIA GTC 2024]]> https://developer.nvidia.com/blog/?p=78104 2024-02-22T19:58:48Z 2024-02-21T22:00:00Z Discover the transformative power of computer vision and video analytics at GTC. Dive into cutting-edge techniques such as vision transformers, AI agents,...]]>

Discover the transformative power of computer vision and video analytics at GTC. Dive into cutting-edge techniques such as vision transformers, AI agents, multi-modal foundation models, 3D technology, large language models (LLMs), vision language models (VLMs), generative AI, and more.

Source

]]>
Michelle Horton <![CDATA[Webinar: Accelerate Edge AI Development With NVIDIA Metropolis Microservices For Jetson]]> https://developer.nvidia.com/blog/?p=78100 2024-02-22T19:58:51Z 2024-02-21T17:30:00Z On March 5, 8am PT, learn how NVIDIA Metropolis microservices for Jetson Orin helps you modernize your app stack, streamline development and deployment, and...]]>

On March 5, 8am PT, learn how NVIDIA Metropolis microservices for Jetson Orin helps you modernize your app stack, streamline development and deployment, and future-proof your apps with the ability to bring the latest generative AI capabilities to any customer through simple API calls.

Source

]]>
Gal Chechik <![CDATA[Generative AI Research Spotlight: Personalizing Text-to-Image Models]]> https://developer.nvidia.com/blog/?p=77308 2024-02-22T19:59:02Z 2024-02-06T23:41:01Z Visual generative AI is the process of creating images from text prompts. The technology is based on vision-language foundation models that are pretrained on...]]>

Visual generative AI is the process of creating images from text prompts. The technology is based on vision-language foundation models that are pretrained on web-scale data. These foundation models are used in many applications by providing a multimodal representation. Examples include image captioning and video retrieval, creative 3D and 2D image synthesis, and robotic manipulation.

Source

]]>
John Yang <![CDATA[Emulating the Attention Mechanism in Transformer Models with a Fully Convolutional Network]]> https://developer.nvidia.com/blog/?p=75844 2024-02-08T18:51:54Z 2024-01-29T17:00:00Z The past decade has seen a remarkable surge in the adoption of deep learning techniques for computer vision (CV) tasks. Convolutional neural networks (CNNs)...]]>

The past decade has seen a remarkable surge in the adoption of deep learning techniques for computer vision (CV) tasks. Convolutional neural networks (CNNs) have been the cornerstone of this revolution, exhibiting exceptional performance and enabling significant advancements in visual perception. By employing localized filters and hierarchical architectures, CNNs have proven adept at…

Source

]]>
Chintan Shah <![CDATA[Announcing NVIDIA Metropolis Microservices for Jetson for Rapid Edge AI Development]]> https://developer.nvidia.com/blog/?p=76670 2024-06-17T16:38:04Z 2024-01-25T18:30:00Z NVIDIA Metropolis Microservices for Jetson has been renamed to Jetson Platform Services, and is now part of NVIDIA JetPack SDK 6.0. Building vision AI...]]>

NVIDIA Metropolis Microservices for Jetson has been renamed to Jetson Platform Services, and is now part of NVIDIA JetPack SDK 6.0. Building vision AI applications for the edge often comes with notoriously long and costly development cycles. At the same time, quickly developing edge AI applications that are cloud-native, flexible, and secure has never been more important. Now…

Source

]]>
Riccardo Mariani <![CDATA[Using the Power of AI to Make Factories Safer]]> https://developer.nvidia.com/blog/?p=77101 2024-02-08T19:52:28Z 2024-01-24T17:00:00Z As industrial automation increases, safety becomes a greater challenge and top priority for enterprises. Safety encompasses multiple aspects: System...]]>

As industrial automation increases, safety becomes a greater challenge and top priority for enterprises. Safety encompasses multiple aspects: The same technological solution that’s driving automation can be used to also address safety: artificial intelligence. AI-powered stationary outside-in safety platforms, which monitor activity across many distributed machines or robots…

Source

]]>
Samuel Ochoa <![CDATA[Bringing Generative AI to the Edge with NVIDIA Metropolis Microservices for Jetson]]> https://developer.nvidia.com/blog/?p=76663 2024-06-17T16:36:14Z 2024-01-23T17:00:00Z NVIDIA Metropolis Microservices for Jetson has been renamed to Jetson Platform Services, and is now part of NVIDIA JetPack SDK 6.0. NVIDIA Metropolis...]]>

NVIDIA Metropolis Microservices for Jetson has been renamed to Jetson Platform Services, and is now part of NVIDIA JetPack SDK 6.0. NVIDIA Metropolis Microservices for Jetson provides a suite of easy-to-deploy services that enable you to quickly build production-quality vision AI applications while using the latest AI approaches. This post explains how to develop and deploy generative AI…

Source

]]>
Bhanu Pisupati <![CDATA[Build Vision AI Applications at the Edge with NVIDIA Metropolis Microservices and APIs]]> https://developer.nvidia.com/blog/?p=76684 2024-06-17T16:37:03Z 2024-01-23T17:00:00Z NVIDIA Metropolis Microservices for Jetson has been renamed to Jetson Platform Services, and is now part of NVIDIA JetPack SDK 6.0. NVIDIA Metropolis...]]>

NVIDIA Metropolis Microservices for Jetson has been renamed to Jetson Platform Services, and is now part of NVIDIA JetPack SDK 6.0. NVIDIA Metropolis microservices provide powerful, customizable, cloud-native APIs and microservices to develop vision AI applications and solutions. The framework now includes NVIDIA Jetson, enabling developers to quickly build and productize performant and…

Source

]]>
Raffaello Bonghi <![CDATA[Benchmarking Camera Performance on Your Workstation with NVIDIA Isaac Sim]]> https://developer.nvidia.com/blog/?p=76929 2024-10-23T21:12:50Z 2024-01-22T15:00:00Z Robots are typically equipped with cameras. When designing a digital twin simulation, it’s important to replicate their performance in a simulated environment...]]>

Robots are typically equipped with cameras. When designing a digital twin simulation, it’s important to accurately replicate their performance in a simulated environment. However, to make sure the simulation runs smoothly, it’s crucial to check the performance of the workstation that is running the simulation. In this blog post, we explore the steps for setting up and running a camera benchmark…

Source

]]>
1
Asawaree Bhide <![CDATA[Generate Synthetic Data for Deep Object Pose Estimation Training with NVIDIA Isaac ROS]]> https://developer.nvidia.com/blog/?p=75640 2024-08-06T20:51:29Z 2024-01-18T21:45:18Z For robotic agents to interact with objects in their environment, they must know the position and orientation of objects around them. This information describes...]]>

For robotic agents to interact with objects in their environment, they must know the position and orientation of objects around them. This information describes the six degrees of freedom (DOF) pose of a rigid body in 3D space, detailing the translational and rotational state. Accurate pose estimation is necessary to determine how to orient a robotic arm to grasp or place objects in a…
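To make the six-DOF idea concrete, here is a minimal sketch in plain Python. It is not taken from the post or from Isaac ROS: the `Pose6DOF` class, its field names, and the yaw-pitch-roll (Z-Y-X) rotation convention are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose6DOF:
    """Hypothetical container for a rigid-body pose (illustration only)."""
    # Translational state: 3 degrees of freedom (meters assumed)
    x: float
    y: float
    z: float
    # Rotational state: 3 degrees of freedom (radians, Z-Y-X convention assumed)
    roll: float
    pitch: float
    yaw: float

    def rotation_matrix(self):
        """Return R = Rz(yaw) * Ry(pitch) * Rx(roll) as a 3x3 nested list."""
        cr, sr = math.cos(self.roll), math.sin(self.roll)
        cp, sp = math.cos(self.pitch), math.sin(self.pitch)
        cy, sy = math.cos(self.yaw), math.sin(self.yaw)
        return [
            [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
            [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
            [-sp, cp * sr, cp * cr],
        ]

    def transform(self, point):
        """Map a point from the object frame to the world frame: R * p + t."""
        r = self.rotation_matrix()
        t = (self.x, self.y, self.z)
        return tuple(
            sum(r[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
        )


# An object shifted 1 m along x and rotated 90 degrees about the z axis
pose = Pose6DOF(x=1.0, y=0.0, z=0.0, roll=0.0, pitch=0.0, yaw=math.pi / 2)
world_point = pose.transform((1.0, 0.0, 0.0))
```

A pose estimator outputs these six numbers (or an equivalent form such as a translation plus a quaternion) for each detected object, which is what a grasp planner needs to orient a gripper.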

Source

]]>
0
Rishi Puri <![CDATA[Release: PyTorch Geometric Container for GNNs on NGC]]> https://developer.nvidia.com/blog/?p=76597 2024-06-06T16:17:50Z 2024-01-17T23:05:40Z The NVIDIA PyG container, now generally available, packages PyTorch Geometric with accelerations for GNN models, dataloading, and pre-processing using...]]>

The NVIDIA PyG container, now generally available, packages PyTorch Geometric with accelerations for GNN models, dataloading, and pre-processing using cuGraph-Ops, cuGraph, and cuDF from NVIDIA RAPIDS, all with an effortless out-of-the-box experience.

Source

]]>
0
Marc-Michael Horstmann <![CDATA[Simulating Railroads with OpenUSD]]> https://developer.nvidia.com/blog/?p=76567 2024-02-08T18:52:03Z 2024-01-17T21:00:00Z Railroad simulation is important in modern transportation and logistics, providing a virtual testing ground for the intricate interplay of tracks, switches, and...]]>

Railroad simulation is important in modern transportation and logistics, providing a virtual testing ground for the intricate interplay of tracks, switches, and rolling stock. It serves as a crucial tool for engineers and developers to fine-tune and optimize railway systems, ensuring efficiency, safety, and cost-effectiveness. Physically realistic simulations enable comprehensive scenario…

Source

]]>
0
Vishal Chavan <![CDATA[Robust Scene Text Detection and Recognition: Inference Optimization]]> https://developer.nvidia.com/blog/?p=74321 2024-11-14T15:43:46Z 2024-01-16T17:02:00Z In this post, we delve deeper into the inference optimization process to improve the performance and efficiency of our machine learning models during the...]]>

In this post, we delve deeper into the inference optimization process to improve the performance and efficiency of our machine learning models during the inference stage. We discuss the techniques employed, such as inference computation graph simplification, quantization, and lowering precision. We also showcase the benchmarking results of our scene text detection and recognition models…
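As a rough illustration of what quantization means here, the following plain-Python sketch maps floating-point values to 8-bit integers and back. It assumes a symmetric, per-tensor INT8 scheme; the teaser does not say which scheme or toolkit the authors used, and the function names `quantize_int8` and `dequantize` are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer representation."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
# Each restored value differs from the original by at most half a quantization step
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The trade-off is the one the post alludes to: integer arithmetic is cheaper at inference time, at the cost of a bounded rounding error per value.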

Source

]]>
3
Vishal Chavan <![CDATA[Robust Scene Text Detection and Recognition: Implementation]]> https://developer.nvidia.com/blog/?p=74323 2024-11-14T15:44:08Z 2024-01-16T17:01:00Z To make scene text detection and recognition work on irregular text or for specific use cases, you must have full control of your model so that you can do...]]>

To make scene text detection and recognition work on irregular text or for specific use cases, you must have full control of your model so that you can do incremental learning or fine-tuning as per your use cases and datasets. Keep in mind that this pipeline is the main building block of scene understanding, AI-based inspection, and document processing platforms. It should be accurate and have low…

Source

]]>
0
Vishal Chavan <![CDATA[Robust Scene Text Detection and Recognition: Introduction]]> https://developer.nvidia.com/blog/?p=74322 2024-11-14T15:45:25Z 2024-01-16T17:00:00Z Identification and recognition of text from natural scenes and images have become important for use cases like video caption text recognition, detecting signboards...]]>

Identification and recognition of text from natural scenes and images have become important for use cases like video caption text recognition, detecting signboards from vehicle-mounted cameras, information retrieval, scene understanding, vehicle number plate recognition, and recognizing text on products. Most of these use cases require near real-time performance. The common technique for text…

Source

]]>
0
Ricardo Monteiro <![CDATA[Video Encoding at 8K60 with Split-Frame Encoding and NVIDIA Ada Lovelace Architecture]]> https://developer.nvidia.com/blog/?p=75226 2024-01-22T22:08:04Z 2024-01-05T19:00:00Z Capturing video footage and playing games at 8K resolution with 60 frames per second (FPS) is now possible, thanks to advances in camera and display...]]>

Capturing video footage and playing games at 8K resolution at 60 frames per second (FPS) is now possible, thanks to advances in camera and display technologies. Leading multimedia companies including RED Digital Cinema, Nikon, and Canon have already introduced 8K60 cameras for both the consumer and professional markets. On the display side, with the newest HDMI 2.1 standard…

Source

]]>
0
Michelle Horton <![CDATA[New Release: NVIDIA TAO 5.2]]> https://developer.nvidia.com/blog/?p=75832 2023-12-20T21:39:11Z 2023-12-20T19:03:54Z With the latest NVIDIA TAO 5.2, you can now run zero-shot inference for panoptic segmentation with ODISE, create custom 3D object pose models, and boost...]]>

With the latest NVIDIA TAO 5.2, you can now run zero-shot inference for panoptic segmentation with ODISE, create custom 3D object pose models, and boost inference throughput for vision transformers using FasterViT. Download now.

Source

]]>
0
Michelle Horton <![CDATA[Most Popular NVIDIA Technical Blog Posts of 2023: Generative AI, LLMs, Robotics, and Virtual Worlds Breakthroughs]]> https://developer.nvidia.com/blog/?p=74885 2024-12-12T18:18:56Z 2023-12-19T17:50:21Z As we approach the end of another exciting year at NVIDIA, it's time to look back at the most popular stories from the NVIDIA Technical Blog in 2023....]]>

As we approach the end of another exciting year at NVIDIA, it’s time to look back at the most popular stories from the NVIDIA Technical Blog in 2023. Groundbreaking research and developments in fields such as generative AI, large language models (LLMs), high-performance computing (HPC), and robotics are leading the way in transformative AI solutions and capturing the interest of our readers.

Source

]]>
0