Posts

Showing posts with the label AI training data

Runway Trained AI Video Generator By Scanning YouTube

Exploring the Depths of AI Video Training

In artificial intelligence, and in video generation especially, the techniques and data sources used to train models matter a great deal. A recent investigation by 404 Media exposed how Runway, an AI video generation company, trained its latest Gen-3 model. According to the report, Runway used a large number of YouTube videos, many from popular photography channels, to develop and refine its video generation capabilities.

The Source of Runway’s Training Data

404 Media's investigation uncovered a spreadsheet, obtained from a former Runway employee, listing the YouTube channels, videos, and even pirated content used to train the Gen-3 model, codenamed Jupiter. The spreadsheet showed a concerted company effort to curate high-quality videos, including those of creators such as Kai Wong, Peter McKinnon, and Michael Shainblum. The meticulous nature ...

Big Tech Launches Campaign To Defend AI Use

How "Generate And Create" Highlights AI's Role In Modern Art

The Chamber of Progress, a tech industry coalition comprising giants like Amazon, Apple, and Meta, has unveiled a new campaign titled "Generate and Create." This initiative aims to defend the legality of using copyrighted works to train artificial intelligence (AI) systems, focusing on how AI can enhance creative processes and lower the barriers for producing art.

The Campaign's Vision and Goals

"Generate and Create" was launched to showcase the positive impact of generative AI on the art world. The campaign emphasizes two primary points: the enhancement of creative output by artists using AI and the reduction of barriers to art production. By spotlighting these benefits, the Chamber of Progress hopes to defend the longstanding legal principle of fair use under copyright law. Fair use allows the use of copyrighted material to create new, transformative works. This principle is crucial f...

What is AI? A Non-Technical Guide To Understanding Artificial Intelligence

WTF is AI?

Artificial Intelligence (AI) often conjures images of futuristic robots and self-aware machines. In practice, AI is software that approximates human thinking. It is not the same as human intelligence, but it can mimic certain cognitive functions, which makes it useful for a range of applications. This guide will help you understand how AI works, its potential pitfalls, and its current capabilities.

How AI Works

At its core, AI, a term often used synonymously with machine learning, involves creating algorithms that can detect patterns and make predictions. AI models don't "know" anything in the human sense; they excel at identifying and continuing patterns. A notable metaphor by computational linguists Emily Bender and Alexander Koller likens AI to a "hyper-intelligent deep-sea octopus" that detects patterns in communication without understanding the language itself. Large Language Models (LLMs), like those behind ChatGPT, function similarly. They map out language...

Why Do AI Models Have "Favorite" Numbers? Exploring The Human-Like Quirks Of AI

Artificial Intelligence Continues to Surprise with Human-Like Behavior

AI models are always surprising us, not just in what they can do, but also in what they can’t, and why. One newly observed behavior is both superficial and revealing about these systems: they pick random numbers as if they were human beings.

The Human Struggle with Randomness

First, what does it mean for AI to pick a number "randomly"? Humans are notoriously bad at generating random sequences. When people are asked to predict heads or tails for 100 coin flips, the sequences they produce often lack the randomness of real coin flips. We tend to avoid sequences that seem "too orderly," like six or seven heads in a row, even though such runs are statistically likely to occur. Similarly, when asked to pick a number between 0 and 100, people rarely choose 1 or 100. Numbers like 66 and 99 are avoided, and there’s a preference for numbers ending in 7 or those in the middle range.

AI Models and Their "Favorite" Numbers...

OpenAI Again Refuses to Say if It Used Your Content to Train Sora

Amidst growing scrutiny, OpenAI remains tight-lipped on the sources of its AI training data, leaving more questions than answers.

In a world increasingly dominated by artificial intelligence, the transparency of AI training data has become a contentious topic. OpenAI, a leader in the AI space known for innovations like DALL-E and ChatGPT, recently introduced its new AI video generator, Sora. However, the excitement was quickly overshadowed by concerns regarding the origins of the training data used in its development.

The Controversy Surrounding Sora's Training Data

The debate intensified after Mira Murati, OpenAI’s CTO, evaded questions during a March interview with the Wall Street Journal. Murati's vague assertion that Sora was trained on "publicly available" data did little to quell suspicions about the potential use of unethically sourced data. This incident is not isolated, as AI companies often face scrutiny over the ethical sourcing and transparency of their tr...