Main Page

From 4LeftFeet -- The DDR Robot Project

This "Robot" has absolutely nothing to do with the 4LeftFeet project.  But it looks really cool.

Welcome to the 4LeftFeet Wiki.

4LeftFeet -- The DDR Bot.

A project by Matt Durand
Contact:

Purpose

4LeftFeet is a stand-alone "robot" that utilizes artificial intelligence to play Dance Dance Revolution (DDR). 4LeftFeet will be able to play single player, and potentially multiplayer DDR, including against opponents online. The performance of 4LeftFeet is embodied by the kinetic display of the robot in action, as well as its interaction with gamers online via XBox Live from an XBox 360 game console.

Why

Coming from a QA / hacker background, something that has always interested me is automation. I find it challenging and exciting to find ways to exploit technology and software to accomplish things they were not originally intended to do, particularly when it comes to saving time with repetitive tasks, or performing complex operations that would otherwise be impossible for a human behind a keyboard and mouse (or other human interface device).

When it comes to DDR, the bare essence of the game is performing a repetitive task on a human interface device. Of course, it involves other elements including coordination, rhythm, timing, practice, and sweat. When I considered the time, patience, and effort that it takes for an individual to become "good" at DDR, I conceived the challenge to build a device that could replace the human element of the game, perform "well" at playing the game, and perhaps even perform in a manner that would be impossible for humans!

Technical

Sample DDR Gameplay:

Explanation of gameplay from Wikipedia.

4LeftFeet is embodied by two major components:

  • Video capture and image recognition
  • Signal output and actuation

Since 4LeftFeet will be playing DDR on an XBox or XBox 360 game console, no camera is needed to get the video signal into a computer for processing. Instead, I will digitize the console's S-Video output by connecting it to my Sony digital camcorder and using the camcorder's IEEE-1394 FireWire connection to my PC. This will send a WDM digital video stream to my video processor with minimal latency.

Once the signal is being routed to the computer (Windows XP box), the image will need to be processed. The purpose of the image processing is to identify "arrows" as they are involved in game-play, and calculate the exact timing that an arrow on the DDR game pad needs to be "stepped-on". I have chosen to use EyesWeb, a stand-alone video processing application, to analyse the video in real-time.

There are several obstacles that will make image recognition of the arrow difficult.

  • Moving, animated backgrounds
  • Contrast between arrows and background that varies from high to low
  • Fluctuating color of arrows
  • Varying speed (determined by song tempo) that notes rise on the screen
  • Potentially changing (or stopping) tempo mid-song
  • Combo multiplier is displayed on top of the arrows
  • "Accuracy" messages ("Marvelous!!", "Good", "Miss", etc.) are displayed on top of the arrows
DDR Receptors, as they appear at the top of the gameplay GUI

It should be possible to avoid these problems by simply ignoring all information on the screen other than the "receptors" (guide arrows at the top). The receptors are the only part of the game-play GUI that is totally unobstructed by status messages and that remains constant throughout. The video processor should identify when an arrow enters the receptor, and trigger an event at the precise moment that the receptor is totally filled by the arrow.
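As a rough illustration of the "receptor is totally filled" test, here is a minimal Python sketch. It is not the EyesWeb patch. It assumes each frame has already been thresholded to 0/1 pixels, and the function names, the mask layout, and the 0.95 threshold are all hypothetical.

```python
def fill_fraction(frame, receptor_mask):
    """Fraction of receptor pixels (mask == 1) that are lit (frame == 1)."""
    inside = 0
    lit = 0
    for frame_row, mask_row in zip(frame, receptor_mask):
        for pixel, in_receptor in zip(frame_row, mask_row):
            if in_receptor:
                inside += 1
                lit += pixel
    return lit / inside if inside else 0.0


def receptor_filled(frame, receptor_mask, threshold=0.95):
    """True once (almost) every receptor pixel is covered by an arrow.

    The threshold is an assumed tuning value, not something from the game.
    """
    return fill_fraction(frame, receptor_mask) >= threshold


# Tiny diamond-shaped receptor on a 3x3 grid (hypothetical layout).
mask = [[0, 1, 0],
        [1, 1, 1],
        [0, 1, 0]]
partly_covered = [[0, 1, 0],
                  [1, 1, 0],
                  [0, 0, 0]]
print(fill_fraction(partly_covered, mask))
```

Everything outside the mask is ignored, which is the point: backgrounds and status messages elsewhere on the screen never reach the comparison.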

Two basic arrow types found in DDR. A "freeze" arrow is on the left, a single regular arrow on the right
Additionally, the video recognition process should be able to determine whether the arrow is a "single arrow" or a "freeze arrow" (where the user-input is held for the duration of the arrow). This can be determined by monitoring the receptor "flash" as the note is pressed on the gamepad. A "single arrow" produces an instantaneous flash. A "freeze arrow" produces a long, constant flash. In either case, the end of the flash indicates that the note should be released. However, despite the problems this solution avoids, it may be difficult to identify multiple arrows that are very close together (in some cases they overlap each other) on more difficult songs / settings. If this proves to be problematic nearing the end of the project, the scope of 4LeftFeet may be reduced to play only on the less-difficult settings where the arrows do not overlap.


Basic sketch of the 4LeftFeet hardware component, consisting of a wooden platform, an array of four actuators, and its interface with the PC

Once the "arrow event" has been processed and triggered by the video processor, this signal will then be processed into an electrical signal, and trigger an actuator which will press against the DDR game pad. Amongst other possible solutions, I am currently considering the use of servos as the initial output from my EyesWeb patch. There exists a Phidget plugin for EyesWeb to directly control a 4-servo array. I intend to build an array of 4 pneumatic actuators (one per gamepad direction) assembled onto a platform made of wood. If I am controlling servos with EyesWeb, I will need to find a way to interface the servos to trigger each corresponding actuator. Aesthetically, I would like to affix a shoe to each actuator (left shoes only, of course). I might even paint the wooden platform. I might choose the color pink for this platform. I like pink.

I am going to attempt to consult with individuals who have experience with pneumatics who might be able to give me some pointers to assemble this system as quickly and efficiently as possible.






Progress

Week 1

04/03/2007

  • Class Introduction
  • Purchased parking permit for semester: $40
  • Discussed project ideas with Amy
  • Began research and feasibility study on tools suggested by Amy
  • "Acquired" tools and reference manuals required for project completion (totally legit)
  • Performed research on DDR game interfaces to identify challenges
  • Created "sample sketch" of 4LeftFeet Bot

This is where it all began. I had previously conceived the idea that "it would be cool" to develop a robot that could play DDR. This idea was fueled mainly through my desire to set impossible-to-obtain high-scores on difficult songs.

It wasn't until re-enrolling at UCSD and deciding to finish my Senior Project class that I decided this might actually make a cool pseudo-artistic project, and that I would actually dedicate myself to seeing it through until the end. Today is literally the first day I have spent conceiving technical details of accomplishing this.

04/04/2007

The Original 4LeftFeet Logo
  • Created 4LeftFeet WebPage
  • Designed 4LeftFeet Logo
  • Continued research and feasibility study on tools
  • Registered 4LeftFeet.com domain name: $5 private registration fee + $5.99/year after first year
  • Sent e-mail to SPAL USA regarding technical specifications for their 2-wire actuator
  • Digitized sample DDR game session using StepMania Open Source platform
  • Downloaded and began review of EyesWeb Compendium

I am just now beginning to evaluate the EyesWeb stand-alone software. To make this as controlled and easy as possible for the time being, I am electing to use the StepMania open-source DDR platform, running on my original XBox. Since StepMania is open-source, I have total control of many of the characteristics of the GUI, as well as the gameplay, songs, and stepfiles. I plan to remove all background variables for this test, so there is only a static, black background with no animations. I will also configure it to play a very basic song, with a slow BPM, and approximately one step every 1-2 seconds.

I will record this test song using my MiniDV video camera, and save it to my PC's HDD as an MPEG video file. This way I will have total control over the environment, and eliminate many variables that I don't want to deal with at this early stage in the project.

I have done some research on electric solenoids and actuators today. Commercial-grade actuators are very expensive, and unfortunately do not have the adequate response I am looking for. It appears that performance increases relative to the price of the actuator. On average, it seems a $200 electric linear actuator will be able to deliver 50-75 lb/ft pressure, with 1-2" stroke, and a speed of 0.5-3 seconds per inch. This is much too slow for my application, where I need to be precise +/- 10ms. Plus, this is a lot of money that I don't really want to spend. I have some experience with small, cheap electric solenoid actuators used for automotive applications (locking and unlocking doors). These solenoids can be acquired for ~$10 each. I am unsure of the technical details of these solenoids (pressure, speed, stroke), but I can make approximate guesses based on my experience with them. The brand of solenoid I used on my own car is SPAL. I have sent an email to SPAL USA requesting technical details of their 2-wire actuator. I suspect that a single actuator will not deliver enough force to move a real shoe up-and-down with any kind of respectable speed, so it may require multiple actuators to achieve the desired effect. Fortunately, these actuators are extremely inexpensive. If the shoe becomes problematic, I may decide to (begrudgingly) remove it from the project.
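To put the "much too slow" claim in numbers, here is a small worked calculation in Python using only the figures quoted above (1-2" stroke, 0.5-3 seconds per inch, +/- 10ms). The function name is made up for illustration, and treating the full stroke time as the relevant delay is a simplifying assumption.

```python
TOLERANCE_MS = 10.0  # the +/- 10 ms timing window quoted above


def stroke_time_ms(stroke_inches, seconds_per_inch):
    """Time for the actuator to travel its full stroke, in milliseconds."""
    return stroke_inches * seconds_per_inch * 1000.0


best_case_ms = stroke_time_ms(1.0, 0.5)   # shortest stroke, fastest speed
worst_case_ms = stroke_time_ms(2.0, 3.0)  # longest stroke, slowest speed

print(best_case_ms, worst_case_ms)
print(best_case_ms / TOLERANCE_MS)  # how many tolerance windows fit in the best case
```

Even the best case is a half-second stroke, about 50 times the timing window, and a press has to travel down and back up before the next step can be hit.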

04/05/2007

  • Learned EyesWeb interface well enough to display pre-digitized video in a contained window, and perform basic manipulation
  • Continued research of actuators
  • Began research of pneumatic systems and actuators
  • Created working EyesWeb patch to filter out receptor arrows
  • Uploaded sample video to Google Video


I'm running through some tutorials that I've found on the web for EyesWeb, and finding it very difficult to pick up. Many of the tutorials were designed for an older version of EyesWeb (3.3), and many things have changed in the GUI, layout, and functionality of the latest version (4.5). So far I haven't managed to do anything useful to the video. I've been able to run some pretty basic manipulation like flipping the image, separating the color channels, merging the color channels (sort of ... the ComposeChannels component seems sort of messed-up in version 4.5 vs. the behavior I saw in the 3.3 tutorial online).

By the end of the day I managed to successfully create some useful EyesWeb patches. The one that is most useful is able to split the video signal 4-ways (using the "Region Of Interest" module), then use a simple background subtraction method to filter out the background around the 4 receptors at the top of the DDR GUI. This allows EyesWeb to focus only on the receptors.
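The same idea can be sketched outside EyesWeb in a few lines of Python. This is only an approximation of the patch: it stands in a simple per-pixel mask for the background subtraction, and the helper names, box coordinates, and frame contents are all hypothetical.

```python
def crop(frame, x, y, width, height):
    """Cut a rectangular region of interest out of a frame (a list of rows)."""
    return [row[x:x + width] for row in frame[y:y + height]]


def apply_mask(region, mask):
    """Keep pixels where the mask is 1 and zero out everything else."""
    return [
        [pixel if keep else 0 for pixel, keep in zip(region_row, mask_row)]
        for region_row, mask_row in zip(region, mask)
    ]


def split_receptors(frame, boxes, masks):
    """Return one masked region per receptor, keyed by receptor name."""
    return {
        name: apply_mask(crop(frame, *boxes[name]), masks[name])
        for name in boxes
    }


# Toy 2x4 "frame" with two receptors; a real patch would use four.
frame = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
boxes = {"left": (0, 0, 2, 2), "right": (2, 0, 2, 2)}  # (x, y, width, height)
masks = {"left": [[1, 0], [0, 1]], "right": [[0, 1], [1, 0]]}
print(split_receptors(frame, boxes, masks))
```

Because the mask fixes the receptor locations ahead of time, nothing about the moving background has to be modelled at run time.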

I then spent a good amount of time getting a visual demonstration of this patch going, along with digitizing it and uploading it to Google Video. The video should be available tomorrow.

I received an email back from SPAL, and their feeling was that the car-lock actuator would not be sufficient for my project. They believe that the continuous and repetitive strain put on the device will cause it to overheat and fail. I was hoping to keep my system totally electric, but it looks like it may not be possible if I want to have the power, response, and durability I am looking for (not to mention budget). And thus begins my foray into the world of pneumatics. I spent a good 2 or 3 hours googling up as much as I could (in a field I have zero prior experience in). The good news is that pneumatic actuators are relatively inexpensive compared to their electric equivalents. They also seem to be much more reliable, and can (potentially) be much faster on the stroke (which means less latency). I didn't feel like spending a whole lot of time learning the technical nitty-gritty on pneumatics, so I started looking at robotics websites. Sure enough, I found websites selling complete pneumatic kits. It looks like I should be able to build a complete pneumatic actuator system for under $500. It's going to take some time to assemble and fine-tune a pneumatic system, but I feel "confident enough" that pneumatics is "the solution" to my actuator problem that I am going to put it off for the time being and focus more closely on the video processing.

04/06/2007

  • Google Video demonstration successfully uploaded and registered
  • Researched output methods from EyesWeb
  • Discovered and researched Phidgets
  • Began research on motion analysis methods using EyesWeb

Ta-da! Behold my awesome Google Video demonstration of my EyesWeb patch! Okay, so it's not super-interesting, but I'm really proud of it!


Description I typed in for the video on Google:


This is a working example of a simple patch I created in EyesWeb http://www.eyesweb.org/

I am taking a live video stream from StepMania (an open-source DDR platform) running on my XBox, and performing some basic operations to focus on the 4 "receptor arrows."

This is done by splitting the video signal 4 ways, each into its own region of interest, and then performing a background extraction using a mask file (that effectively pre-determines the exact locations of the receptors).

This should allow me to more-accurately perform motion analysis on the receptors.

This video processing is performed at 30fps in real-time. The video displayed in my patch is only for demonstrational purposes. The final working patch will not require any output to be displayed to the user.

See http://4LeftFeet.com for more information.

PS: The Audio/Video sync is off. I used CamStudio to screen-capture the video as the patch ran in EyesWeb, and unfortunately it produced a video that does not have a constant frame rate. The audio is totally irrelevant for the scope of this project anyways, but it makes the video more tolerable to watch. :)

I'll look for a screen-capture app that doesn't suck ...


Today I discovered something awesome called Phidgets. Phidgets are essentially inexpensive sensors and output devices that interface with a PC via USB. Okay, well that's cool, but what does it mean for 4LeftFeet? Someone took it upon himself to develop a Phidget servo motor control block for EyesWeb. Alright, so a servo may not be the most ideal output mechanism for my project, but so far it seems like a better option than any of EyesWeb's other output methods (UDP packet [requires another PC + latency issues], COM port [same problems], TCP [faster than UDP], Audio [ack ..], write-to-file, OSC [Open Sound Control]). I have a feeling that controlling a servo directly with EyesWeb may end up being the fastest way to trigger a pneumatic actuator. The 4-servo Phidget kit costs about $135 + shipping.

I performed some research into the technical specifications of various USB digitizing devices. The main products I inspected were the Turtle Beach Video Advantage USB and the KWorld Xpert DVD Maker USB 2.0, as well as a few others. The results were very disappointing. The devices digitize and output video to the PC with a latency of anywhere from 250ms to 3 seconds (!!!!). This is simply not acceptable, if I want 4LeftFeet to actually be successful. Latency needs to be at an absolute minimum, especially on the processing end. This means I am forced to find a way to implement FireWire video by any means necessary.

As it turns out, EyesWeb has an "Import WDM Video Stream" block. FireWire video happens to be a WDM Video Stream. Unfortunately, I had not been successful getting EyesWeb to recognize my video camera as a WDM video device... until now! It turns out I was having some driver issues that prevented EyesWeb from seeing my DV Camera. Once I took care of this, BAM! Digital video with undetectable latency in EyesWeb! I consider this to be the biggest breakthrough I have made on the project so far!

I am taking a self-imposed weekend break from my project. So don't expect any updates for the next few days.

Week 2

04/09/2007

  • Downloaded and installed EyesWeb Motion Capture Library

After great pains, I finally discovered how to install the EyesWeb Motion Capture Library. The lack of support for the product is just something that I am going to have to deal with in this project. There is a support forum, but posts go unanswered for months there. I posted a question there a few days ago, but I ended up answering it myself before it even got 10 views (and I'm willing to bet those were all search-engine spiders). To be fair, EyesWeb comes with a decent amount of good documentation and tutorials; however, there doesn't seem to be a lot of help on the web for it.

Anyways, here is the trick to getting the Motion Capture Library installed onto the latest version of EyesWeb (4.5.0.2 as of writing this). This will probably be of great help to those finding this page from Google :)

When attempting to install the Motion Capture Library, you will most likely encounter an error that "ipl.dll was not found" or something to that effect. In case you're wondering what ipl.dll could possibly be (like I was), it's the Intel Image Processing Library. Don't bother trying to find it, because Intel doesn't support it any more, and it is not on their website either. Don't bother trying to download the dll either. As it turns out, you cannot install the Motion Capture Library directly to EyesWeb 4.5.0.2. Instead, follow these steps in order:

  • Uninstall EyesWeb.
  • Install EyesWeb 4.0.2.0. You can find the download here (since it's not linked directly on the website).
  • Download and install the EyesWeb Motion Capture Library for EyesWeb 4 here.
  • Download and install the EyesWeb 4.5.0.2 Update.

Of course, it would be too convenient if everything worked perfectly from here, but it doesn't. Be mindful of your shortcuts, because you will then be left with BOTH EyesWeb 4.0.2.0 and 4.5.0.2 installed simultaneously (you'll see both folders in the Programs directory, as well as your Start Menu). Also be aware that uninstalling from Add/Remove Programs does not remove anything from the Programs directory. So what does the uninstall do anyways? I don't know. Anyways, be careful because it appears that you can still run the 4.0.2.0 version once you have "upgraded." I wouldn't recommend running it though, because you'll see some weird things when you run it. Example, from the Info dialog:

image:eyeswebinfo.png


I'm going to go out on a limb and say running the EyesWeb Open Software Platform version 4.5.0.2 with the 4.0.2.0 Development Environment and Kernel is probably a bad idea :)

And as midnight rolls around, I'm going to learn how to use the Motion Analysis Library (hopefully).

04/10/2007

  • Presented project to class
  • Determined milestones for week 3 and week 5 (midterm)
  • Played with Motion Analysis blocks in EyesWeb

04/11/2007

  • Played with signal path manipulation with EyesWeb
  • Downloaded and installed EyesWeb Phidget plugin

My biggest accomplishment today was getting some successful signal-path manipulation taken care of. I've taken a break from the motion-analysis stuff because I'm having a hard time with it, and my brain needs to focus on things I know I can accomplish for the time being. The signal processing is important because it's how I'm using EyesWeb to get from "There's an Arrow in my Receptor" to "Now I need to trigger an actuator." Rather than getting too heavy on the verbiage, I'll let the video example demonstrate what I've done:

Description I typed in for the video on Google:

This is an EyesWeb patch I created that manipulates a signal path once an event is triggered. For the context of this video, the event is triggered by a mouse-click on the "Bang Generator" button.

This signal is initially run through a "Counter" block. This is used to determine the number of mouse clicks. It is also used to trigger the "Actuator Down" condition. This is done by splitting the signal from the Counter, putting a 200ms delay on one of the signal paths, and then subtracting the two signals. The result is a 200ms window where a "true" (1) boolean condition can be sent to the actuator.

An "Actuator Up" condition is created by sending a separate "true" boolean condition whenever the "Actuator Down" condition is "false." Conversely, whenever the "Actuator Down" condition is "true," the "Actuator Up" condition will be "false." The default position of the actuator is "up" (up = true, or 1).

The "Number of Actuations" is calculated by counting the number of times the actuator changes positions. Starting from the default position, a single click will result in two actuations (one down, and one up).

I needed to put a switch into the circuit to ignore counting the "Number of Actuations" until the first input is received. This was done because the circuit initializes itself when the patch is started, and effectively actuates down and up for a single cycle.

For the context of my project, the event will be triggered when a DDR-arrow has fully occupied the space of a receptor, rather than a mouse click in the EyesWeb interface.
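The counter-minus-delayed-counter trick in that description can be simulated in a few lines of Python. This is a sketch of the logic only, not of the EyesWeb blocks themselves: it assumes events are given as timestamps in milliseconds, and the function names and the 10ms sampling step are made up for illustration.

```python
WINDOW_MS = 200  # length of the "Actuator Down" window described above


def counter_at(event_times_ms, t_ms):
    """Counter block: number of events received at or before time t_ms."""
    return sum(1 for event in event_times_ms if event <= t_ms)


def actuator_down(event_times_ms, t_ms, window_ms=WINDOW_MS):
    """Counter minus its delayed copy is non-zero for window_ms after each event."""
    now = counter_at(event_times_ms, t_ms)
    delayed = counter_at(event_times_ms, t_ms - window_ms)
    return now - delayed > 0


def actuator_up(event_times_ms, t_ms, window_ms=WINDOW_MS):
    """The up condition is simply the inverse of the down condition."""
    return not actuator_down(event_times_ms, t_ms, window_ms)


def count_actuations(event_times_ms, end_ms, step_ms=10):
    """Count position changes while sampling every step_ms, starting in the up position."""
    actuations = 0
    was_down = False
    for t_ms in range(0, end_ms + 1, step_ms):
        is_down = actuator_down(event_times_ms, t_ms)
        if is_down != was_down:
            actuations += 1
        was_down = is_down
    return actuations


clicks = [100, 1000]  # two events, 900 ms apart
print(count_actuations(clicks, 2000))
```

Starting the simulation in the up position plays the same role as the switch in the patch: no actuation is counted before the first real input arrives.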


I'm going to replace this webpage with a Wiki. Partly because I'm getting sick of coding HTML as I work on my project, and partly because this page is getting extremely cluttered and difficult to navigate. I'm currently uploading MediaWiki to my server. Hopefully it doesn't puke.

04/12/2007

  • Configured Wiki
  • Ported content over to wiki
  • Formatted everything to make it look nice

This is it! The new http://4leftfeet.com

I've spent pretty much all day getting this stupid thing set up, setting the permissions, adding some necessary (and some clever) extensions to do stuff I want, like embed google and youtube videos, obfuscate my email address so spambots can't find it, and automatically embed emoticons. Who could live without that :) :( ;) :o >:( 8)

The old http://4leftfeet.com will live here from now on, and exist only for archival purposes.


Week 3

04/17/2007

I haven't made any updates in the last few days, but I have been working very hard on motion tracking and event triggering in EyesWeb. I've made some substantial progress. After a brainstorming session on the phone with my brother, I came up with a very good idea for tracking arrows.

Receptors with "tracking zones" represented by red dots.

My previous ideas were to track the area inside the receptors, tracking the arrows as they go up the screen, or some combination of the two ideas. My new idea is to track a very small area, just outside the receptor. There are several benefits to using this technique:

  • Only a very small area needs to be tracked
  • Easily able to differentiate between single arrows and freeze arrows
  • Not affected by stopping-tempo mid-song (tempo is unlikely to stop while an arrow has started to penetrate the receptor)
  • System latency can be easily compensated by adjusting the y-value
  • Tempo changes can potentially be compensated by adjusting the y-value on-the-fly
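The y-value adjustment mentioned in the last two points is simple arithmetic. Here is a minimal Python sketch, assuming image coordinates where y grows downward (so a zone that sees a rising arrow earlier sits at a larger y) and assuming the scroll speed in pixels per second is known. The function name and every number in the example are hypothetical.

```python
def tracking_zone_y(receptor_y, scroll_speed_px_per_s, latency_ms):
    """Place the tracking zone far enough below the receptor to absorb latency.

    An arrow rising at scroll_speed_px_per_s covers this many pixels
    during latency_ms, so the zone is moved down by that amount.
    """
    offset_px = scroll_speed_px_per_s * latency_ms / 1000.0
    return receptor_y + round(offset_px)


# Hypothetical numbers: receptor edge at y=60, arrows rising at 300 px/s,
# 50 ms between "arrow seen" and "actuator has pressed the pad".
print(tracking_zone_y(60, 300, 50))

# If a tempo change doubles the scroll speed, the zone moves twice as far down.
print(tracking_zone_y(60, 600, 50))
```

Recomputing the offset whenever the measured scroll speed changes is one way the on-the-fly tempo compensation could work.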

The biggest advantage, by far, is being able to detect single arrows, freeze arrows, and even the dreaded overlapping arrows! This is possible because of the location I have chosen to track. The arrows in the game are essentially diamond-shaped. This means that if I am only looking for the outer edge of the arrow, there will be a short "blip" as the edge of the arrow passes by. This also means that I will be able to detect freezes (a longer blip), as well as doubles, triples, etc. where the arrows are overlapping each other.
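To make the "short blip versus longer blip" idea concrete, here is a small Python sketch. It assumes the tracking zone has already been reduced to one on/off sample per video frame. The function names and the 6-frame freeze threshold are assumptions chosen for illustration, not values taken from the patch.

```python
def find_blips(samples):
    """Return (start_frame, length_in_frames) for each run of 'on' samples."""
    blips = []
    start = None
    for index, lit in enumerate(samples):
        if lit and start is None:
            start = index                          # blip begins
        elif not lit and start is not None:
            blips.append((start, index - start))   # blip ends
            start = None
    if start is not None:                          # blip still active at the end
        blips.append((start, len(samples) - start))
    return blips


def classify_blip(length, freeze_min_frames=6):
    """Short blips are single arrows, long ones are freeze arrows (assumed cutoff)."""
    return "freeze" if length >= freeze_min_frames else "single"


# One single arrow, one freeze arrow, then two arrows in quick succession.
samples = [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0]
for start, length in find_blips(samples):
    print(start, length, classify_blip(length))
```

Closely spaced arrows still show up as separate blips as long as at least one "off" sample falls between their edges.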

Once I built my patch, I tested it out on an 8-footer, heavy difficulty (that's the highest difficulty ranking of a song in DDR). The execution was nearly flawless. I had a few glitches every now and then, but it was very good as a whole. The timing was almost perfect throughout, and the sample song had some of every "advanced" component to it, including freezes, doubles, triples (and higher, I think there are a few sections with 6+ overlapping arrows), as well as tempo stops and changes.

However, the major flaw with this is that my CPU usage is pretty much maxed out. I need to either figure out how to make my patch more efficient, or drop some serious money on a CPU upgrade. I was unable to take a screen capture of the patch in action due to the fact that my CPU was so slammed. Donations anyone? Seriously, I wonder if I can get corporate sponsorship for this :) I'd be more than willing to put your logo on my robot in exchange for an AMD FX-74 CPU (or similar) equipped PC!

04/18/2007

  • Performance tweaks
  • Modified method of pixel tracking
  • Tested UDP output from EyesWeb

I've come upon a big discovery today, which is improving my EyesWeb performance significantly. Almost every tutorial I've seen for EyesWeb uses the ROI block (Region Of Interest). This block restricts image operations to a cropped area of the original image. In theory, this improves performance over operating on the entire image when that isn't necessary. The problem is that the entire image is still being processed every time it runs through ROI-restricted blocks, even if the calculations/manipulations/analysis/whatever is being restricted to the ROI region. When I'm dealing with a 720x480 image, and I'm only interested in a 10x10 pixel subset of that image, that's a lot of pixels being processed that I'm not interested in.

As it turns out, there is a block called "extract" that basically performs an ROI *and* eliminates all the pixels that I don't care about! I tried a sample patch (my "smart" background subtraction patch), replacing ROI with Extract, and CPU usage for EyesWeb dropped from 38% to 20%! Almost 50% less overhead. I expect I should see similar results when I replace all of the ROIs with Extracts in my other patches.
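The difference can be illustrated with a toy Python model that counts how many pixels each approach touches. This is only a sketch of the behaviour described above, not of how EyesWeb implements either block. The function names, the box position, and the threshold level are hypothetical.

```python
def threshold_roi(frame, box, level):
    """ROI-style: walk the whole frame, but only change pixels inside the box."""
    x, y, w, h = box
    visited = 0
    out = []
    for row_index, row in enumerate(frame):
        new_row = []
        for col_index, pixel in enumerate(row):
            visited += 1
            inside = y <= row_index < y + h and x <= col_index < x + w
            new_row.append((255 if pixel >= level else 0) if inside else pixel)
        out.append(new_row)
    return out, visited


def threshold_extract(frame, box, level):
    """Extract-style: crop first, then process only the cropped pixels."""
    x, y, w, h = box
    visited = 0
    out = []
    for row in frame[y:y + h]:
        new_row = []
        for pixel in row[x:x + w]:
            visited += 1
            new_row.append(255 if pixel >= level else 0)
        out.append(new_row)
    return out, visited


frame = [[200] * 720 for _ in range(480)]  # uniform 720x480 test frame
box = (50, 40, 10, 10)                     # (x, y, width, height), hypothetical

_, roi_visited = threshold_roi(frame, box, 128)
_, extract_visited = threshold_extract(frame, box, 128)
print(roi_visited, extract_visited, roi_visited // extract_visited)
```

In this model the ROI version visits 345,600 pixels per frame and the extract version visits 100, so every later block in the chain also has far less data to move around.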

I've discovered a method of pixel tracking that is far superior to my previous method. I was previously utilizing EyesWeb's motion analysis package to track objects (in a 1-pixel space), then performing an object count operation. Essentially, any time a pixel was "white" it would determine there was an "object" there, and trigger an event.

I was able to improve the performance of my patch significantly by removing the motion analysis blocks completely and instead using the image-to-matrix block. This essentially assigns a number from 0-255 to every pixel in a selected area and sticks it in an array that is constantly updated. In my case, I only care about 1 pixel, and my background-subtracted image is run against a high threshold, which means the value of my pixel will either be 0 or 255. I am then able to easily use a single pixel as a switch to trigger an event.
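Here is a minimal Python sketch of the single-pixel switch, assuming the tracked pixel arrives as one 0-255 value per frame. The threshold of 200 and the function names are assumptions for illustration. The event fires only on the 0-to-255 transition so that one arrow produces one trigger.

```python
def binarize(value, threshold=200):
    """Apply a high threshold so the pixel is either 0 or 255."""
    return 255 if value >= threshold else 0


def rising_edges(pixel_values, threshold=200):
    """Return the frame indices where the pixel switches from 0 to 255."""
    events = []
    previous = 0
    for frame_index, value in enumerate(pixel_values):
        current = binarize(value, threshold)
        if current == 255 and previous == 0:
            events.append(frame_index)  # the switch just turned on
        previous = current
    return events


# Raw values of the tracked pixel over eight frames (made-up data).
pixel_values = [0, 30, 250, 255, 10, 0, 240, 199]
print(rising_edges(pixel_values))
```

Compared with tracking objects, this needs no state beyond the previous frame's value, which fits the lower CPU usage described above.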

Using the two methods I came up with today, I was able to completely rewrite my EyesWeb patch. It previously suffered from poor performance and maxed out my CPU at 100%; it now runs flawlessly at only 25-30% CPU usage. I put together a video of my patch in progress, and spent pretty much all day attempting to upload it to Google Video, which has not been cooperating.

04/19/2007

  • Google Video Demo Online!

My Google Video demonstration is finally online after much delay!

This video demonstrates my EyesWeb patch running in real-time to an 8-footer song, Heavy difficulty (the highest difficulty). The actuators are represented by the four black/white arrows in the bottom right corner of the screen. Each time an actuator is to be pushed down, a white arrow will flash in the appropriate square. In a few instances, the flash was not picked up in the video, due to the low framerate of Google Videos. There is also a counter below each arrow to show the number of actuations. Unfortunately that doesn't show up so well in the video either.

Timeline

  • Week 5: Video processing should be nearly complete (precise indication of when each arrow event occurs)
  • Week 7: Pneumatics, and all other hardware (including lumber) should be acquired
  • Week 9: All hardware and software should be completely constructed and interfaced ... work out the bugs