Category Archives: Blandet


Deep Learning


To dig even deeper into deep learning, please have a look at the technical report I wrote on my findings (PDF document).

I have had the pleasure of diving into the deep waters of deep learning and learned to swim around.

Deep learning is a topic in the field of artificial intelligence (AI) and a relatively new research area, although it builds on the popular artificial neural networks that supposedly mirror brain function. Research on artificial neural networks began with Frank Rosenblatt's development of the perceptron in the 1950s and 1960s. To further mimic the architectural depth of the brain, researchers wanted to train deep multi-layer neural networks – this, however, did not happen until Geoffrey Hinton introduced Deep Belief Networks in 2006.

Recently, the topic of deep learning has gained public interest. Large web companies such as Google and Facebook have focused research efforts on AI and an ever increasing amount of compute power, which has led to researchers finally being able to produce results that are of interest to the general public. In July 2012 Google trained a deep learning network on YouTube videos, with the remarkable result that the network learned to recognize both humans and cats, and in January this year Google successfully used deep learning on Street View images to automatically recognize house numbers with an accuracy comparable to that of a human operator. In March this year Facebook announced its DeepFace algorithm, which can match faces in photos with Facebook users almost as accurately as a human can.

To get some hands-on experience, I set up a Deep Belief Network using the Python library Theano, and by showing it examples of human faces I managed to teach it their features such that it could generate new, previously unseen samples of human faces.
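A Deep Belief Network is built by stacking Restricted Boltzmann Machines (RBMs) and training them greedily, layer by layer. As a rough illustration of the core training step – not the actual Theano code from the report – here is a minimal NumPy sketch of one contrastive-divergence (CD-1) update for a binary RBM; the function name, layer sizes and learning rate are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_vis, b_hid, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM.

    v0: batch of visible vectors (n_samples, n_visible), values in {0, 1}.
    W:  weight matrix (n_visible, n_hidden); b_vis, b_hid: bias vectors.
    """
    # Positive phase: infer hidden activations from the data
    h0_prob = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    # Negative phase: reconstruct the visibles, then re-infer the hiddens
    v1_prob = sigmoid(h0 @ W.T + b_vis)
    h1_prob = sigmoid(v1_prob @ W + b_hid)
    # Gradient approximation: data statistics minus reconstruction statistics
    W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / len(v0)
    b_vis += lr * (v0 - v1_prob).mean(axis=0)
    b_hid += lr * (h0_prob - h1_prob).mean(axis=0)
    return W, b_vis, b_hid
```

Stacking then amounts to training one RBM on the data, using its hidden activations as the "data" for the next RBM, and so on; generating faces runs the generative pass back down through the stack.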

The ORL Database of Faces contains 400 images of the following kind:


By training with these images, the Deep Belief Network generated these examples of what it believes a face looks like:


The variation in the head position and facial expressions of the dataset makes the sampled faces a bit blurry, so I wanted to try out a more uniform dataset.

The Extended Yale Face Database B consists of images like the following:


and in the cropped version we have 2414 images that are uniformly cropped to include just the faces.
Training the Deep Belief Network with this dataset, it generated these never-before-seen images that actually look like human faces. In other words: these images are entirely computer generated as a result of the deep learning algorithm. Based only on the input images, the algorithm has learned how to “draw” the human faces below:


Interns: Oculus Rift development challenge



During our internship at Hinnerup Net A/S we got a three-day challenge to try out the Oculus Rift SDK and see what we could come up with in such a short amount of time. Anything would be acceptable, as long as the Oculus Rift device was used and a working demo could be presented on it.

You would think something like: “What? Only three days?! With no initial knowledge of 3D programming and/or VR? With no experience with Oculus Rift? Forget it! It simply can’t be done!”. Well, as we will demonstrate, it’s far easier than you might expect to get up and running with the Oculus Rift device.

The first thing we did was to head to the Oculus Rift homepage and download the Software Development Kit. Then we plugged in the Rift and performed a calibration of the unit. After putting on the Oculus Rift device and spinning blindly around, we encountered a minor issue: the software proclaimed that we weren’t spinning enough. After a few extra tries with the same error, and getting mildly motion-sick, we decided that calibration probably wasn’t important since the Rift could already deliver a satisfactory picture, so we moved on.

We then proceeded to take a look at how we could make our own “demo” for the Oculus Rift. From the Oculus Developer pages we saw that both Unity and the Unreal Engine were supported. Based on what we had heard about Unity and our existing curiosity about it, we decided to go the Unity route and see what would happen. We downloaded the Rift SDK for Unity and accepted the offer of a one-month free trial of Unity Pro. The SDK came with a Unity demo project to test out how the integration worked.


After playing around for some time with the Unity SDK, we decided to download a Unity project that did not have Rift support, with the goal of adding Oculus Rift support to it. Using the supplied Unity package, this was pretty straightforward. All we had to do was either replace the standard Unity camera with the supplied Rift camera, or replace the player controller with a Rift controller, depending on the type of project. This was the first time we had worked with Unity, so it took some time to familiarize ourselves with the tools and the API.


We decided to download a demo from the Unity Asset Store called “Bootcamp” and modify it to support the Oculus Rift. We removed the player controller and replaced it with an “OVR player controller”, which was fairly easy to accomplish. After that we thought it would be fun to place a spaceship in the game, so we downloaded a spaceship model and added it to the project.

Scene before (click to enlarge):
Scene after (click to enlarge):

With that done, we wanted to try building something from scratch, so we inserted some terrain and blocked the edges with mountains. Shamelessly using parts of “Project: Space Shooter” from Unity’s Learn section, we added visible shots using a texture and one of the built-in shaders. To give the Rift controller the option of shooting, we had to go into the scripting section and make some minor modifications to the code.



The Update method is called on each active object for every frame generated. Lines 190 to 194 are what we had to add to make a player able to shoot. The nextShotTime, fireRate, shot and shotSpawn are fields we had to add. Only shot and shotSpawn are of any interest: shot is the actual GameObject that is spawned on line 192, and shotSpawn is an empty GameObject that we linked to the camera in relative position. That means the spawner follows the camera view, so shots spawn in the direction the player is looking. The Instantiate method basically just makes a new shot and spawns it at the position of shotSpawn.


So … in conclusion and to sum it all up: in a period of three days we went from knowing nothing about 3D programming, Virtual Reality, the Oculus Rift or Unity, to being able to modify an existing Unity project and make it take full advantage of the immersive experience an Oculus Rift Virtual Reality display can offer.


Bézier Spline Distance Glow

At Hinnerup Net we have, mostly just because we can, developed a WebGL shader that runs on the computer’s GPU (better known as the graphics processor). The shader continuously computes the distance from an arbitrary point to a dynamically changing quadratic Bézier curve.

Warning: formulas follow in massive quantities. If you want to go straight to the demo, jump to the bottom of the post! 😉

Within computer graphics there is no getting around Bézier curves. The underlying Bernstein polynomials have their roots in 19th-century approximation theory (Charles Hermite) and were introduced in their modern form by Sergei Bernstein in the early 20th century. The Bézier curve, which builds on them, became a reality in the 20th century when Pierre Bézier and Paul de Casteljau came up with the idea of using the partial sums of the Bernstein polynomial to scale an array of vectors (also called control points).

The equation for a Bézier curve of degree n is as follows:

P(t) = Σ (i = 0 … n) [ n! / (i! · (n-i)!) ] · (1-t)^(n-i) · t^i · Pi

P(t) is a point P that changes as a function of t.
t is a floating point value used in the interval 0.0 to 1.0.
n is an integer whose value is the number of control points minus 1.
i is an integer that runs from 0 to n.
P is an array of vectors, or control points, indexed by the integer i.
! is the factorial sign, the product of all positive integers from 1 up to and including the number itself. The factorial function n! can thus be expressed as:

n! = 1 · 2 · 3 · … · n

A Bézier curve of degree n has n+1 control points. In this case we want 3 control points, so the equation simply has to be solved for n = 2, which gives the following:

P(t) = (1-t)²·P0 + 2·(1-t)·t·P1 + t²·P2

Below is the code for computing a point on the Bézier curve:

vec2 getPositionOnBezierCurve(float t, vec2 p0, vec2 p1, vec2 p2) {
	float fOneMinusT = 1.0-t;
	vec2 pos = fOneMinusT*fOneMinusT*p0+2.0*fOneMinusT*t*p1+t*t*p2;
	return pos;
}
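Outside the shader, the same evaluation is easy to verify. Here is a direct Python port of the GLSL function above, using NumPy arrays in place of vec2:

```python
import numpy as np

def get_position_on_bezier_curve(t, p0, p1, p2):
    """Evaluate the quadratic Bezier curve P(t) = (1-t)^2*P0 + 2(1-t)t*P1 + t^2*P2."""
    one_minus_t = 1.0 - t
    return one_minus_t * one_minus_t * p0 + 2.0 * one_minus_t * t * p1 + t * t * p2
```

As a quick check, t = 0.0 and t = 1.0 return the first and last control point exactly, as the text below discusses.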

If the control points are not multiplied onto each of the partial sums of the Bernstein polynomial, the 3 curves of t look like this:

If the 3 control points are multiplied onto the 3 curves of t, the result is a quadratic Bézier curve.

Finding the distance between an arbitrary point Q and this quadratic Bézier curve is manageable if t is 0.0 or 1.0, since P(t) then simply yields the first or the last, already known, control point. But since the shader has to work with an arbitrary point Q and an arbitrary quadratic Bézier curve, a bit more formula galloping is required. To compute the shortest distance between an arbitrary point Q and a point P(t) on the Bézier curve, the tangent of P(t) is needed. The tangent is computed by taking the derivative of the function P(t):

P′(t) = 2·(1-t)·(P1-P0) + 2·t·(P2-P1)

With A = P1-P0 and B = P0-2·P1+P2, the equation for the tangent of P(t) can be rewritten as:

P′(t) = 2·((1-t)·A + t·(A+B))

This can be reduced further to:

P′(t) = 2·(A + t·B)

The shortest distance between an arbitrary point Q and a point P(t) on the Bézier curve is found where the dot product of the vector Q-P(t) and the tangent at the point P(t) equals 0:

(Q - P(t)) · P′(t) = 0

As is well known, the dot product is proportional to the cosine of the angle between the two vectors, and cosine is zero at 90 degrees. In other words, the equation finds the value of t at which the vector Q-P(t) is orthogonal to the tangent at the point P(t). Precisely this point has the shortest distance from the point Q to the Bézier curve.

When the equation is solved for t, it becomes a cubic equation:

a·t³ + b·t² + c·t + d = 0

with the coefficients

a = B·B
b = 3·(A·B)
c = 2·(A·A) + (P0-Q)·B
d = (P0-Q)·A


Expressed in code, the above looks like this:

vec2 dP0Q = p0-q;
vec2 A = p1-p0;
vec2 B = p0+p2-p1*2.0;
float a = dot(B,B);
float b = dot(A,B)*3.0;
float c = dot(A,A)*2.0+dot(dP0Q, B);
float d = dot(dP0Q,A);
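To convince yourself that these coefficients are right, you can check numerically that every real root t of the cubic makes Q-P(t) orthogonal to the tangent. A small Python check, with an arbitrary (made-up) point and control points:

```python
import numpy as np

# Made-up query point q and control points, for illustration only
q  = np.array([0.7, 0.3])
p0 = np.array([0.0, 0.0])
p1 = np.array([0.5, 1.0])
p2 = np.array([1.0, 0.0])

dP0Q = p0 - q
A = p1 - p0
B = p0 + p2 - 2.0 * p1

# Cubic coefficients, mirroring the shader code above
a = np.dot(B, B)
b = 3.0 * np.dot(A, B)
c = 2.0 * np.dot(A, A) + np.dot(dP0Q, B)
d = np.dot(dP0Q, A)

for t in np.roots([a, b, c, d]):
    if abs(t.imag) < 1e-7:            # only real roots are geometrically meaningful
        t = t.real
        P = (1 - t)**2 * p0 + 2 * (1 - t) * t * p1 + t**2 * p2
        tangent = 2.0 * (A + t * B)   # P'(t)
        # Q - P(t) must be orthogonal to the tangent at every critical point
        assert abs(np.dot(q - P, tangent)) < 1e-6
```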

There are various ways of solving a cubic equation. We chose to use Cardano’s method.

Expressed in code, Cardano’s method becomes the following:

#define EPSILON 0.000000001
#define PI 3.14159265358979
int findRoots(float a, float b, float c, float d, out float r[3]) {
	if (abs(a) > EPSILON) {
		float z = 1.0/a;
		float d3 = 1.0/3.0;
		float d27 = 1.0/27.0;
		a = b*z;
		b = c*z;
		c = d*z;
		float p = b-a*a*d3;
		float q = a*(2.0*a*a-9.0*b)*d27+c;
		float ppp = p*p*p;
		float D = q*q+4.0*ppp*d27;
		float delta = -a*d3;
		if (D > EPSILON) {
			z = sqrt(D);
			float u = (-q+z)*0.5;
			float v = (-q-z)*0.5;
			u = sign(u)*pow(abs(u),d3);
			v = sign(v)*pow(abs(v),d3);
			r[0] = u+v+delta;
			return 1;
		} else if (D < -EPSILON) {
			float u = sqrt(-p*d3)*2.0;
			float v = acos(-sqrt(-27.0/ppp)*q*0.5)*d3;
			r[0] = u*cos(v)+delta;
			r[1] = u*cos(v+2.0*PI*d3)+delta;
			r[2] = u*cos(v+4.0*PI*d3)+delta;
			return 3;
		} else {
			q = sign(q)*pow(abs(q)*0.5,d3);
			r[0] = 2.0*-q+delta;
			r[1] = q+delta;
			return 2;
		}
	} else {
		if (abs(b) <= EPSILON && abs(c) > EPSILON) {
			r[0] = -d/c;
			return 1;
		} else {
			float D = c*c-4.0*b*d;
			float z = 1.0/(2.0*b);
			if (D > EPSILON) {
				D = sqrt(D);
				r[0] = (-c-D)*z;
				r[1] = (-c+D)*z;
				return 2;
			} else if (D > -EPSILON) {
				r[0] = -c*z;
				return 1;
			}
		}
	}
	return 0;
}
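To sanity-check the solver outside the shader, here is a line-for-line Python port of the function above (same branches, same epsilon; it returns a list of roots instead of writing to an out-array):

```python
import math

EPSILON = 1e-9

def find_roots(a, b, c, d):
    """Real roots of a*t^3 + b*t^2 + c*t + d = 0 via Cardano's method.
    Returns a list of 0 to 3 roots, mirroring the GLSL findRoots above."""
    d3, d27 = 1.0 / 3.0, 1.0 / 27.0
    if abs(a) > EPSILON:
        z = 1.0 / a
        a, b, c = b * z, c * z, d * z          # normalize to a monic cubic
        p = b - a * a * d3
        q = a * (2.0 * a * a - 9.0 * b) * d27 + c
        ppp = p * p * p
        D = q * q + 4.0 * ppp * d27
        delta = -a * d3
        if D > EPSILON:                        # one real root
            z = math.sqrt(D)
            u = (-q + z) * 0.5
            v = (-q - z) * 0.5
            u = math.copysign(abs(u) ** d3, u)
            v = math.copysign(abs(v) ** d3, v)
            return [u + v + delta]
        elif D < -EPSILON:                     # three real roots (trigonometric form)
            u = math.sqrt(-p * d3) * 2.0
            v = math.acos(-math.sqrt(-27.0 / ppp) * q * 0.5) * d3
            return [u * math.cos(v) + delta,
                    u * math.cos(v + 2.0 * math.pi * d3) + delta,
                    u * math.cos(v + 4.0 * math.pi * d3) + delta]
        else:                                  # repeated root
            q = math.copysign(abs(q * 0.5) ** d3, q)
            return [-2.0 * q + delta, q + delta]
    if abs(b) <= EPSILON and abs(c) > EPSILON:  # degenerates to a linear equation
        return [-d / c]
    D = c * c - 4.0 * b * d                     # degenerates to a quadratic
    z = 1.0 / (2.0 * b)
    if D > EPSILON:
        D = math.sqrt(D)
        return [(-c - D) * z, (-c + D) * z]
    elif D > -EPSILON:
        return [-c * z]
    return []
```

For example, t³ - 6t² + 11t - 6 = 0 factors as (t-1)(t-2)(t-3), and `sorted(find_roots(1.0, -6.0, 11.0, -6.0))` does indeed come out close to `[1.0, 2.0, 3.0]`.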

Solving the cubic equation yields between 0 and 3 solutions. If there is more than 1 solution, only the solution where t lies within the interval 0.0 to 1.0 is used. Once the right value of t has been found, it is inserted into the original equation for a quadratic Bézier curve to compute the point K, which is the point on the curve closest to Q. Finally, the actual distance g between K and Q is computed with the Pythagorean theorem.


The code that computes K, ensures that t lies in the interval 0.0 to 1.0, and computes the distance between K and Q looks like this:

float g = distance(q,getPositionOnBezierCurve(clamp(r[0],0.0,1.0),p0,p1,p2));
g = min(g, distance(q,getPositionOnBezierCurve(clamp(r[1],0.0,1.0),p0,p1,p2)));
g = min(g, distance(q,getPositionOnBezierCurve(clamp(r[2],0.0,1.0),p0,p1,p2)));
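The whole pipeline can be cross-checked in Python against a brute-force minimum over a densely sampled curve. This sketch mirrors the clamp-and-min logic above, with the same made-up test values used throughout; a disagreement here would point to a bug in the coefficients or the root handling:

```python
import numpy as np

q  = np.array([0.7, 0.3])              # made-up query point
p0 = np.array([0.0, 0.0])              # made-up control points
p1 = np.array([0.5, 1.0])
p2 = np.array([1.0, 0.0])

def bezier(t):
    """Quadratic Bezier curve; accepts a scalar t or an array of t values."""
    t = np.asarray(t, dtype=float)[..., None]
    return (1 - t)**2 * p0 + 2 * (1 - t) * t * p1 + t**2 * p2

# Analytic route: clamp every real cubic root to [0, 1], keep the smallest distance
dP0Q = p0 - q
A, B = p1 - p0, p0 + p2 - 2.0 * p1
coeffs = [np.dot(B, B), 3.0 * np.dot(A, B),
          2.0 * np.dot(A, A) + np.dot(dP0Q, B), np.dot(dP0Q, A)]
ts = [np.clip(t.real, 0.0, 1.0) for t in np.roots(coeffs) if abs(t.imag) < 1e-7]
g = min(np.linalg.norm(q - bezier(t)) for t in ts)

# Brute-force route: dense sampling of the curve
samples = bezier(np.linspace(0.0, 1.0, 100001))
g_brute = np.min(np.linalg.norm(samples - q, axis=1))

assert abs(g - g_brute) < 1e-6
```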

The distance value g is used to index into the full color spectrum and is thereby translated into a single color for each pixel on the screen. The code to do this is:

float white = max(5.0-pow(g,2.0),0.0);
float red = max(-0.0007*pow(g-40.0,2.0)+1.0,0.0);
float yellow = max(-0.0007*pow(g-75.0,2.0)+1.0,0.0);
float green = max(-0.0007*pow(g-110.0,2.0)+1.0,0.0);
float cyan = max(-0.0007*pow(g-145.0,2.0)+1.0,0.0);
float blue = max(-0.0007*pow(g-180.0,2.0)+1.0,0.0);
vec4 c = vec4(white,white,white,0);
c += vec4(red,0,0,0);
c += vec4(yellow,yellow,0,0);
c += vec4(0,green,0,0);
c += vec4(0,cyan,cyan,0);
c += vec4(0,0,blue,0);
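Each band is just a shifted parabola in the distance value, so nearby distances blend smoothly between neighboring colors. Transcribed to Python (with the distance argument named `dist`), the mapping looks like this:

```python
import numpy as np

def distance_to_color(dist):
    """Map a distance value to an RGB color, mirroring the shader bands above."""
    white  = max(5.0 - dist**2, 0.0)                     # only near the curve itself
    red    = max(-0.0007 * (dist - 40.0)**2 + 1.0, 0.0)  # band peaking at dist = 40
    yellow = max(-0.0007 * (dist - 75.0)**2 + 1.0, 0.0)
    green  = max(-0.0007 * (dist - 110.0)**2 + 1.0, 0.0)
    cyan   = max(-0.0007 * (dist - 145.0)**2 + 1.0, 0.0)
    blue   = max(-0.0007 * (dist - 180.0)**2 + 1.0, 0.0)
    # Sum the band contributions per channel, as the vec4 additions do
    return np.array([white + red + yellow,
                     white + yellow + green + cyan,
                     white + cyan + blue])
```

At dist = 40 the red band is at its maximum of 1.0 while the blue channel is still zero, so pixels at that distance glow red with a slight yellow tint.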

If your browser supports WebGL, you can see a realtime version below.