Nov 14

When you study computers-science at the University of Paderborn you get taught the Java programming language in your first two semesters (which you might already know by then). Afterwards you get a brief introduction on other programming paradigms and the corresponding languages: C, Pascal, Lisp, ML, SmallTalk, SQL, Prolog, …

Starting from your third semester you are expected to learn any arbitrary programming language by yourself. That’s it. I don’t know if this equations works out or not. But if you encounter someone who claims to be a “software-architect” and did study in Paderborn and considers programming to be beneath him,… well I think you catch my drift.

I can only pity all people who consider programming as beneath them, because they miss out on one of the important lessons of computer science. Language does define how we think. If we can express some topic in such a way, that someone other with a certain background on that topic can grasp and understand it, then we have truly conceived that topic ourselves. Here the computer is nothing more than an unforgiving and impolite listener. You need to be concise and exact or everything will blow.
If all programming languages were equivalent from the programmers point of view, everything would still be written in Assembler. This is where we Nerds who collect programming languages get our revenge. If you now still think of programming as something unworthy of a gentleman-engineer, well then: Fuck you very much.

I recently held a small workshop on Python at the PC^2. Here are the slides for the language introduction. I will post the slides of the project and it’s source code soon. Nonetheless my ambition for professionality in holding this workshop, I am still an undergrad student. A student who collects programming languages and has some expertise on this field, but still a student.

Now some fellow student of computer science asks whether he can have a certificate for attending this workshop. For his CV most probably. Does this guy want to ridicule himself? From the third semester he was expected to learn any arbitrary programming language in approx. one week and now he wants to have a certificate which states that took a workshop, held by a fellow undergrad, in order to learn a language as simple as Python.

His request might be correlated to the fact that in germany, more attention is paid to titles and certificates than to real skills. A behaviour which certainly applies to job applications. This feels like a Dilbert Comic, or a sketch from Loriot, that I now have to hand out certificates to fellow students, so they can pimp their CV with yet another certificate. I mean, where’s the justice? Who gives me a certificate for holding such a workshop, or even certifying my expertise in the Python language? Sadly: no one. This leaves me in the tricky question how I am going to pimp my CV. A difficult question indeed.

This is as disturbing as it is ridiculous and since we recently celebrated the 85th birthday of of Loriot, whose subtle, insightful and gentle humour has still to be surpassed in germany, I have taken the time of hand crafting a certificate of attendance for this workshop I held and I will happily hand it out to every attendee who want’s to add it to his collection. I hope you will find it as artistically pleasing as informatory. Behold:

Edit: I have redone that certificate, because the photocopier had problems with the original one.

P.S.: I hope you all have one of those valuable “Yodel Diplomas”.


Sep 13

At the moment I’m preparing a workshop for the Python programming language and thus I’m looking for interesting examples. They should be short and sweet and I already stumbled over The sieve of Erastosthenes. Unfortunately, I was a bit keen to present some advanced data structures from Python. Because of that the performance of my implementation was bad, really bad. Frank Hellweg, a friend of mine, and someone who excels at theoretical computer-science had a little optimization-session with me, resulting in a total speedup of roundabout 30000. This was my first approach of implementing the sieve:

  1. def sieve1(N):
  2.     primes = set()
  3.     numbers = set(range(2,N))
  4.     while numbers:
  5.         current_prime = min(numbers)
  6.         multiples = set([n for n in numbers if not n % current_prime])
  7.         numbers -= multiples
  8.         primes.add(current_prime)
  9.     return primes

If you are a little bit into computational complexity and python, you should spot several problems with performance:

  • I’m using min(...) which has a runtime of O(N) in the number of elements in the set. So we are doing an O(N) operation every while-loop. Not good.
  • The outer while-loop runs, until all numbers are removed from the set. Here some unnecessary loops might be made.

Frank pointed these problems out and I made a futile attempt at reducing the runtime cost of min() by using lists instead of sets. Since now the lists are ordered, min() can be done in O(1).

  1. def sieve2(N):
  2.     primes = list()
  3.     numbers = range(2,N)
  4.     while numbers:
  5.         current_prime = min(numbers)
  6.         numbers = [n for n in numbers if n % current_prime]
  7.         primes.append(current_prime)
  8.     return primes

I compared these two approaches with the cProfile module, for sieves up to 100000 and got the following results:

   38375 function calls in 63.917 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   63.917   63.917 :1()
        1   24.233   24.233   40.626   40.626 erastosthenes.py:10(sieve1)
        1   18.460   18.460   23.277   23.277 erastosthenes.py:20(sieve2)
        1    0.014    0.014   63.917   63.917 erastosthenes.py:52(main)
     9592    0.024    0.000    0.024    0.000 {method 'add' of 'set' objects}
     9592    0.016    0.000    0.016    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    19184   21.163    0.001   21.163    0.001 {min}
        2    0.007    0.004    0.007    0.004 {range}

We can already see a speedup of approximately 50% from sieve1 to sieve2 but Frank then proposed an even faster approach. It uses an array of boolean values, whose indices represent the corresponding numbers. If ‘x’ is prime, the value at that position in the array is True. The algorithm now sweeps over the array and looks for already found prime numbers up to sqrt(N), if a number is marked as prime (i.e. array field is set to true) we mark all it’s multipliers as not prime. This can be done easily with a second for-loop. The array is initialized with all values as True and we start with 2, mark all even numbers as non-prime and go to 3. Now 3 is found to be prime and all it’s multipliers are marked as non-prime, then 4 is found as non-primed and skipped,… and this is continued up to sqrt(N).

At first I could not grasp why we only need to look until sqrt(N). But consider that we have reached sqrt(N), and marked all non-prime numbers according to our algorithm. Frank now told me boldly that at this point there are only correctly marked primes left in the array above sqrt(N) and up to N. This is true, even if it’s not obvious at first. But consider the following: every non-prime number greater than sqrt(N) can be factorized into two primes, where one prime is smaller than sqrt(N), otherwise we have a non-prime number which is greater than sqrt(N)*sqrt(N) = N, so it’s not in our search range. So if we have correctly found all primes up to sqrt(N) and marked all their multipliers as non-primes, only correctly marked primes greater sqrt(N) and smaller N remain.

  1. from array import array
  2.  
  3. def sieve3(N):
  4.     numbers = array('b', [1] * N)
  5.     sN = int(math.sqrt(N))
  6.  
  7.     for i in xrange(2,sN):
  8.         if numbers[i]:
  9.             for j in xrange(2*i, N, i):
  10.                 numbers[j] = 0
  11.  
  12.     return [i for i in xrange(2,N) if numbers[i]]

This algorithm does not need high-level data structures like linked-lists, so we tried to get the most simple array which could be found in Python and hence tried the array module. But for reference we also tried the standard lists of Python.

  1. def sieve4(N):
  2.     numbers = [1] * N
  3.     sN = int(math.sqrt(N))
  4.  
  5.     for i in xrange(2,sN):
  6.         if numbers[i]:
  7.             for j in xrange(2*i, N, i):
  8.                 numbers[j] = 0
  9.  
  10.     return [i for i in xrange(2,N) if numbers[i]]

These are our performance results:
   38379 function calls in 98.769 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   98.769   98.769 :1()
        1   37.161   37.161   60.856   60.856 erastosthenes.py:10(sieve1)
        1   29.776   29.776   37.521   37.521 erastosthenes.py:20(sieve2)
        1    0.211    0.211    0.211    0.211 erastosthenes.py:29(sieve3)
        1    0.120    0.120    0.120    0.120 erastosthenes.py:40(sieve4)
        1    0.061    0.061   98.769   98.769 erastosthenes.py:52(main)
        2    0.000    0.000    0.000    0.000 {math.sqrt}
     9592    0.027    0.000    0.027    0.000 {method 'add' of 'set' objects}
     9592    0.017    0.000    0.017    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    19184   31.387    0.002   31.387    0.002 {min}
        2    0.009    0.005    0.009    0.005 {range}

Now the speedup is quite gigantic, at a factor of roundabout 180. Interestingly, the standard Python lists are two times faster than the arrays provided by the array module.I must have missed the point of that module. But we wanted to go down to the bare metal and compare the performance with a C program.

Due to the fact, that all you need is an array and two for-loops, we tried to implement the algorithm in C and embed it as a module in Python, so we could make a better comparison. The most difficult part here was getting the algorithm properly wrapped and packed into a .so-library usable from Python. We saved the following source code in a file called erastomodule.c and compiled it into a .so-library for dynamic linking.

  1. #include "Python.h"
  2. #include <math .h>
  3. #include <stdlib .h>
  4. #include <string .h>
  5. #include <stdio .h>
  6.  
  7. static PyObject *
  8. erasto_sieve(PyObject *self, PyObject *args)
  9. {
  10.     unsigned long N = 0;
  11.     unsigned char *numbers = NULL;
  12.  
  13.     PyObject *primes = NULL;
  14.     PyObject *prime = NULL;
  15.  
  16.     unsigned long i,j;
  17.  
  18.     /* Parse the given parameter */
  19.     if (!PyArg_ParseTuple(args, "k", &N))
  20.         return NULL;
  21.  
  22.  
  23.     /* Initialize datastructure for sieve */
  24.     numbers = malloc(sizeof(unsigned char) * N);
  25.     memset(numbers, 1, N);
  26.    
  27.     /* This is the actual sieve !!! */
  28.     for (i = 2; i < sqrt(N); i++) {
  29.         if (numbers[i]) {
  30.             for(j = 2*i; j < N; j+=i) {
  31.                numbers[j] = 0;
  32.             }
  33.         }
  34.     }
  35.  
  36.     /* create a new PyList */
  37.     primes = PyList_New(0);
  38.     if (primes == NULL)
  39.         return NULL;
  40.  
  41.     /* Put primes into list */
  42.     for(i = 2; i < N; i++) {
  43.         if (numbers[i]) {
  44.             prime = Py_BuildValue("k", i);
  45.             if(PyList_Append(primes, prime))
  46.                 return NULL;
  47.         }
  48.     }
  49.  
  50.     return primes;
  51. }
  52.  
  53. static PyMethodDef ErastoMethods[] = {
  54.     {"sieve",  erasto_sieve, METH_VARARGS, "Sieve of Erastosthenes."},
  55.     {NULL, NULL, 0, NULL}        /* Sentinel */
  56. };
  57.  
  58.  
  59. PyMODINIT_FUNC
  60. initerasto(void) {
  61.     PyObject *m;
  62.  
  63.     m = Py_InitModule("erasto", ErastoMethods);
  64.     if (m == NULL)
  65.         return;
  66. }

Then we could import it like a normal module from Python and use the function we called erasto.sieve for our computation. The following is a comparison of sieve4 and the erasto.sieve function we implemented in C. Again we have a speedup of roundabout 180, resulting in a total speedup of roundabout 32400 when compared to my initial approach.

   6 function calls in 0.149 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.149    0.149 :1()
        1    0.126    0.126    0.126    0.126 erastosthenes.py:40(sieve4)
        1    0.017    0.017    0.149    0.149 erastosthenes.py:52(main)
        1    0.006    0.006    0.006    0.006 {erasto.sieve}
        1    0.000    0.000    0.000    0.000 {math.sqrt}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

The conclusion should be clear:

  • Think twice about your algorithm
  • Profile mercilessly
  • And port expensive functions to C

Happy optimizing!

Apr 5

The more I use Python, the more it becomes my favourite Programming Language. Two days ago I started learning 3D-Programming with OpenGL.
The Python bindings proved to be fast and well documented and everything was quite nice and easy. But as I’m taking a course on 3D-Programming at University (called Computergraphics 2) I was forced to port all 3 Assignments already done in Python to JOGL. And it proved to be more of a pain in the ass than I had expected. Just compare this lines of Java:

float light0_position[] = { 2.0f, 5.0f, 1.0f, 1.0f };
gl.glLightfv(GL.GL_LIGHT0, GL.GL_POSITION, light0_position, 0);

whit this line of Python:


glLightfv(GL_LIGHT0, GL_POSITION, (1.0, 1.0, 1.0))

Do you recognize the significant reduction of namespace-clutterage in Python. There’s no
`gl.glSomeFunction` just `glSomeFunction`. Less is more, I do say. Thats due to Pythons ingenious
System of importing things into the local namespace. Sure, you have to be careful, but for things like OpenGl it’s
dead useful and you do not need to fear name-collisions: OpenGL origins from C, which is totally namespace-ignorant, thus
everything carries an unique identifier. This is almost obvious when you take a look at constants and functions: its like the
namespace was pulled into the name itself.

The usefulness of namespaces arises from the possibility that you can load stuff from one namespace into yours and thus omit
the necessity of prefixing things with the considered namespace. Java itself is ignorant regarding Module-Functions. Either you
use Object-Methods or Class-Methods and hence you need to prefix those with the Object or Class.

What bothers me a little is the fact that the JOGL-Developer did not try to port the OpenGL-API a little more conveniently to Java.
On one Hand they could have stripped all the Class-Constants of the GL-Class from their GL thus transforming a `GL.GL_SOME_CONSTANT`
to just `GL.SOME_CONSTANT` and secondary they could have done this to the gl-Objects too, and (I think you are already guessing it)
transforming `gl.glSomeFunction` to just `gl.someFunction`.

And by the way, the PythonOpenGL-API-Reference is way better than the JOGL-Reference.

That’s the rant for today. I will need to find a way to get comfortable with JOGL, since regarding my course I will have no other choice.
Cruel World!


Nov 15

At the moment I’m at a most painful taskt of converting different Java types into byte-array for sending over the network.
I need to remember that Java stores everything in a 2complement notation.

But to handle all that stuff I’ve written my own Byte-O-Rama Class in Java. It’s a shit that Java isn’t even able to encode
its basic data-types into byte[] in a simple and convenient manner. I don’t talk ByteBuffers/writers and stupid stuff i talk
byte[] barr = (byte[]) intValue; Plain and simple Type-Conversion. Shouldn’t even be explicit. But since XML is overhyped in Java
no one seems to be concerned that anybody out there might use playn byte[] ’s as a fast, overhead-free Method to serialize his objects.

Python conveniently offers this: http://www.python.org/doc/current/lib/module-struct.html

If I’ve got time and anybody bothers I will go and publish my Byte-O-Rama.


Aug 11

My Notebook ist back from repair. This time everything went well. When I sent it in I removed, as usually, the HDD to prevent data-loss. This confused some n00b-technicians at ASUS, who were to incompetent to solve the error AND remember that no HDD was built in. A level 2 senior enginner took over the case, repaired the LCD and put in a new HDD. Thanx a lot, thats a good backup HDD!


Jun 8

During the last week I experienced some rather weird I/O behaviour on Huberts one-and-only harddisk. It’s been a 3Gigs Seagate Medalist 3221; yeah it’s been. Because now this is what I call a dead hard-drive. After I witnessed another I/O crash while accessing a website hosted on hubert, I today decided to replace it.

My first approach was:

  1. get new hard-drive
  2. create identical partition structure on new drive
  3. copy partition data
  4. have fun

Did you see the important thing I missed?
Right: I missed to install the bootloader. Silly me.

But it came worse and from behind. The first three points went good. But at reboot time the BIOS of my good’ol PII/BX-Mainboard wouldn’t accept the Hitachi drive. Great. Then the old Seagate just decided to die totally and everything I got from her wer some cryptic LILO errors where everything should run smooth and shiny.
Thus I struggled to get the old drive back one more time:

  • tried to get into the LILO error - useless
  • tried to boot Knoppix and reinstall LILO in hda’s MBR - no good!
    Knoppix prove to be unsuitable for such a task. It has a weird chroot behaviour.
  • tried an Ubuntu Install-CD for LILO recovery - LILO reinstallation succeeded, but at boot-time this only proved that LILO was not the source of the error.
  • I also tried to reconfigure the old HD in my BIOS, since I had to change everything for my new Hitachi-drive. - no succes with this either.

Well, anyone who knows me well enough, knows also that this is not the point where the Prakti resigns, isn’t it?
But then I was lucky too: found an old 10GB IBM HTTA drive, copied the data from the hitachi onto it, ran LILO and *tadaa* Hubert is back.

As a Summary:
The most-important commands today were: tar for copying consistently (cp would result in damaged symlinks) , mke2fs, cfdisk for partitioning, chroot, and of course mount. Everything was done from a shell from within a live-system of an Ubuntu-Install-CD.

Good Fight, Good Night.


Feb 21

Today I gave Ubuntu-Linux a spin on my spare-partition. I have to say: it really rocks. I have automagically a desktop to use with great overall functionality within half an hour. Just great. I backed up my data yesterday so today I can reformat my hard-disk and get Ubunto on my main drive.

Edit: Ubuntu is up and running,… but the ACPI-Problem remains. Got to rebuild a kernel tonight.

Edit2: ACPI-Problem is done. First I tried to go without rebuilding the kernel, since Ubuntu supports the inclusion of a new dsdt over the initrd-image, but this failed. Now the third kernel is in the oven to fix the last issues with the fglrx-module for my ati-card.
Next step will be the reconfiguration of xorg to use the fglrx driver. If this is done, I’m through and everything else is just cleanup.


Feb 14

Turing-Train-Terminal is a Model-Railroad functioning like a Finite-State-Machine.


Feb 8

Today I just gave the current beta-Live-CD of Ubuntu a short spin on my notebook. Runs quite impressive. I think I’m going to scoop around to get some more info on the overall distro-arch but since it is a debian derivate everything should be fine.
Maybe I’m going to swicht my Linux distro on my Laptop by May. Since then I’ve way too much to learn and backing up all that data from my laptop is quite difficult. Maybe I test-drive Hoary again on my notebooks 10Gig spare partition.
The stable warthy failed the installation due to a faulty sata-driver. Maybe the ubuntu-guys got it fixed in hoary.
Well,… we will see.


Feb 1

During the Weekend I was able to test out the net-code inside OpenTTD. The multiplayer is really neat. You can play versus, cooperative and in teams. Initially we played with each player owning a company, then moved over to team-mode with 3 players in a company. I’m curious about playing 2vs2 or 3vs3 or something like that.


Jan 30

In the wake of gettin back into Transport Tycoon Deluxe I found this.
Edit: Additional I found this to play all that stuff. Really Neat.

Got to get my midi chip working.


Jan 20

Today I’m going to clean up my code for a wizard-framework developed in Java. At the moment the code and design is somewhat messy. I think I will change some of the overall design into a factory-method-pattern and take a look at the result.

At the moment the API requires the using coder to inherit and instanciate two types of classes for each page
of the wizard.

With the factory-method-pattern this could be reduced to just one, by pushing one page-class inside the wizard itself and making it accessibe by a factory-method, so in the end the coder just has to instanciate and build his pages. Then he just has to add them to the wizard, et voilá.

Parallel to this I invented a similar framework for a preferences-dialogue. Maybe I can use the same shit to simplify some things her too.


Jan 18

The Gentoo-Community is amazing. Every question is asked and answered in a most professional way. You almost never have to ask yourself, as someone probably did it before. Just enter the right search-string and get an answer.

Sometimes I question the gentoo-concept of building everything yourself, but the community is really worth all the effort.

Btw: The problem I had was a missing libstdc++ during a compile, the answer was:

fix_libtool_files.sh 3.3.4

and I found the answer here.
Just Great :-D