Skip to content

C do’s and don’ts (mostly don’ts)

This past Friday, I received a surprising email, from a former co-worker, it’s been a year or more since I left that particular company (wont name, names). We were in different departments, but were by fate put on the same project.

This particular project was (is) of the utmost importance for that company, and in his development team he’s the only one that was working on this specific application (a server). To the suffering of all the involved in his department this particular piece of code had to be written in C, and there was just no way around it for several technical and design reasons that are not important at this moment.

I was in charge of testing his application (not unit testing), I had to performance test his application, to do this, one needs to know the architecture, and all that the developer needs to know to develop it, and more. I don’t have to tell you that this was a painful experience that mostly drove me crazy.

Just seeing who the email was from and subject surprised me to no end. As it was, they were having some issues, more to the point he was having some issues –as we were friendly, which basically means he was the only person I didn’t tell to f..k off. He asked for my number which I gave to him, and a few minutes later, his boss (whom I would have gladly frag’d) was on the line, telling me, how my NDA still applies and what not… at this point I just calmly stated you called me. What do you want?

The server has been having issues, and they have to reboot it every 72 hours or so… at this moment they have put the old code base (which was the one I worked with) back in production, but need the new functionality ASAP, and basically wanted me to check it out. Now, I was curious, I have that problem, I’m too curious so I said yes, sure.

They sent me the vmstat of the last 72 hours of the server, along with several log files, and after a cursory glance, it was obvious there were memory leaks. This they already knew (they are not dumb) but couldn’t find the spots where they were happening, and it’s here where the story gets interesting and where we might learn a few things, so this doesn’t happen to you.

First a disclaimer, this particular developer is quite good, and diligent. Also I’m not as good or diligent, and I don’t know much more than anybody else, but I have had the grand advantage of working with people that do know more than me, and that are willing to show me, so I learn and get better, this is my way of giving back, and saying thanks to all of them.

Malloc

The first issue when I saw the code, was the scheme he was using for memory allocation. I believe that with regards to memory allocation, if there is a failure allocating memory, aborting right then and there is the right thing to do, of course there are exceptions, but for most situations this is ideal, a well done policy, is also something to look into, for this type of scenarios.

His scheme was like this:

  • Allocate a good chunk of memory, that will be shared among threads.
  • Each thread allocates the memory needed to do it’s operation, write the results to the shared chunk, and clean after itself.

I wouldn’t do it this way, but this scheme works as long as everyone plays nice. As it happens all threads are playing nice, the problem lies with the allocation done internally by the threads.

He is over writing the pointer that malloc returns, and while he is freeing it, the portion returned by malloc is never free(), this goes on for all the threads.

Some of you might be thinking What? Over writing? How, why? Well the thing is that he didn’t noticed or know he was doing it, which is not good, but it’s a common mistake with C programmers (newbies, and not so newbies).

Another nice little side effect, is that every time, a thread wrote to the shared memory chunk, this was sent out to the appropriate interfacing systems, upon a successful completion, the memory chunk was de-allocated and the re-allocated, meaning he free() and then malloc() again and again. Which also brought the nifty problem, that if a thread was waiting in place to write to the chunk, the reference it had to the chunk had gone bye bye, so to avoid a segmentation fault, he worked around this problem in a very creatively unnecessary way (that I would rather soon forget).

Don’ts:

  • Do not over write (change where it points) the pointer returned by malloc(), don’t do it ever.
  • Wrap your malloc() call in a way that it checks (once is enough) no out of memory errors, if there is exit immediately.
  • Don’t blindly accept that the arguments to your methods are what you are expecting, or valid data.
  • When returning an error state from a function, for God’s sake, don’t abuse NULL!

Do’s:

  • Learn how to use lint.
  • Use lint often.
  • Run lint, after every compilation, add it to the build.
  • Check and double check all of lint’s warning and errors, fix them.
  • Run lint again after fixing the errors.
  • Read your code, go thru it in your head, or which ever way you prefer.

I’m in agreement with Thomas Ptacek. If my friend would’ve done that, this problem would have surfaced in the pre-production stages, because the system would have crashed earlier, enabling them to find the culprit much sooner.

There are other things that contributed to the problems, but the good news is after an hour of going thru the code, the system is back up, and it’s been over 72 hours and it hasn’t crashed, and it’s memory consumption is minimal as it should’ve been from the start.

Update:

Forgot to mention I provided them with a nice little Ring buffer implementation that captures the steps/states of the threads, from beginning to end, in case of an error, it dumps it’s content to disk.

That way they will be able to pin point more or less what triggered the crash, where in the code this happen, and the time this happened.

For more info on ring buffers, check out this links:

2 Comments

  1. So your NDA still applies? Is mine still valid too? My, oh my! We’re screwed! :)

    Posted on 29-Apr-08 at 10:37 am | Permalink
  2. Read your contract. You still have it right?

    Posted on 06-May-08 at 7:43 pm | Permalink

Post a Comment

Your email is never published nor shared. Required fields are marked *
*
*