Compilation stages and GCC command utility in depth

This tutorial explains each compilation stage of a C program in detail and answers several questions along the way, such as: Why are assembly instructions lower level than C? Did you know the gcc compiler can be invoked by a command other than gcc? How can you access the output file generated by each compilation stage?

To read this document in Hindi, click here!

Steps involved in C program execution.

  1. Program Creation
  2. Program Compilation
  3. Program Execution

Program Creation.

Program creation is the step where we write the source code.

In Linux, there are many editors to choose from, such as vim, nano, Geany, Sublime Text, Eclipse, and many more.

Of these editors, we are going to use vim. The previous lecture covered vim in depth with all the necessary commands. For a revision of vim, click here.


Program Compilation.

After program creation, the next step is program compilation, in which the compiler converts the human-readable (text) code into machine-readable (binary) code.

For compilation we use a compiler known as GCC, which is installed or readily available on most Linux systems.

Extra info: 

The terminal prompt has two modes.

  1. ‘$’: Regular user.
  2. ‘#’: Root user.

In Windows, this is usually referred to as Administrator mode, while in Linux/UNIX it is called root.

gcc command

gcc stands for “GNU Compiler Collection”, which includes compilers for C, C++, Objective-C, and many more languages. You can read more about it here!

In short, gcc is used to compile source files.

NOTE: On many systems the compiler can also be invoked with the cc command.

// This is a prog.c file created in vim editor.

#include <stdio.h>

int main(void)
{
     printf("Welcome to Electronics1010.com \n");

     return 0;
}

Syntax:

$ gcc <file name>

Eg:

$ gcc prog.c

Or

$ cc <file name>

Eg:

$ cc prog.c

After typing the command in the terminal, press Enter.


Program execution

When the gcc command is run on prog.c, it creates an executable file.

NOTE: By default, the executable file is named “a.out”.

a.out contains the binary equivalent of the file we gave (here prog.c).

[Image: compiling a C file produces an executable file]

Every gcc compilation without a custom output name produces an executable file named a.out, and it replaces the previous a.out in that folder. In short, the folder will contain only one such executable file.

[Image: running gcc multiple times produces an executable file with the same name]

NOTE: Almost all commands are executable files.

Eg.

  1. date  :   Typically found in the /bin folder.
  2. cal    :   Typically found under the /usr folder (in /usr/bin).

[Image: date and cal command execution in Linux]

Now let’s try to execute our newly built executable file.

$ a.out

[Image: direct execution of a.out is not possible; it throws an error]

Oh no! The terminal threw the error “command not found: a.out”.

But why? The shell looks for commands only in the directories listed in its PATH variable, and the current directory is normally not one of them. So, let’s try again by putting the path of the current directory in front of a.out.

/* In my case the path looks like the one below; you can check your current directory path with the pwd command. */

$ /Users/--------/Documents/Tutorial/a.out

[Image: execution with the full path]

Writing the full path every time is tedious. As a shorthand for the current directory path, we can use “.”


NOTE: “.” is a link to the folder itself (the current folder path) and “..” is a link to its parent (the path one folder back).

So, from whichever directory contains a.out, we can execute it as shown below.

$ ./a.out

[Image: the most convenient way to execute the file]

What if we want an executable file with a custom name?

For that, the “-o” option of the gcc command is used.

NOTE: -o is a hyphen followed by the lowercase letter o.

$ gcc <file name> -o <executable file name>[.<extension, if wanted>]

Eg.

$ gcc prog.c -o prog.exe
$ gcc prog.c -o prog

Compilation stages of C language

Every program undergoes the following process.

[Image: compilation stages of C language with options and commands]

Our source code is shown in the image below. The file’s name is “prog.c”.

[Image: source code for the example]

Preprocessor stage

The source code we wrote in the text editor is given as the input; its file extension is “.c”.

The output of the preprocessor has the “.i” extension and is pure C code.

Pure C code: the code that is passed on for translation, that is, the code written by the developer (prog.c) with all preprocessor directives already processed.

The preprocessor expands the header files included in the source code and sends the result to the translator.

The option for the preprocessor stage is “-E”, which tells gcc to stop after preprocessing.

Syntax:

$ gcc -E <input file with .c extension> -o <output file with .i extension>

Eg.

$ gcc -E prog.c -o prog.i

[Image: expanded C code produced by the preprocessor]

This is how the expanded source code looks. The preprocessor expands the header files and performs all macro substitutions, which is why the file is much larger than the original. The above image is the “prog.i” file.
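To see macro substitution on a smaller scale, here is a minimal sketch. The names SIDE, SQUARE, area_of_square, and default_area are hypothetical, chosen only for illustration; they are not part of prog.c. If you save it in a file of your own and run gcc -E on it, the macro names should be replaced by their definitions in the output.

```c
/* Hypothetical example file, not part of prog.c. */

/* Object-like macro: every later use of SIDE is replaced by 4. */
#define SIDE 4

/* Function-like macro: SQUARE(x) is replaced by ((x) * (x)). */
#define SQUARE(x) ((x) * (x))

/* After preprocessing, the return statement below reads:
 *     return ((side) * (side));
 * The macro name no longer appears in the .i file. */
int area_of_square(int side)
{
    return SQUARE(side);
}

/* After preprocessing, the return statement below reads:
 *     return area_of_square(4); */
int default_area(void)
{
    return area_of_square(SIDE);
}
```

This file deliberately has no main function, so it can go through the -E, -S, and -c stages, but it cannot be linked into an executable on its own.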


Translator / compiler stage

The translator translates the given pure C code into the machine’s assembly language.

The input file extension of the translator is “.i”, and it generates a file with the extension “.s”.

The option for the translator stage is “-S”, which tells gcc to stop after generating assembly.

Syntax:

$ gcc -S <input file with .i extension> -o <output file with .s extension>

Eg.

$ gcc -S prog.i -o prog.s

[Image: assembly code generated by the compiler/translator]

This is how the translator renders the code in assembly language. The above image is the “prog.s” file.
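As a small illustration of what the translator works on, consider the function below. The name multiply_add is hypothetical and used only for this sketch. Running gcc -S on a file containing it produces assembly for the machine you compile on; the exact instructions differ between processors and compiler versions, so no assembly output is reproduced here.

```c
/* Hypothetical example: one line of C arithmetic. */
int multiply_add(int a, int b, int c)
{
    /* A single C statement. In the .s file it typically becomes
     * several assembly instructions, and which ones are used
     * depends on the target processor. */
    return a * b + c;
}
```

Comparing the one-line C statement with the lines generated for it in the “.s” file is a quick way to see how much detail the translator fills in.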


Assembler stage

The assembly language code generated by the translator is converted into an object file by the assembler.

On DOS systems an object file has the “.obj” extension, and on UNIX systems the object file extension is “.o”.

The input file extension is “.s” and the output file extension is “.o”.

The option for the assembler stage is “-c”, which tells gcc to stop after producing the object file, without linking.

Syntax:

$ gcc -c <input file with .s extension> -o <output file with .o extension>

Eg.

$ gcc -c prog.s -o prog.o

[Image: the object file produced by the assembler]

This is how an object file looks. The above image is the “prog.o” file.


Linker stage

Generally, all programs written in C use library functions.

Library functions are precompiled, and their object code is stored in library files with the “.lib” or “.a” extension.

So, the task of the linker is to combine the object files of our source code with the object code of the library functions and any other functions it uses.

The input of the linker is an object file, and its output is an executable file.

On DOS systems, the executable file takes the source code filename with the “.exe” extension. On UNIX, the default executable file name is “a.out”.

The linker stage has no dedicated option; running gcc on an object file links it.

Syntax:

$ gcc <input file with .o extension> -o <filename as per user want>

Eg.

$ gcc prog.o -o prog

[Image: the executable file]

This is how an executable file looks. The above image is the prog executable file.
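The sketch below shows why the linker is needed even for a tiny program. The helper print_welcome is a hypothetical name used only for illustration. The header stdio.h only declares printf; the compiled body of printf lives in the C library, and the linker is what connects the call to it.

```c
#include <stdio.h>  /* declares printf; does not contain its code */

/* Hypothetical helper. Returns the number of characters printed. */
int print_welcome(const char *name)
{
    /* In the object file this call is an undefined reference to
     * printf. The linker resolves it against the C library. */
    return printf("Welcome, %s\n", name);
}
```

Because this file has no main function, gcc -c can turn it into an object file, but linking it alone fails until it is combined with an object file that defines main.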


[Image: compilation stages with results]

This is the final result after compiling stage by stage.

IMP Question:

Assembly instructions are lower level than C. Why?

Solution: Assembly language is machine specific, so it is closer to machine language than C is. (The machine cannot execute assembly language directly; it is only close to machine language and must still be converted by the assembler.)


Assignment

  1. Revise the basics of Compilation stages. Click here!
