Find out what are data structures and why are they so important. Understand the basic concepts of different data types with the help of examples.
‘Data structures’ is one of the most fundamental subjects in the field of computer science. It is imperative to have an in-depth knowledge of the subject, especially when you are in the programming / developmental domain. In this domain, you need to build efficient software and applications, and here, the in-depth knowledge of Data structures comes handy. But what is data structures? Why are Data structures so important? What are its applications, and what are the types of Data structures? Let’s find out!
What are Data Structures?
By definition, “In computer science, a data structure is a data organization, management and storage format that enables efficient access and modification.” Now let us talk about some big market players in the field of Information Technology- Google, Microsoft and Apple. These companies have terabytes of data at their disposal, which is accessed every second. These data have to be managed efficiently, and data structures come handy. Suppose we cannot locate a particular data among all these data in a short span of time, then there is absolutely no use of handling such a large amount of data right? This is why we need Data structures.
In simple words, Data structures can be defined as “the way in which data is stored on computer” or “organizing the data in memory.” Data structures are just a set of rules which tells us how to store data. Apart from making the data access efficient, Data structures also reduce the complexity in finding and modifying data. Arrays are the most basic example of Data structures, and we will discuss them later on in the same article.
Why data structures?
Data structures, as discussed above, is one of the most fundamental subjects of computer science. This is because it presents some significant objectives, such as:
- To identify and create useful mathematical entities and operations for the problem.
- It determines the representation of entities and implementation of problems.
- The data structure is also useful for data abstraction (which means hiding the data from unauthorized or third party sources) and analyzes the problem step by step. Apart from that, another important use of (perhaps the most important one) Data structures is to develop algorithms which can be used to find data.
Examples of the importance of Data structures
Let us talk about a dictionary. It is something almost everyone has come across once in a lifetime. Many of us still refer to dictionary even after the introduction of easily accessible internet and online dictionary. A good dictionary generally has tens of thousands of words with their meaning written over a thousand pages! Let us suppose we have to find the definition of a word ‘reputation’. How are you going to find out the meaning?
Firstly, you will open the page which has all the words starting with “re” and then search your word ‘reputation’ which is set in alphabetical order. This makes the search for your desired word much easier! So all thanks to the alphabetical order of the word set by the editors and publishers of the dictionary! So the publishers or the editors have structured or organized the words (or data) in a defined order.
Now let us suppose there is no such organization or arrangement of words in a recognized pattern. Word meanings are arranged in any random way! So how long do you think it will take to find the meaning of your desired word ‘reputation’? A couple of hours? Days? You never know!
This is a real-life example explaining the importance of arranging data in a systematic way.
Digital World Example
Let us take an example of Google Maps. I hope every one of us is aware of this fantastic GPS enabled application which helps us to track and navigate any location in the world! So how are we able to track these locations? This is because all the locations are arranged in a systematic way against their X and Y coordinates. So when we search for any particular location, let us say, ‘White House, USA’. The application will show you the navigation and location according to the coordinates stored in its system, that is, 38.8977° N, 77.0365° W. So all this is possible because of Data Structures! The programmers behind Google Maps were able to store the coordinates of the White House systematically and enabled you to access it.
Implementation of Data Structures
Data structures are basically dependent on the computer’s ability to obtain or store data at any place in the memory. These data can be specified by a pointer- which is a bit string and represents a memory address, which can be stored and modified by the program. The array and record data structures are concerned about computing the locations of the data items with various arithmetic operations. And, linked data structures are concerned with storing addresses or locations of data items inside the structure or the system itself.
In order to execute a data structure, the programmer should generally write a set of procedures which have the ability to create and modify instances of that particular system or the structure. One cannot analyze the competence of a data structure separately from the operations carried out.
What are the data types?
By definition, a data type is “an attribute of data which tells the compiler or the interpreter how the programmer intends to use the given data.” They are basically of the following types:
- Primitive: They are the basic building block of the data. (for example, Boolean, integer, float, char etc.)
- Composite: Any data type (e.g. array, string, struct) which is composed of primitives or composite types comes under this category. Let us talk about string in this category. It describes a kind of data structure which has a sequence of char characters and is referred to as being a composite type. However, the primary implementation of the string composite type is generally executed using array data structures.
- Abstract: This is the data type which is identified by its behaviour—for example, set, stack, graph, queue etc. The abstract data type (ABT) tells us how the data, associated with accurate data, is expected to behave.
Primitive and Non-Primitive Data Structures
We have different types of data structures, and they are classified in different ways. Let us talk about the main classification in this section: Primitive and Non-primitive data structure.
Primitive Data Structures
Primitive Data Structure is supported at the machine level. They can be used to make all the non-primitive data structures. These kinds of data structures are in necessary forms and have their behaviour and specifications pre-defined.
The examples of primitive data structures include integer, pointer and character.
It is worth noting that the pointers hold memory addresses or location of the data instead of holding the data value. Pointers are referred to as ‘reference’ data types. The task of keeping the data value lies in integers and characters.
Non-primitive Data Structures
Non-primitive data structures are derived from their primitive counterparts. The former type of data structures have no meaning and cannot be performed without the later.
The non-primitive data structures are further divided into the following categories:
Let us discuss these types one by one in detail:
An array is a limited number of data which is allocated contiguous (which share the same border) memory locations. Each element within this data type is accessed through typically a numerical and zero-based digit, which can also be called an index key. We can even say that an array is a homogeneous and contiguous collection of the same data type. The memory allocation technique is static meaning that once they are allocated memory space, it cannot be altered during run time. They are used to execute different data structures like vectors, matrices etc. Inserting and deleting data in array are generally complex since the elements are stored in consecutive memory allocations.
We discussed above that arrays are fundamentally finite in size/length (finite memory capacity). But there is still a concept of dynamic arrays (you must have come across with this term if you are dealing with high-level programming language). Dynamic Arrays can grow and resize to add more and more elements to it. But we should know that resizing arrays are quite complex and expensive as it includes copying the array to the exact amount among many other steps.
A file is a collection of various records (or data). This type of data structure is generally used to maintain or manage a massive amount of data which is not included in the primary storage of the system. Files enable us to operate, process, access, retrieve and even modify the data efficiently and quickly.
The lists type of data structures supports dynamic memory allocation. One can change the allocated memory space even in run time. Generally, Lists are classified into the following different categories:
- Linear lists
- Non- Linear lists
A linear list is the one in which the elements are stored in sequential order. Data insertion and deletion become easy in such data structures. They are further divided into two types:
Non-linear lists do not have elements stored in a simple structured manner. They are generally:
- Graphs (to represent or study networks)
- Trees (consisting of nodes connected in a particular arrangement which makes search operations and data retrieval easy)
We will discuss more these types in the latter part of the article.
Difference between Linear and Non-Linear Data Structure
As we know that data structure is a way of storing and logically implementing the data elements. The storage of data elements should be systematic in an orderly manner. Both the major types of data structures, primitive and non-primitive data structures are crucial things to know while programming. They play a significant role in storing and executing the data. The non-primitive data can be classified into two ways
|Linear data types||Non-Linear Data types|
|One can easily retrieve the data elements as they can be easily accessed.||Since the data elements in the case of non-linear data types share an inflexible relationship among each of them, it is quite difficult to retrieve or store data.|
|One can store or retrieve elements adjacent to each other. This is the reason why the retrieval of data in linear data types is easy.||Because of the strict relationship between the elements of this data type, it is not that easy to retrieve the data or even store it. However, one can do that by following some pre-defined patterns which define who the data is supposed to be added in that particular data type.|
|The data elements can be accessed in one run.||It is not possible to access the data elements in one run.|
|They are simple and easy to work with and understand the concept.||They are quite complicated, and working on these data types can prove to be quite tricky. You may require a sound knowledge of various fields like statistics and mathematics.|
|One of the disadvantages of linear data types is that memory utilization is inefficient.||The memory utilization is efficient in this case.|
|For example, Linked list, Stack, Queue etc||For example, Trees, Graphs etc|
Tree data type
The tree is a type of non-linear data type which can be represented by a hierarchal tree structure. Here the root values (tree) are linked with the subtrees (children) through a set of different linked nodes.
Each tree contains ‘nodes’ each of which has an associated value and is connected by a linear line which is called an ‘edge’. These lines have the function of representing the relationship between the nodes.
The topmost node is ‘root’, and any node with no children is called ‘leaf’. If in a case, when a node is connected with another node, the preceding node is referred to as ‘parent’ and the succeeding node (s) are called ‘child/children’.
The various types of the tree structure as follows:
- Binary Tree
- Binary Search Tree
- Red-Black Tree
- Weight-balanced Tree
- Abstract Syntax Tree
A graph is an abstract data type (ABT) which guides the execution of data structure by following the principles of graph theory. The Graph Data Structure is a type of non-linear data structure and consist of:
- Nodes: The different points on a graph. (This is also known as vertices)
- Edges: The lines which connect each node to the other.
I hope you found this guide useful. If so, do share it with others who are willing to learn about the different topics that we publish here on our blog. If you have any questions related to this article, feel free to ask us in the comments section.
And do not forget to subscribe to WTMatter!