01 December 2020

Database Fragmentation and Advantages of Fragmentation in Distributed

Written by Wade Hamilton

Enterprise database management is a continuously evolving field in which new approaches to storing, maintaining, and processing data keep being tried out. Enterprise data can be stored in various ways; fragmentation is the approach of splitting a large database into smaller pieces and storing them on different computers. These pieces are called fragments, and the fragments are stored at different sites. Fragmentation can be done in different ways depending on your priorities and requirements.

Theoretically, fragments are the logical units of data stored at different sites in a distributed DBMS. Fragmentation has many advantages in enterprise database management, but it also has some disadvantages under specific requirements. Let us discuss fragmentation and its benefits and drawbacks in more detail below. Before that, we will look at four major reasons to consider fragmenting a database.

Better application usage

Enterprise database applications usually work with certain views of the data rather than with entire relations at once. So, in data distribution, it is more appropriate to use subsets of relations as the unit of distribution. Fragmentation enables this, and so application usage becomes better.

More efficiency

With fragmented data, a query can be served from the nearest site, because each fragment is stored where it is most frequently used. This improves speed and efficiency. In addition, data that the local applications never use need not be stored at that location, which avoids storage overhead.
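The idea of storing each fragment where it is used most can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the site names, fragment names, and the `allocate_fragments` helper are hypothetical, and a real distributed DBMS would weigh many more factors (update cost, replication, network latency).

```python
from collections import Counter

# Hypothetical access log: (site issuing the query, fragment it touched).
ACCESS_LOG = [
    ("london", "eu_customers"),
    ("london", "eu_customers"),
    ("new_york", "us_customers"),
    ("new_york", "eu_customers"),
    ("new_york", "us_customers"),
]


def allocate_fragments(access_log):
    """Place each fragment at the site that accesses it most often."""
    counts = {}
    for site, fragment in access_log:
        counts.setdefault(fragment, Counter())[site] += 1
    # most_common(1) returns [(site, hits)] for the busiest site.
    return {frag: c.most_common(1)[0][0] for frag, c in counts.items()}


allocation = allocate_fragments(ACCESS_LOG)
```

With this log, `eu_customers` is placed in London and `us_customers` in New York, so most queries are answered from local storage.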

Ensuring parallelism

As fragments are the units of distribution, a transaction can be divided into multiple sub-queries, each of which operates on a fragment. This increases concurrency in the system and allows transactions to execute in parallel, which can save a lot of time.
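A minimal sketch of this decomposition, assuming an in-memory stand-in for three sites: a global "total order amount" query is split into one sub-query per fragment, the sub-queries run concurrently, and the partial results are merged. The `FRAGMENTS` data and function names are hypothetical, and threads here only imitate what separate database servers would do.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fragments of an "orders" table, one list of rows per site.
FRAGMENTS = {
    "site_a": [{"order_id": 1, "amount": 120}, {"order_id": 2, "amount": 80}],
    "site_b": [{"order_id": 3, "amount": 200}],
    "site_c": [{"order_id": 4, "amount": 50}, {"order_id": 5, "amount": 75}],
}


def subquery_total(site):
    """Sub-query run against a single fragment: sum the amounts stored there."""
    return sum(row["amount"] for row in FRAGMENTS[site])


def global_total():
    """Run one sub-query per fragment concurrently and merge the partial sums."""
    with ThreadPoolExecutor(max_workers=len(FRAGMENTS)) as pool:
        partials = list(pool.map(subquery_total, FRAGMENTS))
    return sum(partials)


if __name__ == "__main__":
    print(global_total())
```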

Better security

As we have seen above, with fragmentation, data that is not needed by the local applications is not stored at that location. So, the database remains more secure, as the data is not available to users at unauthorized locations. A secure database is an essential requirement for any business today, and it will aid your brand in many ways.

Along with these advantages, fragmentation has some disadvantages too. One is that global applications that need to gather data from fragments at various locations may run slower. Another is integrity: integrity control may be difficult because the data and its functional dependencies are spread across various sites. To analyze the scope of database fragmentation and its benefits or challenges in your specific use case, you may rely on the consulting services provided by RemoteDBA.

Types of fragmentation

Data fragmentation can be of three types: vertical, horizontal, and hybrid (a combination of vertical and horizontal). Horizontal fragmentation can be further classified into primary horizontal and derived horizontal fragmentation. Fragmentation must also be done in such a way that the original table can be reconstructed from the fragments when needed. This property, called re-constructiveness, is essential for a fragmented database, without which it is deemed nonfunctional. Let us look at the types of fragmentation in detail.

Vertical fragmentation

In this approach, the columns (fields) of a table are grouped into different fragments. To ensure re-constructiveness, each vertical fragment needs to include the primary key of the table. Vertical fragmentation is usually used for enforcing data privacy.
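To illustrate, here is a small Python sketch that splits a table by columns and rebuilds it by joining on the primary key. The `EMPLOYEES` table, its column names, and both helper functions are made up for this example; `emp_id` is assumed to be the primary key.

```python
# Hypothetical employee table; "emp_id" is assumed to be the primary key.
EMPLOYEES = [
    {"emp_id": 1, "name": "Asha", "dept": "Sales", "salary": 52000},
    {"emp_id": 2, "name": "Ben", "dept": "IT", "salary": 61000},
]


def vertical_fragment(rows, key, columns):
    """Project each row onto the primary key plus the requested columns."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


def rebuild_from_vertical(frag_a, frag_b, key):
    """Rebuild the original rows by joining two fragments on the primary key."""
    index_b = {row[key]: row for row in frag_b}
    return [{**row, **index_b[row[key]]} for row in frag_a]


# The salary column is kept apart from the general directory data.
public = vertical_fragment(EMPLOYEES, "emp_id", ["name", "dept"])
private = vertical_fragment(EMPLOYEES, "emp_id", ["salary"])
```

Because both fragments carry `emp_id`, `rebuild_from_vertical(public, private, "emp_id")` gives back the original rows. Without the key in each fragment, there would be no reliable way to match the halves of a row.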

Horizontal fragmentation

Horizontal fragmentation groups the tuples (rows) of a table according to the values of one or more of its fields. Like vertical fragmentation, it needs to conform to the re-constructiveness rule: each horizontal fragment needs to have all the columns of the parent table.
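As a rough sketch, the following Python splits a table by the value of one field and rebuilds it by taking the union of the fragments. The `CUSTOMERS` table, the `region` field, and the helper names are hypothetical.

```python
# Hypothetical customer table, fragmented by the value of "region".
CUSTOMERS = [
    {"cust_id": 1, "name": "Asha", "region": "EU"},
    {"cust_id": 2, "name": "Ben", "region": "US"},
    {"cust_id": 3, "name": "Chen", "region": "EU"},
]


def horizontal_fragment(rows, field):
    """Group whole rows (all columns kept) by the value of one field."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[field], []).append(row)
    return fragments


def rebuild_from_horizontal(fragments, key):
    """Rebuild the table as the union of all fragments, ordered by key."""
    merged = [row for rows in fragments.values() for row in rows]
    return sorted(merged, key=lambda row: row[key])


by_region = horizontal_fragment(CUSTOMERS, "region")
```

Here `by_region["EU"]` holds customers 1 and 3, `by_region["US"]` holds customer 2, and `rebuild_from_horizontal(by_region, "cust_id")` returns the original table.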

Hybrid fragmentation

As we have already seen above, hybrid fragmentation is a combination of vertical and horizontal fragmentation. It is considered a highly flexible approach, as it can generate fragments containing only minimal extraneous data. However, ensuring re-constructiveness is often difficult with a hybrid model. Hybrid fragmentation is usually done in one of two ways:

In the first case, a set of horizontal fragments is generated, and then vertical fragments are generated from one or more of the horizontal fragments.
In the second case, you first generate the set of vertical fragments and then generate horizontal fragments from one or more of the vertical fragments.
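The first of these two orders (horizontal, then vertical) can be sketched in Python as follows. The `ACCOUNTS` table, the choice of `branch` as the split field, the column grouping, and all function names are assumptions made for illustration; `acct_id` is assumed to be the primary key.

```python
# Hypothetical accounts table; "acct_id" is assumed to be the primary key.
ACCOUNTS = [
    {"acct_id": 1, "branch": "North", "owner": "Asha", "balance": 900},
    {"acct_id": 2, "branch": "South", "owner": "Ben", "balance": 150},
    {"acct_id": 3, "branch": "North", "owner": "Chen", "balance": 420},
]
KEY = "acct_id"


def split_rows(rows, field):
    """Horizontal step: group whole rows by the value of one field."""
    groups = {}
    for row in rows:
        groups.setdefault(row[field], []).append(row)
    return groups


def split_columns(rows, column_groups):
    """Vertical step: one fragment per column group, each carrying the key."""
    return [
        [{c: row[c] for c in (KEY, *cols)} for row in rows]
        for cols in column_groups
    ]


def hybrid_fragment(rows):
    """Horizontal first (by branch), then vertical within each branch."""
    return {
        branch: split_columns(group, [["branch", "owner"], ["balance"]])
        for branch, group in split_rows(rows, "branch").items()
    }


def rebuild(fragments):
    """Join the vertical pieces on the key, then union across branches."""
    rows = []
    for left, right in fragments.values():
        lookup = {r[KEY]: r for r in right}
        rows.extend({**l, **lookup[l[KEY]]} for l in left)
    return sorted(rows, key=lambda r: r[KEY])


fragments = hybrid_fragment(ACCOUNTS)
```

Reconstruction has to undo the steps in reverse order, joining first and then taking the union, which hints at why re-constructiveness takes more care in the hybrid model.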

As we have seen above, the major benefit of fragmentation is increased speed, as the data is stored closer to the site of access. The overall efficiency of the database system improves with fragmentation. Local query optimization techniques are sufficient for most queries, as the data is available locally. Data that is irrelevant to a site is not stored there, which also helps protect the privacy and security of the database.

However, as global applications need to collect data from different sites, their speed may be compromised considerably. With recursive fragmentation, reconstruction may be difficult if it is ever needed; even when possible, it may be highly expensive and time-consuming. The database may also become unavailable after a failure or disaster if there are no backup copies of the data at other sites. Both horizontal and vertical fragmentation have their own advantages and disadvantages depending on the use case.

So, database fragmentation has both advantages and disadvantages, and the balance between them varies with your enterprise database objectives. As Zaki Ameer mentioned, you need to do a thorough analysis of your requirements to identify whether data fragmentation is ideal for your database in light of your location-based enterprise application usage.