How Deduplication Works
Deduplication works by examining the data stream as it arrives at the storage appliance, identifying blocks of data that are identical, and eliminating the redundant copies. When a duplicate block is found, the appliance stores a pointer to the original block instead of storing the block again, which removes, or "deduplicates," the redundant blocks from the volume. The key point is that deduplication is done at the block level, which removes far more redundant data than file-level deduplication, where only whole duplicate files are removed.
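The pointer mechanism described above can be illustrated with a minimal sketch. It is not how any particular appliance is implemented: the fixed 4 KB block size, the use of SHA-256 digests to recognize identical blocks, and the `BlockStore` class itself are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; real devices may differ


class BlockStore:
    """Toy block-level deduplicating store (hypothetical, for illustration)."""

    def __init__(self):
        # digest -> block bytes; each unique block is kept exactly once
        self.blocks = {}

    def write(self, data: bytes) -> list:
        """Store data and return its pointers (one digest per block)."""
        pointers = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                # First time this block is seen: store it
                self.blocks[digest] = block
            # Duplicate or not, the stream records only a pointer
            pointers.append(digest)
        return pointers

    def read(self, pointers) -> bytes:
        """Rebuild the original data by following the pointers."""
        return b"".join(self.blocks[d] for d in pointers)

    def stored_bytes(self) -> int:
        """Bytes physically held after deduplication."""
        return sum(len(b) for b in self.blocks.values())
```

Writing a stream made of three identical blocks followed by one different block would yield four pointers but only two stored blocks, and `read` would still return the original bytes.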
Data deduplication is especially powerful when it is applied to backup, since most backup data sets have a great deal of redundancy. The amount of redundancy will depend on the type of data being backed up, the backup methodology and the length of time the data is retained.
Example: consider backing up a large customer database that is updated with new orders throughout the day. With a typical backup application, you would normally have to back up and store the entire database each time. Even a file-level incremental backup will store the full database to disk again, because the database file has changed since the last run, so near-identical backup sets take up more and more disk space. With block-level deduplication, you can back up the same database to the device on two successive nights and only the blocks that have changed will be stored, because the device identifies the redundant blocks. Each redundant block is replaced by a pointer to the copy already on disk.
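The two-night scenario can be sketched in a few lines. Everything here is hypothetical: the "database" is just 100 synthetic 4 KB blocks, the second night differs by one block overwritten in place, and the `backup` helper, the block size, and the SHA-256 matching are assumptions rather than a description of a real product.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for illustration


def backup(store: dict, data: bytes):
    """Back up data into store; return (pointers, new bytes written)."""
    pointers, new_bytes = [], 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).digest()
        if digest not in store:
            # Only blocks not seen before consume disk space
            store[digest] = block
            new_bytes += len(block)
        pointers.append(digest)
    return pointers, new_bytes


# Night 1: a synthetic "database" of 100 distinct blocks
night1 = b"".join(i.to_bytes(4, "big") * (BLOCK_SIZE // 4) for i in range(100))

# Night 2: the same database with block 5 overwritten in place
changed = (999).to_bytes(4, "big") * (BLOCK_SIZE // 4)
night2 = night1[:5 * BLOCK_SIZE] + changed + night1[6 * BLOCK_SIZE:]

store = {}
_, written1 = backup(store, night1)  # stores all 100 blocks
_, written2 = backup(store, night2)  # stores only the changed block

print(written1, written2)  # 409600 4096
```

The first backup writes the full 409,600 bytes, while the second writes only the single changed 4,096-byte block. Note that this toy relies on fixed-size blocks and an in-place update; an insertion that shifts later bytes would change every following fixed-size block, so the savings shown here should not be read as typical.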