How Oracle database does instance recovery after failures?


Oracle Database Recovery



INSTANCE RECOVERY – Oracle database have inherit feature of recovering from instance failures automatically. SMON is the background process which plays a key role in making this possible. Though this is an automatic process that runs after the instance faces a failure, it is very important for every DBAto understand how is it made possible. In this blog, we are going to understand the same with the help of a flow diagram and a demonstration.
Instance recovery is the way of bringing back the transactions which are not yet written to the disk by DBWR at the moment of instance failure. Instance failures are possible in multiple ways: hardware issues, power glitches, network outages, abnormal instance shutdowns (SHUTDOWN ABORT). Database startup after any of these failures will automatically trigger the instance recovery. Let us now get a clear understanding of this process with the help of the flow diagram and with hands-on activity as well. This exercise would similar to the one we did in “Why should we configure limits.conf for Oracle database?” blog.
Recovery concepts:
27_2
Consider the image above, let us get into details of this flow.
  1. At 9:00 AM consider your database is working normally.
  2. At 9:10 AM think of database running 30 transactions in which 20 are committed and 10 are uncommitted, they are yet to be decided by user process to commit or rollback.
  3. While this transaction is still running, let us consider that instance is terminated at 9:11 AM.
  4. Now as we bring up the instance, a perfect recovery is the one that makes sure 20 transactions are not lost and 10 uncommitted transactions are lost. Oracle does it perfectly. Hoping so, startup up the instance normally.
  5. Once database mounts, then the initial phases of recovery begin.
  6. With the help of CKPT background process, instance recognizes any loss of transactions with the compatibility checks of SCN numbers between control files and data files.
  7. All the transactions which are lost without updated to the datafiles are re-ran with the help of REDOWhich means that all 30 transactions are ran as on 9:10 AM now. This action is called as ROLL FORWARD.
  8. Database then checks the accessibility of UNDO segments and opens the database.
  9. On opening the database, now the 10 transactions which are not committed are rolled back making the database consistent with 20 committed transactions with the help of UNDO This action is called as ROLL BACK.
  10. Now that instance recovery is completed, Database is not recovered with 20 committed transactions. But yes, blocks would be updated into datafiles asynchronously by DBWR as usual.
We will now see the demonstration of the same to relate it with the reality.
  • Oracle database version used: 12.1.0.2
  • Host machine: OEL
  • Database: NON-CDB
  1. Database is working normally now.
Transactions are always running the database in the SYSTEM and SYSAUX tablesapces. We do not need to run any as of now.
  1. Let us terminate the instance now.
  1. Let us now startup the database and monitor the alert log file. Below is the important output of the alert log file.

Alert log clearly says that REDO has been applied with the transactions which are not committed when instance terminated.
  1. As soon as the instance is opened, roll back operation is performed and below is the output of the same.

Comments