Sunday 22 June 2014

The Tower Of Hanoi

The Tower Of Hanoi problem is perhaps the most frequently asked recursion-based problem in interviews and examinations. Perhaps the most succinct explanation of this problem can be found on Wikipedia.

PROBLEM STATEMENT :
A set of n disks is kept on a peg A such that each disk lies on top of a disk larger than itself. This set of disks is to be moved to another peg B, always maintaining the invariant that a smaller disk lies on a larger disk. We can use an auxiliary peg C for our aid.


ALGORITHM :
We use recursion to solve the problem. Consider an auxiliary function moveTower that accepts the parameters sourceTower, destTower, auxTower and numberOfDisks. This function moves numberOfDisks disks (<= n) from the sourceTower to the destTower using the auxTower. The recursive idea is that in order to move n disks from A to B using C as auxiliary, we first move the top n-1 disks from A to C using B as auxiliary, and then move the remaining disk (the largest) from A to B. The problem now reduces to moving n-1 disks from C to B using A as auxiliary. Thus, after each recursive call the problem gets smaller, so the recursion will terminate. The terminating condition is n = 1: a single disk can be moved in 1 move.

Algorithm moveTower (sourceTower, destTower, auxTower, numberOfDisks)
  1. if (numberOfDisks == 1) then move disk from sourceTower to destTower and exit //terminating condition
  2. else{
  3.      moveTower(sourceTower, auxTower, destTower, numberOfDisks - 1)
  4.      move disk from sourceTower to destTower
  5.      moveTower(auxTower, destTower, sourceTower, numberOfDisks - 1)
  6. } //if-else ends
The Java code that simulates the Tower Of Hanoi problem is as follows:
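A self-contained Java sketch of the recursion above (the `moves` list and the class name are illustrative additions so the sequence of moves can be printed and inspected):

```java
import java.util.ArrayList;
import java.util.List;

public class TowerOfHanoi {

    // Recursively moves numberOfDisks disks from sourceTower to destTower
    // using auxTower, recording each move as a string like "A -> B".
    static void moveTower(char sourceTower, char destTower, char auxTower,
                          int numberOfDisks, List<String> moves) {
        if (numberOfDisks == 1) {                  // terminating condition
            moves.add(sourceTower + " -> " + destTower);
            return;
        }
        moveTower(sourceTower, auxTower, destTower, numberOfDisks - 1, moves); // top n-1 disks to aux
        moves.add(sourceTower + " -> " + destTower);                           // largest disk to dest
        moveTower(auxTower, destTower, sourceTower, numberOfDisks - 1, moves); // n-1 disks onto dest
    }

    public static void main(String[] args) {
        List<String> moves = new ArrayList<>();
        moveTower('A', 'B', 'C', 3, moves);
        moves.forEach(System.out::println);        // 2^3 - 1 = 7 moves
    }
}
```

Running it with 3 disks prints the 7 moves, matching the 2^n - 1 count derived below.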

ANALYSIS :
If T(n) is the time required to transfer n disks from sourceTower to destTower, then by the above algorithm, T(n) is distributed into 2 transfers of  n-1 disks and a transfer of 1 disk. Thus,
                                                 T(n) = 2T(n-1) + 1, where T(1) = 1
But how many moves do we require to transfer n disks from A to B? We can prove by mathematical induction that the minimum number of moves required is 2^n - 1. The 3 steps of mathematical induction are :
  1. Base Case : For n = 1 i.e. if there is only 1 disk, we directly transfer the disk from A to B in 1 move. 2^1 - 1 = 1. Hence, the equation holds true for the base case.
  2. Assumption : We now assume that the equation holds true for all integers 1 to n-1.
  3. Induction : Since the relation holds true for integers from 1 to n-1, we now check the relation for n disks. To transfer n disks from A to B, we first move n-1 disks from A to C in 2^(n-1) - 1 moves, then transfer 1 disk from A to B in 1 move and lastly move n-1 disks from C to B in 2^(n-1) - 1 moves. Thus, the total number of moves is (2^(n-1) - 1) + (2^(n-1) - 1) + 1 = 2*2^(n-1) - 1 = 2^n - 1. Thus, we have "induced" the relation to all integers.
The Tower Of Hanoi problem is the best example where the advantages of using a recursive method far outweigh its disadvantages. While recursion may not be advisable for problems like computing the Fibonacci series, where the naive recursion recomputes the same subproblems over and over, it proves to be a boon in Tower Of Hanoi and is, in many respects, better than the iterative method.

Saturday 21 June 2014

Parentheses Check

PROBLEM STATEMENT :
A mathematical expression is given as input in a string. We need to check whether the string is properly parenthesized. This involves checking 2 conditions :

  1. Every opening parenthesis should have a matching closing parenthesis.
  2. The order of the opening and closing of parentheses should be correct i.e. the parenthesis opened last should be closed first.
A correct expression would be ( { [x+y*z] [x+y] } )
An incorrect expression would be something like { [x+y} ]

NOTE : We are not interested in the variable expression within the parentheses. As long as the expression is correctly parenthesized, we need not encumber ourselves with the underlying variables and their relations.

ALGORITHM : 
For rule no.1, we need to establish correspondence between the symbols i.e. a '{' must be closed by a '}' only; a '}' does not correspond to a '['. And since, according to rule no.2, the last parenthesis to be opened should be the first to be closed, we use a LIFO kind of data structure: a stack.

We scan the expression from left to right, character by character. Whenever we encounter an opening parenthesis, we push it onto the stack. Whenever we find a closing parenthesis, we compare it with the top of the stack: if the correspondence relation holds, we pop the stack; otherwise (or if the stack is empty) the expression is invalid. We ignore all other characters. At the end of the scan the stack must be empty; any leftover opening parentheses mean the expression is invalid.

The Java code that follows the above algorithm is given below:
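A Java sketch of the scan described above (the class name and the helper `matchingOpen` are illustrative; a `Deque` stands in for the stack):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ParenthesesCheck {

    // Returns true if every (, [ and { in expr is matched and correctly nested.
    static boolean checkForParentheses(String expr) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : expr.toCharArray()) {
            if (c == '(' || c == '[' || c == '{') {
                stack.push(c);                          // opening parenthesis: push
            } else if (c == ')' || c == ']' || c == '}') {
                if (stack.isEmpty() || stack.pop() != matchingOpen(c)) {
                    return false;                       // mismatch, or nothing to close
                }
            }                                           // all other characters ignored
        }
        return stack.isEmpty();                         // every opener must be closed
    }

    private static char matchingOpen(char close) {
        return close == ')' ? '(' : close == ']' ? '[' : '{';
    }

    public static void main(String[] args) {
        System.out.println(checkForParentheses("( { [x+y*z] [x+y] } )"));
        System.out.println(checkForParentheses("{ [x+y} ]"));
    }
}
```

The two examples from the problem statement come out true and false respectively.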
Each of the routines push, pop and checkForParentheses takes O(1) time. Thus, for an n-character long string, the main function takes O(n) time, ignoring the overheads of calling a function.

Thus we see another important application of the data structure stack. Parentheses check is an important routine in expression-checking as well as expression-evaluation.

Sunday 15 June 2014

The 3SUM Problem

Another very frequently-visited problem is "The 3SUM Problem", sometimes viewed as a special case of the Subset Sum Problem. This problem paves the way to many concepts in algorithmics such as sub-quadratic time problems, 3SUM-hard problems, etc.

INPUT :

A sequence (taken in as an array A) of n distinct integers, preferably whose absolute values are bounded by a polynomial p(n).

OUTPUT :
The combinations of 3 integers from the sequence whose sum equals 0 i.e. the numbers i, j and k such that i+j+k = 0. Note that the integers i, j, k have to be distinct and the combinations themselves have to be distinct. For example, i,j,k and k,j,i represent the same combination and therefore, are not distinct.


THE "BRUTE FORCE" METHOD :
The obvious and perhaps the most trivial method would be to scan the array with 3 nested loops in one direction (to ensure distinctness of the combinations) and check every possible combination of 3 integers for whether their sum is 0. For an n-input array, the number of combinations is C(n,3) = O(n^3). The following algorithm is an illustration of the above method.

Algorithm 3SUM (A, n)
  1. for i = 0 upto (n-3)
  2.        for j = i+1 upto (n-2)
  3.              for k = j+1 upto (n-1)
  4.                     if (A[i] + A[j] + A[k] == 0) then print A[i], A[j], A[k]
Thus, we see that the algorithm is nothing but 3 nested for loops giving a performance of O(n^3). Can we do any better?
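The brute-force scan, sketched in Java (the triple-collecting `threeSum` method and class name are illustrative wrappers around the three nested loops):

```java
import java.util.ArrayList;
import java.util.List;

public class ThreeSumBrute {

    // Visits every forward combination (i < j < k) exactly once, so each
    // distinct triple is reported exactly once. O(n^3) time.
    static List<int[]> threeSum(int[] A) {
        List<int[]> result = new ArrayList<>();
        int n = A.length;
        for (int i = 0; i < n - 2; i++)
            for (int j = i + 1; j < n - 1; j++)
                for (int k = j + 1; k < n; k++)
                    if (A[i] + A[j] + A[k] == 0)
                        result.add(new int[]{A[i], A[j], A[k]});
        return result;
    }

    public static void main(String[] args) {
        for (int[] t : threeSum(new int[]{-1, 0, 1, 2, -3}))
            System.out.println(t[0] + " " + t[1] + " " + t[2]);
    }
}
```

On the sample input {-1, 0, 1, 2, -3} it finds the two zero-sum triples {-1, 0, 1} and {1, 2, -3}.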

THE BINARY-SEARCH METHOD :
This method uses the idea that instead of traversing the array for 3 elements requiring 3 nested for loops, we can traverse the array only 2 times using the following simple manipulation:
                            if x+y+z = 0, then z = - (x+y)
Thus, if we choose a particular "x" and a "y", we just need to search the array for the "z". This searching can be done by an efficient search technique: Binary Search. This technique, however, requires the array to be sorted first, hence that will be our first step.

We assume a routine binarySearch(Array, low, high, key) which searches for "key" within the bounds "low" and "high" and returns the index of the key if found, or -1 otherwise.

Algorithm 3SUM PLUS (A, n)
  1. sort(A)
  2. for i = 0 upto (n-3)
  3.       for j = i+1 upto (n-2)
  4.            k = binarySearch(A, j+1, n-1, -(A[i] + A[j]) )
  5.            if (k != -1) then print A[i], A[j], A[k]
Searching only within A[j+1 .. n-1] keeps the three indices distinct and ensures each combination is counted only once.
The if statement is executed in O(1). We know that binarySearch has a performance of O(log n). However, it is called within 2 nested for loops i.e. C(n,2) = O(n^2) times, thus giving the routine 3SUM PLUS a performance of O(n^2 log n). Note that the method used for sorting can be anything from insertion sort (O(n^2)) to merge sort (O(n log n)), as both are asymptotically smaller than O(n^2 log n). However, can we do still better? (Obviously we can, or else this question wouldn't be here!)
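A Java sketch of 3SUM PLUS (class name illustrative; `binarySearch` follows the signature assumed above, and the search is restricted to A[j+1 .. n-1] so the three indices stay distinct):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ThreeSumBinarySearch {

    // binarySearch(A, low, high, key): index of key in sorted A[low..high], or -1.
    static int binarySearch(int[] A, int low, int high, int key) {
        while (low <= high) {
            int mid = (low + high) >>> 1;
            if (A[mid] == key) return mid;
            if (A[mid] < key) low = mid + 1; else high = mid - 1;
        }
        return -1;
    }

    // For each pair (i, j), binary-searches A[j+1..n-1] for -(A[i]+A[j]).
    // O(n^2 log n) time after an O(n log n) sort.
    static List<int[]> threeSum(int[] A) {
        int[] a = A.clone();
        Arrays.sort(a);                              // binary search needs sorted input
        List<int[]> result = new ArrayList<>();
        int n = a.length;
        for (int i = 0; i < n - 2; i++)
            for (int j = i + 1; j < n - 1; j++) {
                int k = binarySearch(a, j + 1, n - 1, -(a[i] + a[j]));
                if (k != -1) result.add(new int[]{a[i], a[j], a[k]});
            }
        return result;
    }

    public static void main(String[] args) {
        for (int[] t : threeSum(new int[]{-1, 0, 1, 2, -3}))
            System.out.println(t[0] + " " + t[1] + " " + t[2]);
    }
}
```

It reports the same two triples as the brute-force version on the sample input, only faster for large n.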

THE HASH MAP METHOD :
We continue with the above idea of traversing the array only twice instead of 3 times. We will see that the sorting technique used doesn't play a role in the final performance, so the only scope for improvement is in the search technique. We need a search technique faster than binary search; hash-based searching is one. Searches in a hash table take O(1) expected time. Therefore, we first put the entire array into a hash table, and then, inside the nested for loops, we search the hash table for the integer z = -(x+y).

Algorithm 3SUM PLUS PLUS (A, n)
  1. sort(A)
  2. for i = 0 upto (n-1)
  3.       hash.put(A[i], i)   // map each (distinct) value to its index
  4. for ends
  5. for i = 0 upto (n-3)
  6.       for j = i+1 upto (n-2)
  7.           k = hash.search(- (A[i] + A[j]) )
  8.           if (k != null and k > j) then print A[i], A[j], A[k]
  9. nested for ends
The check k > j keeps the three indices distinct and ensures each combination is printed only once.
Here again, we have 2 nested for loops giving O(n^2); the O(n) table-building pass and the sort are both dominated by this, so once more the sorting method doesn't affect the overall performance. The 3SUM PLUS PLUS routine takes O(n^2) expected time, which is asymptotically better than O(n^2 log n) and O(n^3).
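A Java sketch of the hash-map variant (class name illustrative; a `HashMap` from value to index plays the role of `hash`, and the `k > j` check enforces distinctness):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ThreeSumHash {

    // For every pair (i, j), looks up z = -(A[i]+A[j]) in O(1) expected time.
    // O(n^2) expected time overall.
    static List<int[]> threeSum(int[] A) {
        int[] a = A.clone();
        Arrays.sort(a);                          // so "index > j" orders each triple once
        Map<Integer, Integer> hash = new HashMap<>();
        for (int i = 0; i < a.length; i++)
            hash.put(a[i], i);                   // value -> index (values are distinct)
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < a.length - 2; i++)
            for (int j = i + 1; j < a.length - 1; j++) {
                Integer k = hash.get(-(a[i] + a[j]));
                if (k != null && k > j)          // distinct indices, counted once
                    result.add(new int[]{a[i], a[j], a[k]});
            }
        return result;
    }

    public static void main(String[] args) {
        for (int[] t : threeSum(new int[]{-1, 0, 1, 2, -3}))
            System.out.println(t[0] + " " + t[1] + " " + t[2]);
    }
}
```

Same two triples as before on the sample input, now with no logarithmic factor in the running time.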
Wikipedia offers another interesting method which also achieves O(n^2) without hash tables, by walking two pointers inward over the sorted array.

FUN FACT : Although it hasn't been successfully proved that the 3SUM problem can't be solved faster than O(n^2), no better algorithm has yet been found. Do you have it in you to find it? Leave us a comment!