Tech Savy: June 2007

Friday, June 29, 2007

Speed Switch v/s if Else

Speed Test: Switch vs If-Else-If
The .NET framework and the C# language provide two methods for conditional processing where multiple discrete values can be selected from. The switch statement is less flexible than the if-else-if ladder but is generally considered to be more efficient.

Test Description
Purpose
This test was designed to compare the execution time of the switch statement and the if-else-if ladder code structure when used to select a match from a discrete series of values. The if-else-if ladder provides much greater flexibility than is used by the test but, when being used to provide matching functionality to switch, this does not need to be considered. The two conditional processing structures are described in the article, C# Program Flow Control: Conditional Processing.

Test Process Functionality
It was important that the speed test for the two statements provided exactly the same functionality in both cases. For each test, an integer value between zero and nineteen was generated. This integer value was then compared against nineteen possible selections using if commands and case statements. Where the integer value generated was nineteen, this was not matched and instead processed using the final else statement or the switch command's default case.

In many programming situations, a selection of this kind is very likely to generate a match. However, for completeness, a second test was performed. This test followed the same functionality but using an integer value that did not match any of the possible values; it therefore always used the final else statement or the switch command's default case.

Looping
The speed of execution of a large if-else-if ladder or select statement is too fast to me accurately measured. To make measuring possible and to reduce anomalies, a loop was constructed and the test code executed repeatedly. For each test, the loop completed one billion (1,000,000,000) iterations.

The looping function for the first test was used to generate the integer value to match. For each iteration in a for loop, the loop control variable was used with the modulus operator to create a comparison value between zero and nineteen.

In addition to executing the test with value-matching and the test without value-matching, the loop code was executed with no internal test code. Two versions of empty loop were used, one with the number generation code and one without. These two tests provided baseline timings that could be subtracted from the actual test times.

Timing
The timing of the tests was controlled automatically using the Stopwatch class. Each test was performed repeatedly and the mean average of the results calculated.

Code
Click this link to download the code for the class used to perform the tests.

Test Conditions
Hardware
The test results included below are those produced using an Athlon64 3200+ with 4GB RAM. These tests are indicative of further relative test results that were carried out on a variety of equipment including:

IBM ThinkPad R51 notebook with a 1.6GHz processor and 2GB RAM
JVC MiniNote notebook with a 1GHz processor and 768MB RAM
The tests were executed using three operating systems, each with the latest service packs and patches. These were:

Windows XP
Windows Server 2003 R2
Windows Vista Ultimate
In each test, the software was compiled as a .NET framework 2.0 console application in three configurations:

Compiled for 32-bit processors only using Visual Studio 2005 Team Edition
Compiled for 32-bit or 64-bit processors using Visual Studio 2005 Team Edition
Compiled using Visual C# 2005 Express Edition
Results
Raw Results
This table shows the timings for the empty loop, if statement and switch statement for one billion iterations rounded to the nearest millisecond. The two columns show the results for the loops where matching of the comparison value occurs and for the non-matching tests.

Matching Non-Matching
Empty Loop 27.3s 5.0s
Switch Statement 43.0s 5.0s
If Statement 48.0s 5.1s

Adjusted Results
This second table shows the results for the two key test types, adjusted to remove the delay created by the looping mechanism.

Matching Non-Matching
Switch Statement 15.7s 0.0s
If Statement 20.7s 0.1s

Conclusion
The results show that the switch statement is faster to execute than the if-else-if ladder. This is due to the compiler's ability to optimise the switch statement. In the case of the if-else-if ladder, the code must process each if statement in the order determined by the programmer. However, because each case within a switch statement does not rely on earlier cases, the compiler is able to re-order the testing in such a way as to provide the fastest execution.

Friday, June 22, 2007

Exception Handling

C# Exception Handling

What is an Exception?
An exception is an error or failure that occurs when a program is executing. Generally, an exception describes an event that was unexpected. For example, an exception will occur if a program requests more memory than the operating system can provide. This error is known as an Out of Memory Exception.

The Exception Class Hierarchy
When an exception occurs normal processing of the code ceases immediately. An object is then generated that contains information relating to the error. This object is an instance of an Exception class. The most basic of the exception classes is System.Exception, the Exception class defined in the System namespace.

The exception classes are organised into a hierarchy with Exception at the top. Beneath the basic Exception class are two further classes, SystemException and ApplicationException, both in the System Namespace. The SystemException class has further derived subclasses, each representing a specific type of exception that can be raised in response to a system error. The ApplicationException class is a base class from which custom application exceptions may be derived.

NB: Subclass and base class are object-oriented programming terms. Object oriented programming is beyond the scope of the C# Fundamentals tutorial and will be described in a future series of articles.

System Exceptions
There are many system exceptions that may be raised, or thrown, when a problem occurs. These cause the generation of an object based on its own specialised exception class derived from SystemException. Examples include:

ArithmeticException. Thrown when an error occurs during an arithmetic operation or during casting or converting.
DivideByZeroException. Thrown when an attempt to divide a value by zero occurs. The DivideByZeroException is a more specialised version of ArithmeticException.
OverflowException. Thrown when an error occurs during an arithmetic operation or during casting or converting because the resultant value is too large or too small. The OverflowException is derived from ArithmeticException.
OutOfMemoryException. Thrown when the available memory is insufficient to continue execution of a program.
The above list gives a small sample of some of the exception classes within the hierarchy. A much more comprehensive list can be found in the System Exceptions Hierarchy page of Microsoft's MSDN web site.

Application Exceptions
Application exceptions are defined by the programmer and can be quite generic, for example WordProcessorException, or very specialised, such as LeftMarginTooSmallException. Of course, neither of these exceptions exists within the .NET framework; they must be created before use. For this reason, application exceptions will not be discussed in this article. Instead they will be described in the next article, which investigates programmatically throwing both standard and custom exceptions.

Handling Exceptions
Unhandled Exceptions
When an exception occurs the program flow for the executing method is immediately interrupted. If the exception is not handled explicitly, the method exits and the exception is propagated back to the calling function. This calling method then has the opportunity to handle the error. This process continues until the exception is handled by the program or, if never handled, it reaches the C# runtime system.

An unhandled exception that reaches the C# runtime system causes the immediate, abnormal termination of the program. This can be a problem as the exception is reported to the user as a message or dialog box containing standard information and technical details that may be misunderstood. During debugging this may be useful but in a production system it is generally considered to be unacceptable. It can also permit the user to attempt to continue running a program that, due to errors, has become unstable; note the Continue button in the error dialog box below:

The Basic Try / Catch Block
C# provides a code structure known as the try / catch block that permits the handling of exceptions. A basic try / catch block has two elements. The try section contains a series of commands to be executed, held within the brace characters { and }. The catch section contains code to execute should an exception occur during processing of the try section. The basic syntax is as follows:

try
{
// commands to execute whilst checking for exceptions
}
catch
{
// commands to execute if an exception occurs
}When using this basic exception capturing syntax, any exception that occurs causes the code in the try block to be left and the code in the catch block to be executed. The catch block code can be used for various purposes including graceful recovery from the error, logging or reporting of the details of the problem, and freeing up resources such as database connections or open files. Once the catch block has finished executing, or if no exception occurs within the try block, the program continues with the next statement after the try / catch structure.

The following example attempts to divide a value by zero causing an exception to be thrown. In this case, a simple error message is reported to the user and the calculated value is set to the highest possible integer value.

static void Main(string[] args)
{
int value = 50;
int divisor = 0;
int calculated;

try
{
calculated = value / divisor;
}
catch
{
Console.WriteLine("An error occurred during division.");
calculated = int.MaxValue;
}

Console.WriteLine("Result = {0}", calculated);
}

/* OUTPUT

An error occurred during division.
Result = 2147483647

*/Extracting Exception Information
As described earlier in this article, the throwing of an exception includes the generation of an exception object. the object has properties containing information describing the error that occurred. This information can be made available to the code within the catch code block by adding an exception declaration to the catch statement. The following example extends the previous code by declaring an object of the most generalised Exception class. All exceptions will still be caught but now the exception information may be used when generating the error message.

static void Main(string[] args)
{
int value = 50;
int divisor = 0;
int calculated;

try
{
calculated = value / divisor;
}
catch (Exception ex) // Catch any exception
{
Console.WriteLine("An error occurred during division.");
Console.WriteLine(ex.Message); // Report the error message
calculated = int.MaxValue;
}

Console.WriteLine("Result = {0}", calculated);
}

/* OUTPUT

An error occurred during division.
Attempted to divide by zero.
Result = 2147483647

*/The above example catches any error and populates an object of class Exception. The Message property of the object is then used to output an error description. This is one of several useful properties that are provided by the Exception class and all other derived exception types. Some of the most useful properties are:

Message. A string providing a brief description of the exception.
Source. A string containing the name of the program or object that caused the exception.
TargetSite. An object containing the name and other details of the method that caused the exception.
StackTrace. A string containing the complete stack of calls that led to the exception. This string allows the programmer to review each method call made up until the exception occurred. This is especially useful during testing and debugging.
InnerException. When one exception occurs as the direct result of another exception, the initial exception may be held in this property. This allows interrogation of both objects. The inner exception contains all of the standard properties including, potentially, a further InnerException. If there is no inner exception, this property is null.
NB: More specialised exception types may include further relevant information. For example, the ArgumentException and derived exceptions that are thrown when a parameter passed to a method is invalid include a 'ParamName' property detailing the parameter in question.

static void Main(string[] args)
{
int value = 50;
int divisor = 0;
int calculated;

try
{
calculated = value / divisor;
}
catch (Exception ex)
{
Console.WriteLine("Message: {0}\n", ex.Message);
Console.WriteLine("Source: {0}\n", ex.Source);
Console.WriteLine("TargetSite: {0}\n", ex.TargetSite.Name);
Console.WriteLine("StackTrace: {0}\n", ex.StackTrace);

calculated = int.MaxValue;
}

Console.WriteLine("Result = {0}", calculated);
}

/* OUTPUT

Message: Attempted to divide by zero.

Source: ConsoleApplication1

TargetSite: Void Main(System.String[])

StackTrace: at ConsoleApplication1.Program.Main(String[] args) in
C:\...\Program.cs:line 17

Result = 2147483647

*/Catching Specific Exceptions
So far, the examples described have included code to catch all exceptions. However, sometimes you will want to catch only a specific type of exception so that different problems can be handled in different ways. In order to catch a more specialised exception only, the correct class of exception is named in the catch statement. The following example uses this method to only catch division by zero. Any other exception would remain unhandled.

static void Main(string[] args)
{
int value = 50;
int divisor = 0;
int calculated;

try
{
calculated = value / divisor;
}
catch (DivideByZeroException ex) // Catch specific exception only
{
Console.WriteLine("Division by zero occurred.");
Console.WriteLine(ex.Message); // Report the error message
calculated = int.MaxValue;
}

Console.WriteLine("Result = {0}", calculated);
}

/* OUTPUT

Division by zero occurred.
Attempted to divide by zero.
Result = 2147483647

*/Catching specific exception types provides two benefits. Firstly, unexpected exceptions such as out of memory are not caught and misinterpreted or masked causing unexpected side effects. Secondly, any additional properties associated with the specialised exception class are made available in the catch block.

NB: If catching the exception type is enough and it is not necessary to interrogate the exception properties then there is no need to include a variable name for the exception object. The catch in the above example could be shortened to catch (DivideByZeroException) in such a situation.

Catching Multiple Exceptions
When developing complex routines, it is possible that many different types of exception could occur within a block of code or even within a single instruction. Each of these exceptions may require handling in a different manner. To permit this, multiple catch blocks may be added to a single try block. Each of these catch blocks handles a different kind of exception with the most specific exception types processed first and the most generic exceptions last.

Each catch block is checked in turn to see if the exception thrown is the same type as, or derives from, that declared in the catch statement. When a match is found, the code within the catch block is executed. Only one catch block's code is ever executed. Exceptions thrown that do not match any of the declared types remain unhandled.

The following example includes three catch blocks. The first handles any division by zero error. The second responds to a less general arithmetic exception but is not used if division by zero occurs. The final catch block does not specify the type of exception to catch and therefore is executed when any other type of exception occurs.

static void Main(string[] args)
{
int value = 50;
int divisor = 0;
int calculated;

try
{
calculated = value / divisor;
}
catch (DivideByZeroException) // Division by zero?
{
Console.WriteLine("Division by zero occurred.");
calculated = int.MaxValue;
}
catch (ArithmeticException) // Arithmetic exception?
{
Console.WriteLine("An arithmetic exception occured.");
calculated = int.MaxValue;
}
catch
{
Console.WriteLine("An unexpected exception occured.");
calculated = int.MaxValue;
}

Console.WriteLine("Result = {0}", calculated);
}

/* OUTPUT

Division by zero occurred.
Attempted to divide by zero.
Result = 2147483647

*/The Try / Catch / Finally Block
Sometimes it is necessary to ensure that some code executes whether an exception occurs or not. For example, if a file is opened before the try block, this file should be properly closed following successful processing or an exception.

C# defines an addition block that may be added to the end of the try / catch code structure. This is known as the finally block. The code within this section is guaranteed to be executed after the try / catch block, even if any of the statements in the try / catch block caused the current method to exit normally or by throwing a further exception.

static void Main(string[] args)
{
int value = 50;
int divisor = 0;
int calculated;

try
{
calculated = value / divisor;
}
catch
{
Console.WriteLine("An error occurred during division.");
calculated = int.MaxValue;
}
finally
{
Console.WriteLine("Clearing up any resources.");
}

Console.WriteLine("Result = {0}", calculated);
}

/* OUTPUT

An error occurred during division.
Clearing up any resources.
Result = 2147483647

*/It is possible to use try and finally together with no catch blocks. If an exception occurs when using such a structure it remains unhandled and is thrown to the calling routine or to the C# runtime system. However, the code in the finally block is executed whether an exception is raised or not.

Monday, June 18, 2007

C#

Monday, June 18, 2007C# An Extensive Examination of Data Structures Using C# 2.0

Summary: This article kicks off a six-part article series that focuses on important data structures and their use in application development. We'll examine both built-in data structures present in the .NET Framework, as well as essential data structures we'll build ourselves. This first part focuses on an introduction to data structures, defining what data structures are, how the efficiency of data structures are analyzed, and why this analysis is important. In this article, we'll also examine two of the most commonly used data structures present in the .NET Framework: the Array and List. (14 printed pages)

Editor's note This six-part article series originally appeared on MSDN Online starting in November 2003. In January 2005 it was updated to take advantage of the new data structures and features available with the .NET Framework version 2.0, and C# 2.0. The original articles are still available at

Contents
Introduction
Analyzing the Performance of Data Structures
Everyone's Favorite Linear, Direct Access, Homogeneous Data Structure: The Array
Creating Type-Safe, Performant, Reusable Data Structures
The List – a Homogeneous, Self-Redimensioning Array
Conclusion

Introduction
Welcome to the first in a six-part series on using data structures in .NET 2.0. This article series originally appeared on MSDN Online in October 2003, focusing on the .NET Framework version 1.x, and can be accessed at
Version 2.0 of the .NET Framework adds new data structures to the Base Class Library, along with new features, such as Generics, that make creating type-safe data structures much easier than with version 1.x. This revised article series introduces these new .NET Framework data structures and examines using these new language features.

Throughout this article series, we will examine a variety of data structures, some of which are included in the .NET Framework's Base Class Library and others that we'll build ourselves. If you're unfamiliar with the term, data structures are classes that are used to organize data and provide various operations upon their data. Probably the most common and well-known data structure is the array, which contains a contiguous collection of data items that can be accessed by an ordinal index.

Before jumping into the content for this article, let's first take a quick peek at the roadmap for this six-part article series, so that you can see what lies ahead.

In this first part of the six-part series, we'll look at why data structures are important, and their effect on the performance of an algorithm. To determine a data structure's effect on performance, we'll need to examine how the various operations performed by a data structure can be rigorously analyzed. Finally, we'll turn our attention to two similar data structures present in the .NET Framework: the Array and the List. Chances are you've used these data structures in past projects. In this article, we'll examine what operations they provide and the efficiency of these operations.

In the Part 2, we'll explore the List's "cousins," the Queue and Stack. Like the List, both the Queue and Stack store a collection of data and are data structures available in the .NET Framework Base Class Library. Unlike a List, from which you can retrieve its elements in any order, Queues and Stacks only allow data to be accessed in a predetermined order. We'll examine some applications of Queues and Stacks, and see how these classes are implemented in the .NET Framework. After examining Queues and Stacks, we'll look at hashtables, which allow for direct access like an ArrayList, but store data indexed by a string key.

While arrays and Lists are ideal for directly accessing and storing contents, when working with large amounts of data, these data structures are often sub-optimal candidates when the data needs to be searched. In Part 3, we'll examine the binary search tree data structure, which is designed to improve the time needed to search a collection of items. Despite the improvement in search time with the binary tree, there are some shortcomings. In Part 4, we'll look at SkipLists, which are a mix between binary trees and linked lists, and address some of the issues inherent in binary trees.

In Part 5, we'll turn our attention to data structures that can be used to represent graphs. A graph is a collection of nodes, with a set of edges connecting the various nodes. For example, a map can be visualized as a graph, with cities as nodes and the highways between them as edged between the nodes. Many real-world problems can be abstractly defined in terms of graphs, thereby making graphs an often-used data structure.

Finally, in Part 6 we'll look at data structures to represent sets and disjoint sets. A set is an unordered collection of items. Disjoint sets are a collection of sets that have no elements in common with one another. Both sets and disjoint sets have many uses in everyday programs, which we'll examine in detail in this final part.

Analyzing the Performance of Data Structures
When thinking about a particular application or programming problem, many developers (myself included) find themselves most interested about writing the algorithm to tackle the problem at hand or adding cool features to the application to enhance the user's experience. Rarely, if ever, will you hear someone excited about what type of data structure they are using. However, the data structures used for a particular algorithm can greatly impact its performance. A very common example is finding an element in a data structure. With an unsorted array, this process takes time proportional to the number of elements in the array. With binary search trees or SkipLists, the time required is logarithmically proportional to the number of elements. When searching sufficiently large amounts of data, the data structure chosen can make a difference in the application's performance that can be visibly measured in seconds or even minutes.

Since the data structure used by an algorithm can greatly affect the algorithm's performance, it is important that there exists a rigorous method by which to compare the efficiency of various data structures. What we, as developers utilizing a data structure, are primarily interested in is how the data structures performance changes as the amount of data stored increases. That is, for each new element stored by the data structure, how are the running times of the data structure's operations effected?

Consider the following scenario: imagine that you are tasked with writing a program that will receive as input an array of strings that contain filenames. Your program's job is to determine whether that array of strings contains any filenames with a specific file extension. One approach to do this would be to scan through the array and set some flag once an XML file was encountered. The code might look like so:

Copy Code
public bool DoesExtensionExist(string [] fileNames, string extension)
{
int i = 0;
for (i = 0; i < fileNames.Length; i++)
if (String.Compare(Path.GetExtension(fileNames[i]), extension, true) == 0)
return true;

return false; // If we reach here, we didn't find the extension
}
}

Here we see that, in the worst-case—when there is no file with a specified extension, or when there is such a file but it is the last file in the list—we have to search through each element of the array exactly once. To analyze the array's efficiency at sorting, we must ask ourselves the following: "Assume that I have an array with n elements. If I add another element, so the array has n + 1 elements, what is the new running time?" (The term "running time," despite its name, does not measure the absolute time it takes the program to run, but rather refers to the number of steps the program must perform to complete the given task at hand. When working with arrays, typically the steps considered are how many array accesses one needs to perform.) Since to search for a value in an array we need to visit, potentially, every array value, if we have n + 1 array elements, we might have to perform n + 1 checks. That is, the time it takes to search an array is linearly proportional to the number of elements in the array.

Posted by Ankit at 6:02 PM 0 comments

Subscribe to: Posts (Atom)

Tech Savy

Friday, June 29, 2007

Speed Switch v/s if Else

Friday, June 22, 2007

Exception Handling

Monday, June 18, 2007

C#

Blog Archive

Search