Tuesday, October 9, 2007

Weighted Average Continued: C# Code

The previous post described how to load the function WAVG() into SQL Server. This post describes the code that generates the DLL.

This discussion assumes that the reader is familiar with C# or object oriented languages similar to C#, such as C++ or java. That said, the code itself is probably readable by most people who are familiar with object-oriented programming practices.

This discussion is composed of four parts:
  • Very basic discussion of Microsoft Visual Studio;
  • Overview of the code and auxiliary modules;
  • Code for Adding CreateWeightedValue() Function;
  • Code for Adding WAvg() Aggregation Function; and,
  • Code for Adding WeightedValue Type.

The last three specifically describe code. These are ordered by difficulty, since it is easiest to add a user defined function, then an aggregation, and then a type (at least in terms of the volume of code produced). The code containing these is available here.

Overview of Microsoft Visual Studio
Microsoft Visual Studio is the application used to develop C# code (as well as code in other languages) using the .NET framework.

Visual Studio divides work into units called projects. These consist of one or more sets of files containing programming code, and they produce something. This something could be many things:
  • A windows application;
  • A ".exe" file executed from the command line;
  • A library to be shared among other projects;
  • A dynamic load library (DLL);
  • A device driver;
  • and so on.
The thing that we want to create is a DLL, since this can be loaded as an assembly into SQL Server.

For the purposes of this example, I have created a new project called blog in the directory c:\gordon\c-sharp\UserDefinedFunctions. The screen looks like:

Creating such a file automatically opens a code file. The source code for Wavg() is can be copied and placed into this file.After the code is in place, the file is created by going to the Build-->Build Blog menu option. Any errors appear at the bottom of the screen. Visual Studio does a good job of catching errors during the compilation process.

Once the project has been built, the DLL is conveniently located in the path \blog\blog\bin\debug\blog.dll. It can be loaded into SQL Server from this location, copied to a more convenient location, and even emailed or moved onto another computer.

Obviously, there is much more to say about Visual Studio. For that, I recommend Microsoft documentation or simply playing with the tool.

Overview of Code and Auxiliary Modules
Converting C# code into a DLL that can be loaded into SQL Server is a tricky process. In particular, .NET has to be sure that streams of bits represent what they are supposed to represent in both C# and SQL Server. In fact, this is a problem. For instance, by default, database values can be NULL. And yet, this is not part of any native C# type. To support compatibility between the systems, the code includes various using clauses and compiler directives.

However, the bulk of the C# code for this project consists primarily of three class definitions. The class WeightedValue defines the type weighted value, which holds a numeric value and a numeric weight (as C# doubles). The class WAvg defines the aggregation function. Finally, the CreateWeightedValue() function is a member of another class, UserDefinedFunctions. Note that the names of the first two classes match the names of the type and aggregation function respectively. The name of the third class is arbitrary, but carefully chosen to convey the notion that it contains user defined functions.

The beginning of the C# library consists of a series of "using" steps. These specify additional modules used by C#, and are similar to the "#include" preprocessor directive in C and C++ code. For instance, this code has the following references:

using System;
using System.IO;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

The first two specify various system classes that are commonly used. The last specifies classes used specifically for interfacing to Sql Server.

The third is the most interesting, because it defines the classes that contain data going between SQL Server and C#. These are SQL data types. For instance, FLOAT in SQL corresponds to SqlDouble in C#. Basically, the C# classes encapsulate the basic class with a NULL flag.

However, there are some subtleties when passing data back and forth. "CHAR()" is not supported, although "NCHAR()" is. Fortunately, SQL Server automatically converts between these types.

More insidious is the fact that the length of strings and numerics and decimals and money all have to be specified. So, I have not figured out how create a function that takes arbitrary numeric values. User defined functions can only take numerics of a given length. Of course, we can define our own numeric value that never overflows. More typically, though, we simply declare functions to take a FLOAT. This is sufficient for most purposes and gets passed to C# as SqlDouble. For characters, we define them to take some long character value, such as NVARCHAR(2000), which is converted to SqlString.

More complete matching tables are available in Microsoft documentation: http://msdn2.microsoft.com/en-us/library/system.data.sqltypes(vs.71).aspx.

In addition to the using statement, there are also compiler directives. These are applied to classes and to member in classes, as we will see below.

Code for Adding CreateWeightedAverage() Function
The following code provides the full definition for the CreateWeightedAverage() function.
public partial class UserDefinedFunctions
{
[Microsoft.SqlServer.Server.SqlFunction]
public static WeightedValue
CreateWeightedValue (SqlDouble val, SqlDouble wgt)
{
if (val.isNull wgt.IsNull)
{
return WeightedValue.Null;
}
return new WeightedValue(val, wgt);
} // CreateWeightedValue()
} // UserDefinedFunctions

This code defines a class called UserDefinedFunctions. The partial keyword simply means that the class definition may be split over several source code files. In this case, that is not the case.

The function itself is a static member of this class, so it can be called without an instance of the class being created. In fact, the function is going to be called from SQL Server. The function itself starts with a compiler directive that specifies that this is, in fact, a SQL Server function.

The remainder of the function declaration specifies the arguments and the return type. The code in the body is quite simply.

Recall from the last posting that this function is added into SQL Server using the following code:

CREATE FUNCTION CreateWeightedValue (@val float, @wgt float)
RETURNS WeightedValue as
EXTERNAL NAME ud.UserDefinedFunctions.CreateWeightedValue

This shows the correspondence between the SQL Server and C# language elements.

Code for Adding WAvg() Aggregation Function
The code for an aggregation is more complicated than the code for a single function, so the entire code is not included here. An aggregation class has the following structure:
[Serializable]
[SqlUserDefinedAggregate(Format.UserDefined)
public class WAvg : IBinarySerialize
{
private double sum;
private double cnt;

public void Init ()
{ . . . }

public void Accumulate (WeightedValue value)
{ . . . }

public void Merge (WAvg other)
{ . . . }

public SqlDouble Terminate ()
{ . . . }

public void Write (BinaryWriter w)
{ . . . }

public void Read (BinaryReader r)
{ . . . }
} // WAvg

The way this function works is quite simple. It maintains a running sum and sum of weights, When finished, it returns the sum divided by the sum of the weights. Notice that an aggregation function is really a class that contains data, along with some members of that class.

More interesting is the definition itself. First, the function has two compiler directives. The first is [Serializable]. This directive means that the data in the class can be passed back and forth between SQL Server and C#. In particular, it means that there are two special functions that are going to be defined, Write() and Read(). These are never called explicitly, but are part of the interface. These functions "write" the data to memory and then "read" it back . . . the data is written in one process (SQL Server or C#) and then read back in the other.

The second compiler directive specifies that the class is for an aggregation function. Because the type has "complicated" data, C# does not know how to read and write it automatically. This is true of almost all data types, so the format is typically UserDefined. In addition, other options are available as well, and explained in Microsoft documentation.

The class itself is an instance of IBinarySerialize. This is also part of the serialization stuff. Also, once this inheritance is set up, the Write() and Read() functions must be defined or else there is a compiler error.

The other four functions in the interface are actually useful for doing the aggregation. They are not called explicitly, but are used by SQL Server to do the work. The first is Init(), which initializes the values in the class to start a new aggregation. In this case, it sets the sum and the weight to zero.

The function Accumulate() adds in another value. Unfortunately, the accumulation function can only take one argument, which is why we need to create a special type that contains two values. In this case, it simply increments the sum and weight values.

The third function Merge() is probably the lead obvious of the four functions. This function merges two aggregation values. Why would this ever happen? The reason is parallelism. SQL Server might separate the aggregation into multiple threads for performance reasons. This brings together the intermediate results. One super nice thing about this structure is that we get the benefits of parallel, multi-threaded performance without really having to think about it. A very nice thing indeed.

The final function Terminate() is the most harshly named of the four. It returns the final value, in this case as a SQL floating point value (which is equivalent to a C# double).


Code for Adding WeightedValue Type
The final section is for adding in the type. This is similar to the aggregation class, although slightly different.

The following shows the various functions in the user defined type.

[Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedType(Format.UserDefined)
public class WeightedValue : INullable, IBinarySerialize
{
private bool isNull;
private double value;
private double weight;

public WeightedValue ()
{ . . . }

public bool IsNull
{ . . . }

public static WeightedValue Null
{ . . . }

public override string ToString ()
{ . . . }

public static WeightedValue Parse (SqlString s)
{ . . . }

public void Write (BinaryWriter w)
{ . . . }

public void Read (BinaryReader r)
{ . . . }

public WeightedValue (SqlDouble val, SqlDouble wgt)
{ . . . }

public double Value
{ . . . }

public double Weight
{ . . . }

} // WeightedValue

Although the aggregation function and user defined type are both defined as classes, they do have some fundamental differences. The biggest one is that the type is actually inside SQL Server. This means that the various methods are all available -- even methods not required by the API. This can be a handy way of adding functionality, without bothering to create new user defined functions.

Certain functions are required for a user defined type. First, a constructor is needed that takes no arguments. This usually returns the NULL value. Note that C# does not require destructors.

In addition, NULL and IsNull are also required. Their absence is caught by the compiler, because the user defined type should be an example of INullAble.

The functions Write() and Read() are also required. These behave the same way as for the aggregation function.

Finally, the functions ToString() and Parse() are required. These are very similar to Write() and Read() except that they write the values out to character strings. These strings are intended to be output and input to human beings, preferably in a somewhat understandable format. The function is ToString() particularly useful because it can be used in a query to see what is stored in a value of the type.

This user defined type makes all its elements private, and creates separate accessors for each one (these are called Weight and Value). This is not a requirement. However, it can be good programming practice for protecting members of a class.

The next post extends these ideas for another useful aggregation function, one that returns the value of one column when the value in another is minimized.





14 Comments:

Blogger TomMc said...

I am attempting to download the example c# code for the SQL Server weighted average DLL, but am not able to do so. Do I need to sign in somehow?

September 21, 2011 at 6:58 PM  
Blogger ananthi sri said...

I really enjoyed this article. I need more information to learn so kindly update it.
Salesforce Training in Chennai
salesforce training in bangalore
Salesforce Course in bangalore
salesforce training institute in chennai
salesforce developer training in chennai
best salesforce training in bangalore
Big Data Course in Coimbatore
DOT NET Training in Bangalore

September 24, 2019 at 8:32 AM  
Blogger Nathandigi said...

Nice post. Thanks for sharing! I want people to know just how good this information is, It’s interesting content and Great work.
360DigiTMG digital marketing course hyderabad

February 19, 2020 at 5:00 AM  
Blogger Crack Mods said...

your posting style is very awesome thanx for sharing keep it up........ Office 2016 Permanent Activator Crack Ultimate Full Download 2020!

March 9, 2020 at 1:11 PM  
Blogger cathrine juliet said...

I found some useful information in your blog, it was awesome to read, thanks for sharing this great content to my vision, keep sharing..

Big Data Hadoop Training In Chennai | Big Data Hadoop Training In anna nagar | Big Data Hadoop Training In omr | Big Data Hadoop Training In porur | Big Data Hadoop Training In tambaram | Big Data Hadoop Training In velachery


June 19, 2020 at 4:10 AM  
Blogger James David said...

Red Hat Certified Engineer is a professional who has expertise in handling the Red Hat Enterprise Linux System. The Certified Engineer takes care of various tasks such as setting kernel runtime parameters, handling various types of system logging and providing certain kinds of network operability. The professionals must have the ability to install networking services and security on servers running Red Hat Enterprise Linux.

Red Hat Certified Engineer

August 11, 2020 at 4:54 AM  
Blogger Shoppa.in said...

I am impressed. I don't think I've met anyone who knows as much about this subject as you do. You are truly well informed and very intelligent. You wrote something that people could understand and made the subject intriguing for everyone. Really, great blog you have got here.
B2B Marketplace
Wholesalers
Our Service Partners
Refrigerator Repair Service
Satta King
Mumbai Call Girls

November 2, 2020 at 8:00 AM  
Blogger Huongkv said...

Mua vé rẻ tại Aivivu, tham khảo

vé máy bay đi Mỹ

ve may bay tet gia re 2021

vé máy bay đi Canada bao nhiêu tiền

vé máy bay đi Pháp giá rẻ

vé máy bay đi Anh quốc

vé máy bay giá rẻ trong tháng

combo nghỉ dưỡng đà lạt

combo nha trang vinpearl

bảng giá visa trung quốc 2021

December 21, 2020 at 11:15 PM  
Blogger jegan said...

wonderful article contains lot of valuable information. Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article.
This article resolved my all queries.good luck an best wishes to the team members.continue posting.learn digital marketing use these following link
Digital Marketing Course in Chennai

January 11, 2021 at 9:20 AM  
Blogger sri said...

Thanks for sharing this.,
Leanpitch provides online training in Scrum Master during this lockdown period everyone can use it wisely.
Join Leanpitch 2 Days CSM Certification Workshop in different cities.
Scrum master certification online
CSM certification online

March 18, 2021 at 3:44 AM  
Blogger Mallela said...

Thanks for posting the best information and the blog is very helpful.digital marketing institute in hyderabad

April 23, 2021 at 11:29 PM  
Blogger Deekshitha said...

Informative blog
best digital marketing institute in hyderabad

April 24, 2021 at 4:21 AM  
Blogger Sunil said...

Impressive Article, thanks for sharing.
Top 10 Digital Marketing Agencies in Hyderabad


April 28, 2021 at 12:01 AM  
Blogger aivivubooking1 said...

Aivivu chuyên vé máy bay, tham khảo

vé máy bay đi Mỹ Vietnam Airline

giá vé máy bay vinh sài gòn ngày mại

giá vé vietjet sài gòn hà nội

vé máy bay sài gòn đà lạt pacific airlines

lịch bay từ mỹ về việt nam hôm nay

taxi đi sân bay giá rẻ

combo anantara quy nhơn

May 7, 2021 at 6:05 AM  

Post a Comment

Subscribe to Post Comments [Atom]

<< Home