Building a LINQ provider - Step 0

Monday, Mar 23, 2009 4 minute read Tags: LINQ

Since I started writing LINQ to Umbraco I have been doing a lot of investigation into the way LINQ works and how to go about building your own custom LINQ provider. One thing I've noticed is that there is a distinct lack of information on the web about how to do this. Matt Warren has a really good series on building a LINQ provider, but it's still related to SQL translations. Bart De Smet is also a really great blogger who has done quite a bit on LINQ; he's kind of the LINQ guy. He's written the LINQ to SharePoint tool, LINQ to Active Directory, LINQ to MSI, LINQ through PowerShell, and the list goes on. I really suggest you have a read through his posts. I'll be referencing his posts a lot throughout this series.

First off, a bit of a disclaimer. This series is a work in progress, and I'll quite probably go back on what I say during the series, as I'm still learning what I'm trying to achieve. Everything I post here is stuff that I have learnt by reading blog posts (and I'll link where applicable), reflecting LINQ to SQL and reading the source code of open source projects such as LINQ to SharePoint.
This series is as much for myself as it is for anyone else. There's a lot of stuff you need to know when it comes to building a query provider, and trying to keep it all in my brain is really starting to hurt :P. I've already looked at sections of my code and had to think long and hard about what they do. I've also already refactored a lot of the code several times!

So let's get started!

Getting started - What provider model?
I'm making the assumption that you've chosen what you're going to provide LINQ support for; now the question is how you go about providing that support. There are two models for providing LINQ support: via IQueryable<T> or via method chaining.
Method chaining? What's that, you say? As I pointed out in my post A LINQ Observation, LINQ query syntax is really just syntactic sugar; all statements ultimately compile down to chained method calls. This means that you can quite easily provide a LINQ provider without implementing IQueryable<T>.
In fact, have a look at IQueryable<T> in Reflector; there's actually not much that it implements. All the Where, Select, Join, etc. operators reside within the class System.Linq.Queryable (in System.Core) as extension methods! That means, if you provide a method with the construct:
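To see the sugar for yourself, here's a minimal sketch using plain LINQ to Objects. The class and method names are just made up for illustration; the two methods should return the same thing, because the compiler rewrites the first form into the second.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuerySyntaxDemo
{
    // Query-expression form.
    public static List<string> WithQuerySyntax(IEnumerable<string> names)
    {
        var query = from n in names
                    where n.Length > 3
                    select n.ToUpperInvariant();
        return query.ToList();
    }

    // The chained method calls that the query above is translated into.
    public static List<string> WithMethodChain(IEnumerable<string> names)
    {
        return names.Where(n => n.Length > 3)
                    .Select(n => n.ToUpperInvariant())
                    .ToList();
    }
}
```

Because the translation is purely syntactic, the compiler doesn't care where Where and Select come from, only that something with the right shape is in scope.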

// Declared on your own source class, which is generic in T
public IEnumerable<TResult> Select<TResult>(Func<T, TResult> selector) { ... }

You can quite easily write your own LINQ provider. I could go on about this, but it's best covered by Bart De Smet in his posts Q: Is IQueryable the Right Choice for Me? and The Most Funny Interface of the Year … IQueryable<T>, and an example of how to achieve this can be found in his LINQ to MSI series (starting here).
The primary advantage of using method chaining over IQueryable is that it gives you compile-time checking of LINQ expressions. This is a very powerful concept if your data provider isn't capable of supporting everything that LINQ has within it. LINQ to MSI is a great example. As Bart points out, the MSI query language is very SQL-like, but it doesn't support all the operations. By implementing LINQ via method chaining rather than via the full IQueryable interface, a user will know at compile time which operations are and aren't available.
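As a rough sketch of the method chaining model, here's a source type that implements neither IQueryable<T> nor IEnumerable<T>, yet still works with query syntax. NodeSource<T> and MethodChainingDemo are hypothetical names I've made up for this example (they are not from LINQ to MSI or LINQ to Umbraco), and the in-memory list stands in for whatever your real data store is.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical source type: no IQueryable<T>, no IEnumerable<T>, no System.Linq.
public sealed class NodeSource<T>
{
    private readonly List<T> _items;

    public NodeSource(IEnumerable<T> items)
    {
        _items = new List<T>(items);
    }

    // The compiler binds a "where" clause to this method.
    public NodeSource<T> Where(Func<T, bool> predicate)
    {
        var kept = new List<T>();
        foreach (var item in _items)
        {
            if (predicate(item)) kept.Add(item);
        }
        return new NodeSource<T>(kept);
    }

    // The compiler binds a "select" clause to this method.
    public IEnumerable<TResult> Select<TResult>(Func<T, TResult> selector)
    {
        foreach (var item in _items)
        {
            yield return selector(item);
        }
    }
}

public static class MethodChainingDemo
{
    public static List<int> DoubledEvens()
    {
        var source = new NodeSource<int>(new[] { 1, 2, 3, 4 });

        // Translated to source.Where(...).Select(...), which resolves
        // to the instance methods declared above.
        var query = from n in source
                    where n % 2 == 0
                    select n * 2;

        return new List<int>(query);
    }
}
```

Calling MethodChainingDemo.DoubledEvens() gives back 4 and 8. A real provider would translate the delegates (or expression trees) into its own query language rather than looping over a list, but the binding works the same way.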
Don't take my word for it, have a read through the series. It's very interesting and it really opens your eyes to what LINQ is and how it is really implemented.
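Here's a small sketch of what that compile-time checking looks like in practice. FilterOnlySource<T> is a hypothetical type of my own that only exposes Where, so a query using any other operator simply won't build.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical source that supports filtering and nothing else.
public sealed class FilterOnlySource<T>
{
    private readonly List<T> _items;

    public FilterOnlySource(IEnumerable<T> items)
    {
        _items = new List<T>(items);
    }

    public FilterOnlySource<T> Where(Func<T, bool> predicate)
    {
        return new FilterOnlySource<T>(_items.FindAll(x => predicate(x)));
    }

    public List<T> ToList()
    {
        return new List<T>(_items);
    }
}

public static class CompileTimeCheckDemo
{
    public static List<string> ShortNames()
    {
        var source = new FilterOnlySource<string>(new[] { "home", "about-us", "news" });

        // Compiles: the trailing "select name" is dropped by the compiler,
        // so this becomes a single call to source.Where(...).
        var query = from name in source
                    where name.Length <= 4
                    select name;

        // Would NOT compile, because there is no OrderBy method to bind to:
        // var sorted = from name in source
        //              orderby name
        //              select name;

        return query.ToList();
    }
}
```

Uncommenting the orderby query turns a would-be runtime surprise into a build error, which is exactly the trade-off Bart describes.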

For LINQ to Umbraco I chose to use the IQueryable model. The reason for this is that LINQ to Umbraco (in its current implementation, hint hint) is going to be querying against the Umbraco XML file. This means it is actually built on top of LINQ to XML. Since LINQ to XML already supports all the standard LINQ operations I don't see any point in restricting what the developer has within their toolkit. It is true that I'm not likely to ship all the operations (there's a hell of a lot to cover off!), but the framework will be there for all the standard operations to be supported. This does mean that until runtime there won't be any checking of the syntax, so you're likely to have a NotImplementedException thrown, but hopefully the documentation will outline what is and isn't available *cough*.
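To show the flip side, here's a deliberately tiny sketch of the IQueryable model. It is not the LINQ to Umbraco code; SketchQuery<T> is a hypothetical type that acts as its own IQueryProvider, runs over an in-memory collection, and only understands Where. Everything else compiles happily and then blows up with a NotImplementedException when it runs.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical query type that doubles as its own provider to keep the sketch short.
// Only Where(predicate) is understood; every other operator throws at run time.
public sealed class SketchQuery<T> : IQueryable<T>, IQueryProvider
{
    private readonly IEnumerable<T> _data;
    private readonly Expression _expression;

    public SketchQuery(IEnumerable<T> data)
    {
        _data = data;
        _expression = Expression.Constant(this);
    }

    private SketchQuery(IEnumerable<T> data, Expression expression)
    {
        _data = data;
        _expression = expression;
    }

    Type IQueryable.ElementType { get { return typeof(T); } }
    Expression IQueryable.Expression { get { return _expression; } }
    IQueryProvider IQueryable.Provider { get { return this; } }

    // The System.Linq.Queryable extension methods call this to grow the query.
    IQueryable<TElement> IQueryProvider.CreateQuery<TElement>(Expression expression)
    {
        var call = expression as MethodCallExpression;
        if (call == null || call.Method.DeclaringType != typeof(Queryable) || call.Method.Name != "Where")
        {
            throw new NotImplementedException("This sketch only supports Where.");
        }
        // For Where, TElement is always T, so this cast succeeds.
        return (IQueryable<TElement>)(object)new SketchQuery<T>(_data, expression);
    }

    IQueryable IQueryProvider.CreateQuery(Expression expression) { throw new NotImplementedException(); }
    TResult IQueryProvider.Execute<TResult>(Expression expression) { throw new NotImplementedException(); }
    object IQueryProvider.Execute(Expression expression) { throw new NotImplementedException(); }

    public IEnumerator<T> GetEnumerator() { return Evaluate(_expression).GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    // The root constant yields the data; each Where call filters its source.
    private IEnumerable<T> Evaluate(Expression expression)
    {
        if (expression is ConstantExpression) return _data;

        var call = (MethodCallExpression)expression; // CreateQuery only lets Where through
        var source = Evaluate(call.Arguments[0]);
        // Queryable.Where passes the lambda wrapped in a Quote node.
        var lambda = (LambdaExpression)((UnaryExpression)call.Arguments[1]).Operand;
        var predicate = lambda.Compile() as Func<T, bool>;
        if (predicate == null)
        {
            throw new NotImplementedException("Only Where(Func<T, bool>) is supported.");
        }
        return source.Where(predicate);
    }
}

public static class QueryableDemo
{
    private static readonly string[] Pages = { "home", "about-us", "news" };

    public static List<string> ShortNames()
    {
        var pages = new SketchQuery<string>(Pages);
        var query = from p in pages where p.Length <= 4 select p;
        return query.ToList();
    }

    // OrderBy compiles fine, but the provider rejects it when the code runs.
    public static bool OrderByThrowsAtRunTime()
    {
        var pages = new SketchQuery<string>(Pages);
        try { pages.OrderBy(p => p).ToList(); return false; }
        catch (NotImplementedException) { return true; }
    }
}
```

A real provider would walk the expression tree and translate it into something the data source understands instead of compiling the lambda, but the shape of the problem is the same: the compiler accepts every standard operator, and it's up to the provider to say no at runtime.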

Well, hopefully this has given you a starting point and some background reading for building a LINQ provider, and given you something to think about before jumping straight into coding. The most important part of a LINQ provider is thinking it through from the outset!