I got this in some spam today. I had to have a look, and I wasn't disappointed. I'll have to phone my dad and find out what his maiden name is. Or maybe his maidles name.
July 2009 Archives
On a recent interview, I was asked: "Why doesn't C# have keywords for const-correctness?"
There are a few answers. First, C# does have constants and const class members. You can declare class members as const or readonly to get compile-time and run-time constants, respectively.
And it's even possible to write classes that are immutable, so that you cannot modify an existing one but have to make a new one, possibly based on an existing one. It's like strings:
s.replace("The", "An");
does not modify the original string. Instead, you have to take the return value:
s = s.replace("The", "An");
It won't come as a great surprise to hear that I favor making classes immutable wherever they can be, as a means of clarifying their usage and to make them safer.
But the real question is: why doesn't C# have const methods (methods where the code promises not to modify class members) or methods that respect const parameters? The nearest I have found to an answer on that is Anders Hejlsberg in an online interview:
Immutables
Bill Venners: In addition to being a C# and CLR construct, the value type is a general object-oriented programming concept. Another such concept is immutable types.
When I went to the first JavaOne back in 1996, everyone seemed to have one complaint about what they missed from C++ that Java didn't have. Different people had different complaints, but they all seemed to have at least one complaint. My complaint was const. I really liked what I could do with const in C++, though somehow I have gotten along in Java just fine without it.
Did you consider including support for the concept of immutable directly in C# and the CLR?
Anders Hejlsberg: There are two questions in here. With respect to immutability, it's tricky because what you're saying when you say something is immutable, is that from an external perspective, I cannot observe any mutation. That doesn't necessarily mean that it doesn't have a cache inside that makes it go more efficiently. It's just on the outside it looks immutable. That's hard for a compiler to figure out. We could certainly have a rule that says you can only modify the fields of this object in the constructor. And we could make certain usability guarantees. But it actually rules out some scenarios that are in use. So we haven't codified immutability as a hard guarantee, because it's a hard guarantee to make. The concept of an immutable object is very useful, but it's just up to the author to say that it's immutable.
Bill Venners: Immutability is part of the semantics of the class.
Anders Hejlsberg: Yes. With respect to const, it's interesting, because we hear that complaint all the time too: "Why don't you have const?" Implicit in the question is, "Why don't you have const that is enforced by the runtime?" That's really what people are asking, although they don't come out and say it that way.
The reason that const works in C++ is because you can cast it away. If you couldn't cast it away, then your world would suck. If you declare a method that takes a const Bla, you could pass it a non-const Bla. But if it's the other way around you can't. If you declare a method that takes a non-const Bla, you can't pass it a const Bla. So now you're stuck. So you gradually need a const version of everything that isn't const, and you end up with a shadow world. In C++ you get away with it, because as with anything in C++ it is purely optional whether you want this check or not. You can just whack the constness away if you don't like it.
The interviewees approach "Immutability [as] part of the semantics of the class." They consider this the relevant feature, not semantics of individual methods of a class.
Eric Gunnerson also considers const-correctness "a good idea in the small, but doesn't work too well in the large." Furthermore, he points out that it is hard to enforce across multiple languages.
There are also many entries on StackOverFlow.com on this question (searchable here), and the best answer provided is here:
I suspect there are some practical reasons, and some theoretical reasons:
- Should the constness apply to the object or the reference? If it's in the reference, should this be compile-time only, or as a bit within the reference itself? Can something else which has a non-const reference to the same object fiddle with it under the hood?
- Would you want to be able to cast it away as you can in C++? That doesn't sound very much like something you'd want on a managed platform... but what about all those times where it makes sense in C++?
- Syntax gets tricky (IMO) when you have more than one type involved in a declaration - think arrays, generics etc. It can become hard to work out exactly which bit is const.
- If you can't cast it away, everyone has to get it right. In other words, both the .NET framework types and any other 3rd party libraries you use all have to do the right thing, or you're left with nasty situations where your code can't do the right thing because of a subtle problem with constness.
There's a big one in terms of why it can't be supported now though:
- Backwards compatibility: there's no way all libraries would be correctly migrated to it, making it pretty much useless :(
I agree it would be useful to have some sort of constness indicator, but I can't see it happening, I'm afraid.
So, approaches you can use to get immutability at the class level:
- immutable object pattern
Class is immutable by design, made so by the developer - collections returned as AsReadOnly
List and Array generic classes both support this method. - class members that are marked as readonly or const
The first two would support immutable method parameters -- you could not change them in the scope of the function and get side-effects. But C++-style const-correctness on methods is not available by language syntax.
Last week, I went to a django presentation at HUGE Inc. in Brooklyn. For me, the highlight was Kevin Fricovsky's five minute (well, ten minute) lightning talk in which he demonstrated the power of using open source components. He took a fairly small blog project of his and pulled it down from github.com. Then he used pip to get the dependencies listed in the requirements.txt file. Inside of ten minutes, he had a working blog that used five to ten open source components.
In contrast to this, I got a big reality check on using open source code. I wanted to try out some of Steve Souders's ideas on improving website performance on a django open source project that is already running. I forked some code from github and was going to apply django-compressor to the CSS and JS to see the performance improvement. I never got to this point.
The project was missing a couple of important pieces:
- a list of the requirements and how to obtain them
- the settings.py file to list the installed apps
Without these, I was pretty much dead. I soldiered on, though, and found eight required components to add, but:
- one of the components did not work with python 2.5 at head rev
- I got a subtle templetags error message that I eventually realized meant that I had a wrong similarly-named open source component
I've written to the project author to find out where to get the correct component. It's a bracing counterpoint to the presentation. Taking your own code and adding components -- likely to work. Taking a project in an unknown state and trying to back out what the components are -- much more likely to fail.
By the way, if you get a message like this:
'XXX' is not a valid tag library: Could not load template library from django.templatetags.XXX, No module named models
then I have a technique for you. Assuming that templatetags module XXX has a ton of import statements in its files, you have to figure out which file it is importing that is causing the failure. Take code like this:
from django import template from django.conf import settings from django.core import template_loader from django.db import models from django.contrib.contenttypes.models import ContentType from django.core.exceptions import ObjectDoesNotExist from django.utils.safestring import mark_safe from syncr.twitter.models import Tweet from syncr.delicious.models import Bookmark from syncr.flickr.models import Photo from tagging.templatetags import tagging_tags from tumblr.models import TumblrPost from lastfm.models import LastfmPost import re import datetime
and turn it into this:
try:
from django import template
from django.conf import settings
from django.core import template_loader
from django.db import models
from django.contrib.contenttypes.models import ContentType
from django.core.exceptions import ObjectDoesNotExist
from django.utils.safestring import mark_safe
except ImportError, exc:
print "django error in %s: %s" % (__file__, exc)
try:
from syncr.twitter.models import Tweet
from syncr.delicious.models import Bookmark
from syncr.flickr.models import Photo
from tagging.templatetags import tagging_tags
from tumblr.models import TumblrPost
from lastfm.models import LastfmPost
except ImportError, exc:
print "package error in %s: %s" % (__file__, exc)
try:
import re
import datetime
except ImportError, exc:
print "python error in %s: %s" % (__file__, exc)
Do this for every file in the module. Run the dev server and watch for the import error location reported. Once you know which block is failing in which file, add print statements to see which particular import you get to before it fails. Then you'll know which import is not working. In my case, this showed me that I had used the wrong open source tumblr module for this project.
James Bennett has some more detailed observations on this point, too.
I've been doing a lot of reading recently. Here are Django books I've finished:
- Python Web Development with Django
- SAMS Teach Yourself Django in 24 Hours
- Django 1.0 Template Development
- Pro Django
Here are general web development books I've read:
And a book on Git: Version Control with Git.
In addition, here are the books I am currently reading:
I particularly recommend Pro Django for django developers. It is advanced and offers a lot of insight into django and the python techniques it is built on.
The other day, I had an interview in which I was asked to "find an integer in a sorted array or integers if it was present." This is obviously a binary search, so I coded up this (but in C/C++):
def binarysearch(iter, val): lo, hi = 0, len(iter)-1 while lo <= hi: mid = lo + (hi - lo) / 2 if iter[mid] >= val : hi = mid - 1 else: lo = mid + 1 return (lo if iter[lo] == val else -1)
There are a couple of things that make this noteworthy:
- Calculation of mid
If you calculate the midpoint of the array in the more straightforward way (mid = (lo + hi) / 2), then you risk integer overflow when working with fixed sized integers.
- Use of a single comparison in the loop
It seems that you should actually test for an exact match on finding the element you are searching for. This is not so. More below.
- Returning iter[lo] instead of iter[mid]
It is not necessary to keep track of the last midpoint. When you fall out of the loop, either iter[lo] is the desired value or the value was not found in the array. If it is found, then in an array that has multiple instances of the value, lo is the location of the first occurrence.
It was on point two that the interview went a bit astray. The interviewer wanted to see code that looked more like this:
def binarysearch(iter, val): lo, hi = 0, len(iter)-1 while lo <= hi: mid = lo + (hi - lo) / 2 if iter[mid] == val : return mid elif iter[mid] > val: hi = mid - 1 else: lo = mid + 1 return -1
And that's all very well -- it's a correct algorithm -- but it's not as efficient. In my algorithm, there is only one comparison per loop. In the other, there is either one if there is a match or two on a failure. For comparison purposes and to make this concrete, let's imagine we have an array or 63 elements and we look for every element of the array.
Cost of my code
63 nodes searched * 6 comparisons per search 378 comparisons
Cost of other code
1 nodes searched * 1 comparisons = 1 2 nodes searched * 3 comparisons = 6 4 nodes searched * 5 comparisons = 20 8 nodes searched * 7 comparisons = 56 16 nodes searched * 9 comparisons = 144 32 nodes searched * 11 comparisons = 352 579 comparisons
So contrary to the expectations of the interviewer, it is actually more efficient to search to the end of the array every time using fewer comparisons than to search less of the array with more comparisons.
What are you supposed to do when you see something so flagrantly wrong as this (Yahoo: Ten Pricey Cities that Pay Off)?
When you pay a lot of money for something, you hope to get a lot of value in return. Despite the housing downturn and the number of cheap houses it has left in its wake, there is still plenty of expensive land left. According to the Global Property Guide, an online real estate investor's guide, New York might have slipped from the second-most-expensive city in the world in 2008 to the sixth most expensive in 2009, but an average 120-square-meter (373-square-foot) apartment in the central business district will still cost you a hefty $14,898 per square meter.
A square meter is not about 3 square feet. It's more than 10 square feet. You take the number of feet per meter (3.28 feet / meter) and square that to get the square feet per square meter (10.76 square feet / square meter). Is that so hard? The cited apartment is more like 1300 square feet, assuming the journalist got this original fact right.
