We've Moved

The blog has been retired - it's up for legacy reasons, but these days I'm blogging at blog.theodox.com. All of the content from this site has been replicated there, and that's where all of the new content will be posted.

Tuesday, October 27, 2015

Grrrr..... Maya!!!

Some kinds of pain are just occasional: you stub your toe or bump your head, ouch, and then it's over. Other kinds of pain aren't as sharp or as sudden... but they're chronic. That persistent twinge in your lower back may not hurt as much as a twisted ankle - but it's going to be there forever (at least unless you get into power yoga, or so my wife claims).
Maya is old enough to have a few of those chronic pains, and I just ran into one which -- once we debugged it and figured it out -- I realized has been a constant irritant for at least the last decade and, if my creaky old memory does not lie, was a distinct pain in the butt as long ago as 2002. In another context I might even have been able to diagnose it, but instead we spent a ton of time and energy working around an unexpected behavior which is, in fact, purely standard Maya. It's stupid Maya, but it's standard too. Maya, alas, is double plus ungood about mixing per-face and per-object material assignments. So, I figured I'd document this here for future sufferers: it might not ease the pain much, but at least you'll know you're not crazy.
The basic problem is that assigning materials to faces and assigning them to objects use slightly different mechanisms. If you check your hypergraph you'll see that per-face assignments connect to their shadingGroup nodes through the compInstObjectGroups[] attribute, while object-level assignments go through the similar-but-not-identical instObjectGroups attribute (if you're looking for these in the docs, the component one is inherited from the geometryShape class and the object version comes from dagNode).
As long as you're working with one object at a time this isn't a problem. However, if you're duplicating or copy-pasting nodes, there's a gotcha: if you ever try to merge meshes which have a mix of per-face and per-object assignments, Maya will magically "remember" old per-face assignments in the combined mesh. If you're a masochist, here's the repro:

  1. create an object, give it a couple of different materials on different faces
  2. duplicate it a couple of times
  3. assign a per-object material to the duplicates, overriding the original per-face assignments
  4. combine all the meshes.
  5. Et voilà! The cloned meshes revert to their original assignments

What appears to happen is that those compInstObjectGroups connections are driven by hidden groupID nodes which don't get deleted when the per-face assignments are overridden by the per-object ones in step (3). They stick around even though they aren't being used, and when the mesh is combined they step right back into their original roles.

If you're doing this interactively it's an annoyance. If you've got tools that do things like auto-combine meshes to cut down on transform load in your game.... well, it's a source of some surprising bugs and equally surprising bursts of profanity. But at least it's predictable.

The workaround: Before doing any mesh combination, add something harmless to the history of the meshes you're about to combine and then delete that history. (I use a triangulate step, since this happens only at export time.) That kills the rogue groupID nodes and keeps the combined mesh looking the way you intended.
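For the script-minded, here's a minimal sketch of that workaround using maya.cmds. The function name and the cmds parameter are my own inventions for illustration (the parameter just lets you pass in a stand-in outside of Maya); polyTriangulate, delete and polyUnite are the stock commands. It assumes, as described above, that deleting history is what clears out the leftover groupID nodes, so treat it as a starting point rather than a guaranteed fix.

```python
def combine_without_stale_assignments(meshes, cmds=None):
    """Triangulate each mesh, delete its history, then combine them all.

    Hypothetical helper: 'cmds' defaults to maya.cmds, which only exists
    inside a running Maya session.
    """
    if cmds is None:
        from maya import cmds

    for mesh in meshes:
        # a harmless history operation (fine at export time)
        cmds.polyTriangulate(mesh)
        # deleting history should take the unused groupID nodes with it
        cmds.delete(mesh, constructionHistory=True)

    # polyUnite is the command behind Mesh > Combine
    return cmds.polyUnite(*meshes)
```

Inside Maya you'd call it with something like combine_without_stale_assignments(cmds.ls(selection=True)).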

Sheesh. What a way to make a living.

Tuesday, October 6, 2015

Charcoal - it's smokin'!

When you’re shopping around for something new – whether it’s a cool new piece of software or just a kitchen gadget – it’s not uncommon to tell yourself, “man, I wish I had thought of that.” But what’s really impressive is when you see a polished product and you say to yourself, “Dammit, I absolutely thought of that!” or “I’ve been wanting this exact thing for years!” It’s a rare thrill when you stumble across something that seems as if it were a gift from some future self, come back to give you exactly what you wanted in a way that only you, yourself could.
One of my coworkers found one of those little somethings the other day - a product that will make pretty much any TA feel like Christmas came a little early.
The Charcoal Editor from Chris Zurbrigg is a slick, polished replacement for Maya’s script editor. It’s a plugin (available for Maya on Mac, Windows and Linux) that offers many of the features of a slick Python IDE right inside of Maya. Some of the key highlights include:
  • Syntax highlighting
  • Autocomplete (including your own code and also the entire Maya cmds api)
  • Smart indent and dedent
  • Bracket matching
but the feature that will sell most Maya veterans instantly is the fact you can execute lines or scripts without the familiar Select > Enter that has deleted countless lines of your test code down the ages.

Charcoal Editor Overview from Chris Zurbrigg on Vimeo.

That one feature alone would probably be worth the price for most people who do a lot of scripting. But the whole package is thoughtfully put together in a way that clearly says the author wrote a tool for himself – and that he shares a lot of the frustrations that have driven you and me bonkers for the last 18 years of Maya history. A great example is the addition of quick help for Maya commands: if you (like me) can never remember the difference between the flags for listConnections and those for listRelatives, Charcoal allows you to pop up a quick in-window help view or to open the relevant documentation in a browser: a welcome alternative to the maddening ritual of entering “cmds.whatever” into Chrome and being directed to the Maya 2011 Japanese docs by the mysterious imps of the internet.
In general, Charcoal shows a lot of attention to the nuances of scripting work. For example, it lets you quickly flip back and forth between the usual split view and a full panel of either script or history, so you don’t have to give up coding space to see your printouts or vice-versa. Likewise, you can set font sizes and color schemes for the scripting panel and the history panel separately – a big help if you want to save space on your printouts or if (like me) your eyes are going and you need to bump up the font size for coding. The history panel even supports highlighting – separating errors and warnings clearly from regular printouts, for example. All in all it’s a collection of small touches that offers a much-appreciated sense that the program has your back and that the author has wrestled with many of the same irritations you’ve had.
The product also ventures into territory that’s usually associated with full-fledged IDEs. In particular it offers an “outline view” which displays the classes and functions in the current scope - a big help for navigating around in a longish file, as well as a handy way to remember what you’re working with. There’s also a “project view” which displays all of the scripts in a project folder tree – more or less the same as the project views in Sublime Text or Atom (two other scripter-friendly editors you should check out if you’ve never seen them).
These IDE features will be very helpful for folks who’ve been soldiering on with nothing but the Maya script editor and Notepad. If you’re already using an IDE like PyCharm, Wing, or PTVS they may not be quite enough to wean you out of your fancy environment – particularly if you’ve gotten used to using a real debugger instead of littering your code with print statements. Charcoal’s project features are functional but – given the nature of the task and the audience – are not as fancy as the equivalent features in big budget development environments. If you really prize the ability to infinitely noodle on color themes, or a built-in style guide, you may find yourself wandering back to one of the bigger packages. That’s not a knock on Charcoal, though – it’s just a reminder that it’s a specialist tool for Maya users and not a general-purpose project management powerhouse.
For myself, I plan on sticking with PyCharm for long coding sessions (btw, PyCharm fans, you’ll be incredibly pleased to hear that Charcoal allows cut and paste directly from PyCharm, unlike Maya’s wonky script editor. Whoop-de-doo!) However Charcoal more than justifies itself as a replacement for the vanilla script editor with a lot of juicy productivity features. I’ve already gotten a lot of productivity bounce by using MayaCharm to bypass the Maya script editor whenever possible – but I still spend quite a lot of time in the clunky old Maya pane nonetheless. I’ve got high hopes that Charcoal will save precious brain power for real problems and allow me to focus more on doing my job and less on frantically hitting Undo after my last attempt to execute a line accidentally erased an hour’s work.
Charcoal offers a free, non-saving demo; an individual license is $49 US (site licenses are available but you’ll have to negotiate them with the author).

Wednesday, September 16, 2015

I don't endorse this...

.. But I could not resist. The original title was Automation comes from the Latin word meaning 'self', and 'mating', which means 'screwing'

Saturday, August 15, 2015

It's that time again! GDC Call for submissions is open!

It's time again for GDC speakers to submit their proposals. TAs have a long history of providing excellent talks at the main conference and at the Tech Art Bootcamp - so take some time to put together a proposal for this year's conference.

For more on why this is a Good Idea(tm), check out last year's post on the same topic. One important note! This year the call for subs closes on August 27, NOT August 28 as it is in the linked post!

Saturday, August 8, 2015

code wars

By a certain stroke of cosmic irony, it was just after I finished shoe-horning lame Star Wars jokes into my last post that I started to get obsessed with CodeWars, one of the plethora of competitive coding sites that have sprung up in the last few years.

Mostly I find that sort of thing pretty annoying – it’s a genre that all too easily degenerates into macho brogrammer chest-thumping. 90 percent of the code I see on those sites is so tightly knotted – in hopes of scoring fewest-number-of-lines bragging rights – that it’s useless for learning. I’m impressed as hell by this:
f=lambda s:next((t,k)for t,k in map(lambda i:(s[:i],len(s)/i),range(1,len(s)+1))if t*k==s)
but I never want to have to interact with it (Bonus points if you can tell what it does! Post your answer in the comments....)
The nice thing about Codewars is that the experience tends to push you into thinking about how to solve the problems, rather than how to maximize your score. I particularly like two things: first, the site includes a built-in test framework so you can do The Right Thing(tm) and write the tests before you write the code – not only is it a helpful touch for would-be problem solvers, it’s very effective ‘propaganda of the deed’ for encouraging people to take tests seriously. Second, the site doesn’t just show you the ‘best’ solutions, it shows you all of them – and it allows you to vote both for solutions you think are clever and ones you think embody “best practices.” That snippet I posted above is extremely clever but not a best practice – I wouldn’t let something like that into my codebase if I could avoid it! I’m not smart enough to unriddle such things, though I’m glad they exist.
The other nice thing is that most of the problems are bite-sized, the sort of thing you can chew on while waiting for a longish Perforce sync to complete. It’s a great way to practice your coding chops outside all the gnarly things that come with working in a particular problem set for work. I’ve had a work task which involved me in a lot of 5-minute wait times this week and I found CodeWars to be a nice chance to keep my brains warm while waiting for Perforce.
So, if you’re looking to sharpen up your coding skills you should definitely check out CodeWars. My username is Theodox and in the goofy ninja-academy language of Codewars we can form an ‘alliance’ by following one another. We can make technical art a power in the land!
On the practical side: Codewars supports Python, JavaScript, and several other languages – they just added C#. It’s a great way to get familiar with new syntaxes and to see how folks who know what they are doing tackle problems natively, which makes it a great tool for picking up a new language on your own. Give it a shot!

Sunday, August 2, 2015

Return of the namedtuples

I’m sure you’ve read or written code that looks like this:
results = some_function()
for item in results:
    cmds.setAttr(item[0] + "." + item[1], item[2])
Here some_function must be using one of Python’s handiest features, the ability to return lists or tuples of different types in a single function. Python’s ability to return ‘whatever’ - a list, a tuple, or a single object – makes it easy to assemble a stream of data in one place and consume it in others without worrying about type declarations or creating a custom class to hold the results. Trying to create a similarly flexible system in, say, C# involves a lot of type-mongering. So it’s nice.
At least, it’s nice at first. Unfortunately it’s got some serious drawbacks that will become apparent after a while – outside the context of a single script or function, relying entirely on indices to keep things straight is dangerous. As so often in Pythonia, freedom and flexibility can come at the cost of chaos downstream if you’re not careful.

I have a bad feeling about this…

Everything will be hunky-dory as long as some_function continues to pack its output the same way. In this example some_function is probably doing something like:

# imagine some actual code here ...
results = []
for node in object_list:
    for attrib in attrib_list:
        settable = is_attrib_settable(node, attrib)
        if settable:
            new_value = dict_of_defaults[attrib]
            results.append([node, attrib, new_value])
return results

Inevitably, though, something will come along that causes the order of the results to change. In a Maya example like this, the likely cause would be some other user of this function finding out that the code needs to set defaults on an unusual value type: setAttr needs to be told what type of data to expect if things are unusual.
That being the case, your teammate extends some_function to output the data type needed. If you’re lucky, the results look like [node, attribute, value, type] and your existing code works fine. But if it changes to [node, attribute, type, value] your existing code will break in weird ways. Moreover, if you haven’t written a lot of comments, the person fixing the bugs will have to sit down and deduce what item[0], item[1] and item[2] were supposed to be.
This example is a perfect illustration of why unit tests are such a nice thing to have in Python-land: a unit test would probably catch the change in signature right away, alerting your helpful co-worker to the can of worms they have opened up by changing the output of the function. But the real moral of the story is how dangerous it is to rely on implicit knowledge of structures – like the ordering of a list – instead of on explicit instructions. When somebody fails to understand the implications of that ordering, bad things will happen. When the knowledge you need to debug the problem is hidden, things will be even worse.
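To make the unit test point concrete, here's a rough sketch of the kind of test that would flag the change. The some_function below is a hypothetical stand-in with canned data (the real one would be querying Maya), and the test class name is made up; the only thing being pinned down is the [node, attribute, value] layout.

```python
import unittest


def some_function():
    # hypothetical stand-in for the real function: canned results
    # in the [node, attribute, value] layout the callers expect
    return [["pCube1", "tx", 0.0], ["pCube1", "ty", 0.0]]


class TestSomeFunctionLayout(unittest.TestCase):

    def test_each_result_is_node_attribute_value(self):
        for item in some_function():
            # a fourth element (say, a data type) fails right here
            self.assertEqual(len(item), 3)
            node, attrib, value = item
            self.assertIsInstance(node, str)
            self.assertIsInstance(attrib, str)
            self.assertIsInstance(value, float)
```

If a teammate slips a type string into the list, the length check fails; if they reorder the existing fields, the isinstance checks complain. Either way the surprise shows up in the test run instead of in somebody's scene.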
Sometimes things get complicated

Return classes strike back

In most languages the way around this is to create a class that holds the results of something like some_function. A result class provides clear, named access to what’s going on:
class SomeFuncResult(object):
    def __init__(self, node, attr, val):
        self.node = node
        self.attribute = attr
        self.value = val

# and inside of some_function()
results.append(SomeFuncResult(node, attrib, new_value))
This means the receiving code is much neater and easier to understand:
results = some_function()
for item in results:
    cmds.setAttr(item.node + "." + item.attribute, item.value)
This is a better record of what you were trying to achieve in the first place, and it’s also much more survivable: as long as HelpfulCoworker01 does not actually rename the fields in the result object it can be tweaked and updated without causing problems.
For many cases this is the right way to go. However it comes with some drawbacks of its own.
First off – let’s be honest – there’s a lot of typing for something so dull. I always try to leave that out of the equation when I can - the time spent typing the code is such a tiny fraction of the time you’ll spend reading it that trying to save a few keystrokes is usually a Bad Idea (tm). However, typing 5 lines when you could just type a pair of brackets does feel like an imposition – particularly when the 5 lines are 100% boring boilerplate.
The second issue is that, being a class, SomeFuncResult is comparatively expensive: it costs a smidge more in both memory and processor time than just a list or a tuple of values. I’m ranking this behind the typing costs because most of the time that increment of cost doesn’t matter at all: if you’re dealing with a few hundred or even a few thousand of them at a time, the costs for spinning up new instances of SomeFuncResult just to hold data are going to be invisible to users. However, if you are doing something more performance-intensive, the costs of creating a full mutable object can be significant in large numbers. As always, it’s wiser not to try to optimize until things are working, but this is still a consideration worth recalling.
The last issue is that SomeFuncResult can be changed in flight. Since it is a class, the data in a SomeFuncResult can be updated (for you CS types, it is mutable). This means some other piece of code that looks at the result object in between some_function and you might decide to mess with the results. That can be a feature or a bug depending on how you want to code it – but since Python does not have a built-in mechanism for locking fields in an object, you’d have to put in extra work to make sure the results didn’t get changed by accident if keeping the data pristine was mission-critical. You can use a property decorator to make a fake read-only field:
class SomeFuncResult(object):
    def __init__(self, node, attr, val):
        self._node = node
        self._attribute = attr
        self._value = val

    @property
    def node(self):
        return self._node

    @property
    def attribute(self):
        return self._attribute

    @property
    def value(self):
        return self._value
Alas, our 5 lines of boilerplate have now blossomed into 16. Our quest for clarity is getting expensive.

One common way to get around the hassles – or at least, the typing costs – of custom return objects is simply to use dictionaries instead. If you use the Perforce Python API you’ll be quite familiar with this strategy; instead of creating a class, you just return dictionaries with nice descriptive names:
for node in object_list:
    for attrib in attrib_list:
        settable = is_attrib_settable(node, attrib)
        if settable:
            new_value = dict_of_defaults[attrib]
            results.append({'node': node, 'attribute': attrib, 'value': new_value})
return results

Like a custom class this increases readability and clarity; it’s also future proof since you can add more fields to the dictionary without messing with existing data.
Even better, dictionaries – unlike classes – are self-describing: in order to understand the contents of a custom result class like SomeFuncResult you’ll have to look at the source code, whereas you can see the contents of a result dictionary with a simple print statement. Dictionaries are slightly cheaper than classes (there is a good workaround to speed up classes, but it’s something you have to write and maintain). And, of course, dictionaries have minimal setup costs: they are boiler-plate free.
This doesn’t mean they are ideal for all circumstances, however.
The Achilles’ heel of using dictionaries is keys, which are likely to be strings. Unless you are very disciplined about using named constants for all your result dictionaries you’ll inevitably find that somebody somewhere has typed attribite with an extra i instead of a u, and suddenly perfectly valid, impeccably logical code is failing because nobody thought to look at the key names. Instead of typing lots of setup code once, you’ll be dribbling out square brackets and quotes till the end of time, with lots of little missteps and typos along the way. While that’s not an insurmountable problem it’s another annoyance.
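Here's a small sketch of what that discipline might look like. The constant names and the make_result helper are hypothetical, purely for illustration. A mistyped string key only blows up when that exact line runs, whereas a mistyped constant name is the kind of thing a linter or IDE can flag before the code ever runs.

```python
# hypothetical key constants; the names are illustrative only
NODE = "node"
ATTRIBUTE = "attribute"
VALUE = "value"


def make_result(node, attrib, value):
    # build the result dictionary in exactly one place
    return {NODE: node, ATTRIBUTE: attrib, VALUE: value}


result = make_result("pCube1", "tx", 33.0)

# the typo'd string key is perfectly legal Python; it fails only at runtime
try:
    result["attribite"]
    typo_caught = False
except KeyError:
    typo_caught = True

# using the constant means there is only one place to misspell the key
attribute_name = result[ATTRIBUTE]
```

It works, but you can see the boilerplate creeping back in: the constants are extra setup that dictionaries were supposed to save you.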

Not so scary when you know the secret!

Return of the namedtuples

Luckily there is yet another – and for most purposes better – way to return complex results — one that is both flexible and self-describing. namedtuples are part of the python standard library and they offer a clean, simple way to create lightweight objects that have named properties – like classes – but require almost no setup: you can create a new type of named tuple with a single line of code, and then use it like a lightweight (and immutable) class.
A namedtuple is just a Python tuple that can also use names to access its own fields. For example:
from collections import namedtuple

# create a namedtuple called 'SomeFuncRes' to hold nodes, attributes and values
SomeFuncRes = namedtuple("SomeFuncRes", "node attribute value")

# make an instance
example = SomeFuncRes('pCube1', 'tx', 33.0)
# Result: SomeFuncRes(node='pCube1', attribute='tx', value=33.0)
As you can see, namedtuples are even easier to ‘read’ than dictionaries when printed out. However, namedtuples give you dot-access to their contents.
print example.node
# pCube1
This saves a few characters: result.node beats result['node'] – but, more importantly, it offers far fewer opportunities for mistyped keys or open quotes.
However, namedtuples can also use old-fashioned indexed access too:
print example[0]
# pCube1
And you can even iterate over them if you need to, since a namedtuple is in the end just a slightly fancier tuple:
for item in example:
    print item

# pCube1
# tx
# 33.0
Namedtuples are easy to instantiate: You can create them using index ordering, names, or **keyword arguments. Names tend to be better for clarity, but if you’re expanding the results of other functions like zip() indices and double-starred dictionaries can be very handy. Having all three options allows you to create them in the most appropriate way.
print SomeFuncRes('pSphere1', 'ry', 180)
# SomeFuncRes(node='pSphere1', attribute='ry', value=180)
print SomeFuncRes(value = 1, node = 'pCube1', attribute = 'tz')
# SomeFuncRes(node='pCube1', attribute='tz', value=1)
from_dict = {'node':'pPlane1', 'attribute':'rz', 'value':40.5}
print SomeFuncRes(**from_dict)
# SomeFuncRes(node='pPlane1', attribute='rz', value=40.5)
Unlike classes or dictionaries, namedtuples are immutable; that is, they are read-only by default. This is usually a Good Thing(tm) for a result object, since data changing in mid-flight can lead to subtle bugs that may be very hard to reproduce. Immutability also makes them cheaper: they don’t require Python to do as much setup behind the scenes when they are created, which can be significant in large quantities. They usually take up less memory as well.
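A quick sketch of what that immutability looks like in practice, reusing the SomeFuncRes type from above (the variable names are just for illustration). Assigning to a field raises an AttributeError, and the standard _replace() method hands back a modified copy while leaving the original untouched.

```python
from collections import namedtuple

SomeFuncRes = namedtuple("SomeFuncRes", "node attribute value")
original = SomeFuncRes("pCube1", "tx", 33.0)

# direct assignment is refused: namedtuple fields are read-only
try:
    original.value = 0.0
    was_mutated = True
except AttributeError:
    was_mutated = False

# _replace() builds a new namedtuple with the changed field
updated = original._replace(value=0.0)
```

So if some downstream code really does need a tweaked result, it has to make its own copy, and everybody else holding the original never notices.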
This combination of features is tough to beat in a cheapo data-only class. If for some reason you need to upgrade to a real class instead, you probably won’t even need to change the code which reads your namedtuples: Python doesn’t care if result.node is a namedtuple field or a regular object field. For all these reasons, namedtuples are a great little tool for a lot of common data-bundling jobs. No strategy fits every battle, but namedtuples are an excellent – and often overlooked! – way to manage this very common (albeit not very interesting) problem and to keep your overall toolkit cleaner, more robust and easier to maintain.
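As a sketch of that upgrade path: the describe function below is a hypothetical piece of ‘reader’ code, and it behaves identically whether it is handed a namedtuple or a hand-written class with the same field names.

```python
from collections import namedtuple

SomeFuncRes = namedtuple("SomeFuncRes", "node attribute value")


class SomeFuncResult(object):
    """A 'real' class exposing the same field names as the namedtuple."""

    def __init__(self, node, attribute, value):
        self.node = node
        self.attribute = attribute
        self.value = value


def describe(item):
    # reader code relies only on the field names, not on the concrete type
    return "%s.%s = %s" % (item.node, item.attribute, item.value)


from_tuple = describe(SomeFuncRes("pCube1", "tx", 33.0))
from_class = describe(SomeFuncResult("pCube1", "tx", 33.0))
```

The one caveat is that any code which indexes or unpacks the results (item[0], or node, attr, val = item) would need attention after a switch like this, which is one more reason to prefer the names.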

Sunday, July 12, 2015


The wrap up

The beauty of working with code, even really simple code, is that you can build your own little universe out of bits and pieces contributed by thousands of other people – all without paying a dime or even asking them for help. From sharing a script off of CreativeCrash to downloading a huge open-source behemoth like Apache, any reasonably plucky individual can today make stuff that actually involves the work of thousands of anonymous others. It’s really quite a remarkable evolution in human history that so many people voluntarily give away their work for nothing, and (whatever else you can say about the internet era) it’s something to be proud of participating in.
On the other hand…
Well, say you were an Amish farmer and all your neighbors showed up to help you raise your barn: you’d certainly be grateful. But you might still be pretty annoyed if Hans from next door hung your barn doors so they stuck in the summer heat. Maybe old Hans worries more about keeping the barn warm than you do, so he prefers a tight seal: but that’s small comfort when you’re heaving on that handle in a muggy Pennsylvania morning.
barn raising

The internet abounds in excellent – and, amazingly, free – tools to help make your life easier. But they all started life as tools to make somebody else’s life easier. If your needs don’t line up perfectly with the needs of the original author, you’re likely to get a little gereizt.

The fact is that nobody writes all their own stuff: we all use other people’s code all the time (and, as sharing becomes more and more ingrained in coding, that’s only going to increase). All that sharing means that we constantly have to work with libraries and APIs that are useful and free and for which we know we should be grateful… but – like that sticky barn door – they drive us absolutely bonkers.

Wrap up

Not surprisingly, almost everybody ends up writing wrappers: code to help ease those nice-but-imperfect tools and APIs into something that feels a little more natural. If you spend a lot of time on TAO or coding forums where people swap tips and advertise their wares you’ll see a huge variety of wrappers for all sorts of tasks: indeed, the wrappers often seem to outnumber the actual functional bits. Whether you call the job making things ‘more pythonic’ or ‘more functional’ or ‘cleaner’, it’s something we all feel compelled to do (and to share) from time to time.
It’s also easy to get cynical about wrappers. You see so many – and so many of them just taste-driven syntactic variations on each other – that veteran coders often reflexively shrug and ignore them. This is particularly true in Python land, where the malleability of the language encourages a certain degree of experimentation and re-casting. Because you can adapt Python to suit your tastes, the temptation to do so even when it’s not actually getting you much beyond style points is hard to resist.
The net result of all this customization and adaptation is messier than Christmas morning: wrappers everywhere. Whatever simplifications each individual wrapper gives you, the aggregate effect of so many different extra layers is overwhelming. At several times in the last decade I’ve sworn off wrappers and vowed to stick with vanilla python, straight-up maya.cmds and simple, linear code. A good code archaeologist could troll through my history and find several repeated periods of growth and die-offs in the wrapper ecosystem, like fossils trapped in shale.
Where's pymel in there?


Wrappers, though, never really die off like the dinosaurs: they are, in fact, as persistent as the cockroaches. And there’s a lesson in that.
Consider a classic case of wrapper-itis: a system for making Maya GUI less of a pain. Everybody writes that one at some point in their TA career (I’ve done it 4 times to my certain knowledge, not counting one-offs and half-assed abandonware). When somebody feels compelled to spruce something up that much it’s a sign.
Sure, most gui wrappers are just a reaction to the clunky, wordy way that Maya expects us to pop up a window or make a button. And sure, most of those wrappers (some of my own, I hasten to add) really aren’t much better: they’re just shortcuts that cut down on the carpal-tunnel of cmds.textField(fieldname, q=True, text=True).
Sure, saving keystrokes is nice, but over the life of a piece of code the time spent typing is a tiny fraction of that spent reading, debugging and refactoring: that you could (and probably should) just bite the bullet on. But so many persistent, repeated efforts to fix a problem are a symptom that something worse than wordiness is at work. Wrapper-itis really runs rampant when the toolkit is simply not adequate to the job at hand. If you have to spend a lot of time thinking about the implementation details instead of the problem you really want to solve you’re not just wasting keystrokes: you’re wasting precious thought and time.
So I’ve been trying to soften my anti-wrapper stance. Sometimes it’s better to actually solve a recurring problem instead of papering it over; sometimes it’s worth taking the time to be in a position to write the code you need to write instead of the code you’re forced to write. Sometimes.
Which of course raises the question of how you can identify those situations and distinguish between a real need for better abstractions and a plain old peevish desire to avoid boilerplate.


The prime way to distinguish between a ‘wrappable’ problem and a purely syntactic one is to consider the needs of the person who’ll be picking through your code after you’ve been run over by a bus.

When your replacement comes to look at your code, will they see something that seems to clearly express the problems you were trying to solve? Or just code that clearly expresses your preferences for a particular set of formatting options and code idioms?
Here’s a little bit of code that reads some information from a database in order to add some ‘credits’ to a time account:
def replenish(user):
    if user is None:
        return False

    with connect_db() as db:
        repl = db.execute("SELECT replenished FROM users WHERE name LIKE ? AND DATE (replenished) <  DATE ('now')", (user,))
        recent = repl.fetchone()

        if recent:
            daynum = db.execute("SELECT strftime ('%w', 'now')").fetchone()[0]
            daynum = int(daynum)
            repl_amount = db.execute(
                "SELECT sun, mon, tues, weds, thurs, fri, sat FROM replenish WHERE users_name LIKE ?", ( user,))
            refresh = repl_amount.fetchone()[daynum]
            cap_amount = db.execute("SELECT cap, balance FROM users WHERE name LIKE ?", (user,))
            cap, balance = cap_amount.fetchone()
            new_balance = min(cap, refresh + balance)

            db.execute("UPDATE users SET balance = ? , replenished = DATE('now') WHERE name LIKE ?", (new_balance, user))
            log(db, user, "replenished with %i credits" % new_balance)
The basic logic is pretty simple. Stripped of all the fluff, you merely need to:
  • connect to the database
  • ask the database the last time the user was topped off
  • if the user hasn’t been replenished today, get the amount due
  • add the amount to the user’s account

That’s just four basic ideas, but it takes more than 20 lines to express them.
Far worse, the key logical linkages of the operation are implied, not stated. For the code to make real sense you need to know or deduce that the users table has a field called replenished which stores the last day when the user was topped off; that the ‘replenish’ table has seven fields containing the top-off numbers, arranged Sunday through Saturday; and that the users table stores both the maximum number of credits to store and the current balance of credits. The implementation of our simple, 4-step idea only makes sense with all of that special knowledge. It’s further obscured by time-saving shortcuts, like using the actual column index in a database table to check today’s value. That may save a couple of lines but it renders the code even harder to parse. And, of course, there are syntax quirks big and small, particularly relating to the creation and formatting of the SQL.
This code works fine; it’s even fairly economical and readable for what it does (for a given value of ‘economical’). But it’s not the kind of thing you’d ever want to inherit; it makes sense to me, because I wrote it and I remember (at least today) what I was thinking about when I did. But some future inheritor (heck, even me a year from now) will have to think long and hard about what really ought to be a simple process. The whole thing is bogged down in implementation details that obscure the intent of what’s going on. Really good code often reads almost like pseudo-code. This does not.
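As a small aside on the column-index shortcut: even without a wrapper library, the standard sqlite3 module can make that lookup say what it means. This is only a sketch. The replenish table layout follows the description above, but the DAY_COLUMNS tuple, the amount_due helper and the sample row for 'fred' are made up for illustration.

```python
import sqlite3

# Column names in the same Sunday-first order that strftime('%w') uses (0 = Sunday).
DAY_COLUMNS = ("sun", "mon", "tues", "weds", "thurs", "fri", "sat")


def amount_due(db, user, daynum):
    """Return the top-off amount for `user` on weekday `daynum`, or 0 if unknown."""
    row = db.execute(
        "SELECT * FROM replenish WHERE users_name = ?", (user,)
    ).fetchone()
    if row is None:
        return 0
    # sqlite3.Row allows lookup by column name instead of by position
    return row[DAY_COLUMNS[daynum]]


# Hypothetical in-memory data, just enough to exercise the helper.
db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row
db.execute(
    "CREATE TABLE replenish (users_name TEXT, sun INT, mon INT, tues INT, "
    "weds INT, thurs INT, fri INT, sat INT)"
)
db.execute("INSERT INTO replenish VALUES ('fred', 0, 5, 5, 5, 5, 5, 10)")

print(amount_due(db, "fred", 6))  # Saturday -> 10
```

The lookup is no shorter than indexing the tuple directly, but the day names now appear in the code instead of living only in the author's head.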
To illustrate what a good wrapper can do, here’s the same code using an ‘object relational mapper‘ called peewee: it’s a wrapper around the SQL backend that maps database operations onto classes and allows you to focus on the logic instead of the mechanics:
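def replenish(user):
    if user is None:
        return
    with connect_db().atomic():
        today = datetime.now().date()
        today_name = today.strftime("%A")

        # a user who has not been topped off yet today, and a non-zero amount due today
        updatable_user = User.get_or_none(User.name == user, User.replenished < today)
        today_update = Replenish.get_or_none(Replenish.name == user, getattr(Replenish, today_name) > 0)
        if updatable_user and today_update:
            refresh = getattr(today_update, today_name)
            new_balance = min(updatable_user.cap, refresh + updatable_user.balance)
            updatable_user.balance = new_balance
            updatable_user.replenished = today
            updatable_user.save()  # write the new balance back to the database
            Log.create(user=user, message="replenished with %i credits" % new_balance)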
That’s a significantly cleaner bit of code to read. It still requires some outside knowledge but the intention is much more clearly expressed and the message isn’t drowned out in quotes and parens. An ‘offscreen’ benefit, given the way peewee is structured, is that backtracking to the User and Replenish classes would tell the rest of the story pretty straightforwardly without a ton of comments. Only a handful of lines are needed to munge data into the right forms, and the code almost reads like the summary.
That’s a good example of how wrappers can help: saving keystrokes is nice but clarifying the real meaning of the code is priceless.


Well, maybe not exactly price-less. All wrapper code comes with a cost: there are new rules to learn and, probably, new bugs to encounter. If the wrapper uses odd conventions, unusual data formats or is simply slower than hand rolled code it may still be a bad bargain. Nonetheless, this example shows wrappers can be more than just a protest against awkward syntax and APIs that don’t match your taste. Ultimately wrappers are a perfect microcosm of what all coding is about: the search for a clearer understanding of the problem you’re trying to solve.
So if you’re thinking about writing a wrapper, ask yourself this: does the code you want to write teach you something about the problem you’re solving? Or does it just save you a few keystrokes? Typing is a pain, but you’ll spend a lot more time looking at your code than you ever will typing it. So don’t focus on just counting lines or syntax: focus on whether the wrapper helps you understand the problem better. If the wrapped code reads like a description of your thought process, you’re on the right track. If it’s just getting you back to that TwitchTV stream on your second monitor a few minutes earlier it might not be worth your time.


I used an ORM for my example because it provides such a powerful example of code that’s not bogged down in syntactic complexities. There is, however, a classic internet flame war about ORMs that I’m glossing over, with nerd rage aplenty for friends and foes of ORMs. Background here if you care.

Saturday, June 6, 2015

Porting Spelchek to Boo

What could be more ghostly than a post mortem?

If my last post about Boo piqued your interest, but you haven’t had time to do a deep dive into the language to see for yourself, I’ve posted a version of the Spelchek Python spell checker module converted to Boo so you can see the similarities and differences between the two languages.
The original Python version is here and the Boo port is here. As a good indication of what I’ve been saying about the economy of Boo syntax, the Boo version comes in at almost the same size as the Python original (5.05 kb for Boo and 4.95 kb for Python) and pretty much the same number of lines – I haven’t done the exercise of converting it to C# for comparison but I’d guess the C# version would come in at about half again as much typing.
Looking at the code, significant chunks are almost identical: the logic is pretty much the same and the type annotations are the only real difference.
# Boo
def add(word as string, pri as int):
    """Adds <word> to the dictionary with the specified priority."""
    _DICTIONARY[word.ToLower()] = pri

# Python
def add(word, priority=4):
    """Adds <word> to the dictionary with the specified priority (default is 4)"""
    _DICTIONARY[word.lower().strip()] = priority
which is pretty much identical.
The tricky bit of the conversion was the routine which generates possible variants of a word through deletions, transpositions, replacements and insertions. In Python:
def first_order_variants(word):
    """return the obvious spelling variants of <word> with missing words, transpositions, or misplaced characters"""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits for c in _ALPHABET if b]
    inserts = [a + c + b for a, b in splits for c in _ALPHABET]
    return set(deletes + transposes + replaces + inserts)
As you can see the first list comprehension, splits, generates a list of pairs representing every place where the word could be broken up, so that ‘cat’ produces [("", "cat"), ("c", "at"), ("ca", "t"), ("cat", "")]. The other comprehensions use that list to try inserting, deleting, replacing or transposing letters to guess what the user might have really been typing.
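To make the intermediate lists concrete, here is a small sketch that runs the first three comprehensions on ‘cat’. The helper name splits_of is invented for the example; the comprehensions themselves are the ones from the routine above.

```python
def splits_of(word):
    """Every way to cut <word> in two, including an empty head or tail."""
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]


splits = splits_of("cat")
print(splits)
# [('', 'cat'), ('c', 'at'), ('ca', 't'), ('cat', '')]

# drop the first letter of each tail
deletes = [a + b[1:] for a, b in splits if b]
print(deletes)
# ['at', 'ct', 'ca']

# swap the first two letters of each tail
transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
print(transposes)
# ['act', 'cta']
```

The replaces and inserts lists work the same way, but with one entry per letter of the alphabet for each split, so they are much longer.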
In Boo, the tricky bit was getting the compiler to recognize that the splits list contained a pair of strings and that all the lists produced by it would also be lists of strings. Porting the Python code directly wouldn’t work because Boo would see splits as a list of type object instead of deducing that it was a set of string pairs.
Here’s the Boo version, which as you can see is recognizably the same but is clunkier than the Python, due to the need for typing:
def first_order_variants(word as string):
    """return the obvious spelling variants of <word> with missing words, transpositions, or misplaced characters"""
    _stringList = Boo.Lang.List[of (string)]
    _strings = Boo.Lang.List[of string]
    pair = {w as string, i as int | (w[:i] cast string, w[i:] cast string)}
    splits = _stringList((pair(word, i) for i in range(len(word) + 1)))
    deletes  = _strings((a + b[1:] for a as string, b as string in splits if b))
    transposes  = _strings((a + b[1] + b[0] + b[2:] for a as string, b as string in splits if len(b) > 1))
    replaces  = _strings((a + c + b[1:] for a as string, b as string in splits for c in _ALPHABET if b))
    inserts  = _strings((a + c + b for a as string, b as string in splits for c in _ALPHABET))  

    result = HashSet[of string]()
    for chunk in (deletes, transposes, replaces, inserts):
        result.UnionWith(chunk)

    return result
To clean it up I added two ‘aliases’ up at the top, since the Boo syntax for declaring typed containers is hard to read (‘List[of string]’): so _stringList is a shortcut for ‘list of string arrays’ and _strings is a shortcut for ‘list of strings’.
The variable pair contains a lambda (ie, an inline function) using Boo’s idiosyncratic syntax: you could mentally rewrite it as
def pair(w as string, i as int) as (string):
    return (w[:i], w[i:])
or in other words “give me a string and an integer, I’ll return a pair of strings split at the index you gave me.”
With those helpers in place the logic is identical, but it is harder to follow because of all the type-mongering. I’m pretty sure there are more elegant ways to do this without being so wordy, but I’m not an expert.
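For what it’s worth, Python can carry the same kind of type information through optional annotations, and the aliasing trick translates almost directly. This is only a rough analogy, not a port: Python does not enforce these hints when the code runs, and the names StringPairs, Strings and deletes_of are invented for the sketch.

```python
from typing import List, Set, Tuple

# Aliases playing the role of _stringList and _strings in the Boo version.
StringPairs = List[Tuple[str, str]]
Strings = List[str]


def pair(w: str, i: int) -> Tuple[str, str]:
    """Split <w> at index <i> and return both halves."""
    return (w[:i], w[i:])


def deletes_of(word: str) -> Set[str]:
    """Only the 'deletes' step of the variant routine, with annotations added."""
    splits: StringPairs = [pair(word, i) for i in range(len(word) + 1)]
    deletes: Strings = [a + b[1:] for a, b in splits if b]
    return set(deletes)


print(sorted(deletes_of("cat")))
# ['at', 'ca', 'ct']
```

The annotations only pay off if you run a separate type checker over the code, whereas the Boo compiler insists on them up front.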


The point of the experiment was to see how hard the Python > Boo translation would be. This is an application where types actually matter a good deal, since all my values are strings and I need to be able to do string operations like joins on them – if all I was doing was asking questions of them things would have been more Pythonic (though probably slower as well: one of the reasons we need those types is to get the compiler to help us speed the code up).
While this is hardly a demanding application, it is at least a proof-of-concept that the idea of prototyping in Python and then selectively porting to Boo isn’t completely nuts.

Sunday, May 31, 2015

Boo Who?


Did I scare you?

Evidently somebody’s scared: the Boo language, which has been a part of Unity for several years, is going to be removed from the Unity documentation in favor of C#.

The reason is pretty simple, as this graph explains:

For a lot of Unity developers (99.56% of them, apparently) this is non-news; Boo never really garnered much of a following in the Unity community. For new developers and recent grads, C# is an easy and very well documented option; for former web devs moving to mobile apps, UnityScript feels JavaScript-y enough to ease into a new environment. Boo, on the other hand, never got much traction: it’s got a small but passionate community but it never garnered enough momentum to break out of its niche.

Boo Hoo

Now, I’m kind of a sucker for hopeless causes, so almost inevitably this news inclined me to revisit Boo, which I’ve toyed with a few times but never really tried to learn. I had to write a lot of C# for Moonrise and it made me long for the clarity and concision of Python. Even though C# is a perfectly capable language with lots of modern features (closures, first-class functions, etc.) it’s still very chatty. The tantalizing promise of Boo – not completely fulfilled, but pretty close – is that it combines both: the performance, runtime type safety, and intimate access to Unity that C# offers in a language not deformed by punctuation and rendered ridiculous by overly wordy syntax.

Here are the aesthetic differences in a nutshell, first in Boo and then in C#:


import UnityEngine

class JumpingMovementController(MonoBehaviour): 

 _HORIZ = 'Horizontal'
 _VERT = 'Vertical'
 _JUMP = 'Jump'
 _Momentum = 0.0
 _Gravity = 2.0
 public _Speed = 1.0
 public _JumpSpeed = 1.5

 def Update(): 
  frame_speed = _Speed * Time.deltaTime 

  if transform.position.y == 0 :
   _Momentum += Input.GetAxis(_JUMP) * _JumpSpeed

  up =  _Momentum * Time.deltaTime
  left_right = Input.GetAxis(_HORIZ) * frame_speed
  forward_back = Input.GetAxis(_VERT) * frame_speed
  transform.Translate(Vector3(left_right, up, forward_back), Space.Self)

 def LateUpdate():
  if transform.position.y > 0:
   _Momentum -= _Gravity * Time.deltaTime
  else:
   _Momentum = 0
   vp = Vector3(transform.position.x, 0, transform.position.z)
   transform.position = vp


using UnityEngine;
using System;

public class JumpingMovementController : MonoBehaviour
{
    const string _HORIZ = "Horizontal";
    const string _VERT = "Vertical";
    const string _JUMP = "Jump";
    float _Momentum = 0.0f;
    float _Gravity = 2.0f;
    public float _Speed = 1.0f;
    public float _JumpSpeed = 1.5f;

    void Update()
    {
        var frame_speed = _Speed * Time.deltaTime;

        if (transform.position.y == 0)
        {
            _Momentum += Input.GetAxis(_JUMP) * _JumpSpeed;
        }

        var up = _Momentum * Time.deltaTime;
        var left_right = Input.GetAxis(_HORIZ) * frame_speed;
        var forward_back = Input.GetAxis(_VERT) * frame_speed;
        transform.Translate(new Vector3(left_right, up, forward_back), Space.Self);
    }

    void LateUpdate()
    {
        if (transform.position.y > 0)
        {
            _Momentum -= _Gravity * Time.deltaTime;
        }
        else
        {
            _Momentum = 0;
            var vp = new Vector3(transform.position.x, 0, transform.position.z);
            transform.position = vp;
        }
    }
}

I just can’t shake the feeling that the first version is something I don’t mind reading and writing while the second is a chore. It’s also a whopping 45% more typing for the same result. And that delta only gets bigger if you want to try something a little more complicated: Boo offers the same list comprehension syntax as Python, so you can write:

    addresses = [(x,y) for x in range(3) for y in range(3)]

where in C# you’d either get 6 lines of for-loops and nested brackets, or you’d have to use LINQ. Even in the most compact form I can manage it’s still much wordier:

        var xs = Enumerable.Range(0, 3).ToList();
        var ys = Enumerable.Range(0, 3).ToList();
        var addresses = (from x in xs
                         from y in ys
                         select new Tuple<int,int>(x, y)).ToList();      

to get to the same place.
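For readers who haven’t met the comprehension syntax, here is what that one-liner expands to, sketched in Python since the Boo line is also valid Python as written. The longhand name is just for the comparison.

```python
# The comprehension from above: every (x, y) address on a 3 x 3 grid.
addresses = [(x, y) for x in range(3) for y in range(3)]

# The same thing spelled out as nested loops.
longhand = []
for x in range(3):
    for y in range(3):
        longhand.append((x, y))

print(addresses == longhand)  # True
print(addresses[:4])          # [(0, 0), (0, 1), (0, 2), (1, 0)]
```

The leftmost for clause is the outer loop, which is why y cycles fastest in the output.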

Why Boother?

A hardcore programmer might object that this is “all just syntax”. That’s true – but since my everyday experience of working with a language is about 90% syntax I don’t feel like it’s fair to dismiss that concern as if it were irrelevant. That said, it can’t be denied that modern C# 4 includes many language constructs that earlier versions of the language lacked: things like var inferred variables, lambdas, closures, and named and default arguments. These things all help make the code less chatty and more legible: if you’re a good C# programmer you can write very terse, expressive code that’s not absurdly wordy.

Apart from those stupid curly brackets.

On the other hand, the “culture” of the language was set in place before those kinds of features were added. The C# ethos definitely prefers the verbose to the understated, the extremely explicit to the implied. This isn’t a terrible decision – like Java, it’s a language designed to be used by huge teams of enterprise programmers working on titanic projects where only a few people see the whole project scope and most coders are locked away in cubicles on the 18th floor working on isolated modules to be used by other coders they will never meet. That’s why C#’s obsession with visibility management and highly-specified behavior makes sense. C# is a language that’s all about apportioning blame: it forces everything to be very explicit so you know which of the 6000 drones working on your enterprise app to blame when things go wrong.

In the context of a small game or a solo project, though, the Pythonic ethic of “we’re all adults here” seems more natural and productive. Besides, for hobby projects and one-offs fun is a totally legitimate concern: making minigames is something that gets crammed into nooks and crannies left over by work, kids and keeping the house from falling down around my ears. So fun is a totally legit criterion for selecting a language.

And Boo is definitely more fun than C#.

Boo Who?

Like many Pythoneers, I’ve always nursed a secret hunger for the magical button that would take all my tight, laconic Python code and magically make it perform like a “real” language. Boo is not the magic button, but it’s a pretty good preview of what that might look like. As you can see from the code samples above, it looks and feels a lot like Python but under the hood it has similar performance and compile-time constraints to C#: in other words, Boo code can run as much as 20X faster than Python.

That’s what makes Boo so tantalizing. It is almost Python, but you can write Unity games, Winforms apps, or even cross-platform DLLs with it. Plus, since Boo runs on the same dotnet CLR as C#, it runs on any platform with the DotNet framework or Mono installed, so a compiled Boo program can run on Windows, Macs, or Linux boxes. There’s even an interactive shell so you can do one-offs and experiment, just like Python. But – unlike Python – you get the performance gains that come from a compiled, statically typed language.

Typing and the compiler are the key difference between Boo and Python. The compiler makes sure that all of your variables, return values and method signatures line up and uses that knowledge to optimize the final runtime code for you. You can do this in Python:

fred = 1 
fred = fred +  1
print(fred)
# 2
fred = "fred"
fred = fred + " flintstone"
print(fred)
# fred flintstone

In Boo, however, you’ll get an error when you try to change fred from an integer value to a string:

fred = 1
fred = fred +1 
fred = "fred"
#ERROR: Cannot convert `string` to `int`

In old-school C#, this was made obvious by the fact that all variables had to declare a type:

int fred = 1;

In more modern C# you can use the var keyword to make the compiler guess what type you want based on the initial input: when you give it

var fred = 1;

it sees that fred has an integer in it, and treats fred as an integer from then on. If you assign the variable with the result of a method call or another variable, C# uses the expected type of that return value to set the variable type. Boo does more or less the same thing: it uses the assignment value to guess the type of a variable. You can specify it explicitly if you prefer by using the as keyword:

barney as string
barney = "barney"   #OK

The same syntax is used to specify inputs in methods and returns:

def bedrock (name as string) as string:
    return name + "rock"

def inferred_return_type(name as string):
    return name + "inferred"
    # if the compiler can guess the output type
    # you don't need to 'as' it

Once you get out of the habit of re-using variables with different types, this is usually not too bad: 95% of the time the inference “just works” and you can write code that focuses on logic and good design instead of worrying about the types. The other 5% of the time, however, is often very frustrating. It’s particularly tough when Boo’s Python-like, but not exactly Python behavior trips you up. Consider this little puzzle:

def sum (values as (int)): # expect an integer tuple
    result = 0
    for v in values:
        result += v
    return result

# works as expected for this case:
example = (1,2,3)
sum (example)
# 6

However it can be broken if your input list isn’t all of the same type:

example2 = (1,2,3,"X") 
sum (example2)
# ERROR: the best overload to the method sum((int)) is not compatible with the argument list '((object))'

That’s not entirely shocking: the compiler saw a mixed list in example2 and returned an array of type object (as in C#, object is the base class of all types). So it is right to reject that as an argument for an int-specific function. Here’s where it gets odd:

example2 = (1,2,3,4)
sum (example2)
# 10

This time the compiler “helpfully” casts that array to an array of ints because it can. This is not a crazy behavior, but it’s definitely going to raise some issues where test code works fine but production code contains bad values. The only way to be sure that things are what they seem is to force them into the right types at declaration time:

example3 as (int) = (1,2,3,4,5)
sum (example3)
# 15

example3 = (1,2,3,"one hundred")
# ERROR: Cannot convert `(object)` to `(int)`

This example highlights both the usefulness and the limitations of type inference: if you want a statically typed language (and all the compiler tricks that make it speedier than Python) you do have to think about how to manage your types. There’s no way around it. If you’ve got enough C# experience, you can look at Boo as a neat way of writing speedy, statically typed code with less typing and more syntactic freedom - but if you’re looking at it from the standpoint of a loosey-goosey Pythonista it can seem like a lot of hurdles to jump.
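For contrast, here is roughly what the same puzzle looks like from the Python side of the fence. It is a sketch, with the function renamed to total so it doesn’t shadow Python’s built-in sum: nothing complains when the mixed tuple is created or passed in, and the problem only surfaces when the loop actually reaches the string.

```python
def total(values):
    """Add up the values, with no declared types anywhere."""
    result = 0
    for v in values:
        result += v
    return result


print(total((1, 2, 3)))
# 6

try:
    total((1, 2, 3, "X"))
except TypeError as err:
    # the first three additions succeed; only the fourth one fails
    print("failed at runtime:", err)
```

That is the trade in a nutshell: Boo refuses the call before the program ever runs, while Python lets it through and fails partway into the loop.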

My (unscientific) impression is that a lot of people from the Python world come over to Boo and the familiar look of the code gives them a false sense of security. It’s easy to write simple bits of code until the subtleties of type management bite you in the behind, and then to give up in frustration when things seem to get cluttered and uptight.

It is, however, part of the territory: lots of other tools for speeding up Python, such as Cython, expect the same kind of attention to variable types. Here’s a sample from Cython:

def f(double x):
    return x**2-x

def integrate_f(double a, double b, int N):
    cdef int i
    cdef double s, dx
    s = 0
    dx = (b-a)/N
    for i in range(N):
        s += f(a+i*dx)
    return s * dx

which is just as finicky as C# or Boo.
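To see exactly how much of that sample is type bookkeeping, here is the same routine with the declarations stripped out, which leaves ordinary Python. This is my own reduction of the sample above rather than anything from the Cython docs.

```python
def f(x):
    return x ** 2 - x


def integrate_f(a, b, N):
    """Left Riemann sum of f over [a, b] using N slices."""
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx


# The exact integral of x**2 - x from 0 to 1 is -1/6.
print(integrate_f(0.0, 1.0, 1000))
```

The logic is untouched; the only things removed are the double and int annotations and the two cdef lines.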

For me, at any rate, spending more than a year doing C# as a regular part of work made fiddling around with Boo much easier and more productive. The type management hassles strike me as inevitable, or even natural, after a year of typing verbose C# variable types. On the other hand the cleanliness of the layout, the lack of extraneous punctuation, and the cleanliness of list comprehensions and Python-style loops never gets old.

While there are plenty of minor gotchas, and a few important high-level rules that can’t be forgotten, Boo development flows with an almost Pythonic fluency. If you put in the time to figure out the type inference behavior and add the annotations, you can get code that’s almost as flexible as Python and almost as performant as C# – which, for my kind of pet projects, is a great compromise.

Boo-ty is in the eye of the beholder

TL;DR: I’ve gotten pretty fond of Boo. Above all, it serves me well for noodling around in Unity where the API is mostly identical but the logic is cleaner, shorter and easier to read than the same code in C#. Translating the docs is rarely more than trivial, and the very narrow scope of a typical Unity code chunk keeps me from any of Boo’s rough edges.

Another hurdle for many Pythonistas, though one which does not matter in the context of Unity games, is the lack of the Python standard library. About 70% of what you can do with the ‘batteries included’ in Python can, however, be replicated using the dotnet Base Class Library if you’re running Boo on a Windows box (on Linux or OSX the percentage is lower: Mono has its own base class library but it’s not a complete replica of the one from Microsoft). For many tools tasks and projects, this is more than enough: you’ll be able to read and write XML, to parse JSON, to talk to an http server and so on, although the function names and APIs will vary. I have to admit I prefer the Python toolkit to the dotnet one, which reflects the same bureaucratic mindset that I dislike in C#’s design, but it’s still a big, robust set of tools. You can also use anything that’s available as a dotnet DLL. Almost anything advertised as usable with C# will work with Boo.

All that said, I’d definitely think twice before basing a commercial Unity project or a critical pipeline component on Boo. There does seem to be a small but measurable performance penalty compared to C# (the performance is, however, pretty much on par with that of UnityScript). More importantly, Boo’s biggest weakness is documentation: with a small community and (from now on) no docs on the Unity site, finding your way around in the language at first is pretty awkward. The documentation is a sporadic, volunteer effort with some glaring holes – it doesn’t help that Google still sends you to the moribund Boo site on codehaus instead of the current docs, which are in a GitHub wiki. The language is officially at version and hasn’t incremented in a long time: it’s still getting commits from the original author and a few other devs but it’s a much smaller project than, say, IronPython. In short, it’s a cool language that has not found its audience yet, and unless it does it will remain a niche player.

Still, it’s pretty cool too. If, after those caveats, it still sounds interesting, you’ll be relieved to know that Boo is not really ‘going away’: for the foreseeable future, the language will still work in Unity. Boo, like C# and UnityScript, runs on Mono, much as Java runs on the JVM. Unity doesn’t distinguish between the source of Mono assemblies: you can still use Boo, and even more exotic dotnet languages such as F# (though not, alas, IronPython!) in Unity today. The only practical result of Unity’s decision to sunset Boo support is the disappearance of the Boo documentation from the Unity website – which, to be honest, was rarely adequate – and the lack of a ‘create Boo script’ menu item. Dropping a Boo script into your assets folder, however, still creates runnable code, and it should continue to do so for the foreseeable future.

There’s some question about how Unity’s new cross-platform compiler technology, IL2CPP, will handle Boo. In principle, since it compiles the byte code produced by Mono rather than the original source, it too should work with any CLR language, be it Boo or F# or what have you. I’ve been able to compile Boo code to Unity WebGL games, which use IL2CPP, without obvious problems although I haven’t done anything like a scientific test. It’s not beyond belief that bugs which occur only in non-C#, non-UnityScript IL code may go unfixed. And, of course, it’s impossible to say what will happen after Unity 5 – technology, particularly in games, moves too fast for confident future predictions. However, it seems pretty clear Boo will be working in Unity for a while to come even though it is being demoted from “officially supported” status to the same kind of l33t hacker underworld as functional languages.


If you’ve got Unity installed already, you’ve already got everything you need to play with Boo. Just create a text file with a “.boo” extension inside a Unity project and you can write Unity components in Boo. If you don’t have Unity, you can also download Mono directly, which installs MonoDevelop and Boo automatically.

If you’re not fond of MonoDevelop – an editor for which I have… mixed… feelings – you can write Boo using Sublime Text, which has a Boo syntax highlighting package and can run Boo compiles for you.

If you’re curious but don’t want to take the plunge, you can see the language for yourself and play with it online, using this interactive REPL.

The documentation – which (be warned!) is incomplete and not always up to date – is in the Boo Project GitHub wiki. There’s an older site at boo.codehaus.org which tends to show up in the Google results but has mostly been ported to GitHub. In cases of conflicting information, the GitHub wiki is likelier to be right. There’s also a Google Group and a small pool of questions on StackOverflow.

If you’re a hardcore type, you can also download and rebuild the source for the entire Boo language yourself from GitHub. Lastly, you might want to check out BooJS, a project which aims to compile Boo into JavaScript.