Sleeping Cyborg

Jonathan David Page talks about whatever he happens to be thinking about. Sometimes other people join in.

Email · @parathetic (Twitter) · @jdpage (Github)



A Layperson's Guide to the Heartbleed Bug

9 April 2014

At this point, a lot of you have probably heard about the Heartbleed OpenSSL bug, which has come into the public eye to the point where the BBC reported on it. The implications have been fairly well-covered in the media: disclosure of passwords, disclosure of private keys, etc. In other words, Bad Stuff. This article aims to cover the technical details of the bug in a manner suitable for non-technical readers. It may get dry. However, no previous knowledge of computers should be required.

Read more…

Some Finals Week Thoughts

12 December 2013

It’s finals week, a magical time of stress and brain death. The irresistible draw of the blogosphere calls to me, however, so here I am, writing a lazy list post consisting of some thoughts I’ve been having lately. You’ve probably already seen most of these if you follow me on Twitter. Oh well.

Read more…

Look ma, I’m a writer! & my glorious return to blogging.

1 November 2013

Regarding the "posting links regularly" thing I mentioned in the last post, I need to make a remark:




hahaha. ha.

Okay, now that that's over with, I can get down to business.

Read more…

Digest for 5 April 2013

5 April 2013

Today marks 50 years until first contact with the Vulcans.

The Mozilla JavaScript team posted a really interesting article explaining how the SpiderMonkey engine works and what they just did to make it better.

I finally got around to reading an article which turned out to be one of the better analyses of the social media phenomenon. I call it an analysis because it doesn't say "social media is a Bad Thing", like a lot of the more sensational article-writers (including myself, at times) do. It talks more about how some patterns of social media are Bad Things, which is more constructive since it can lead to ways to fix those patterns.

Bad Catholic published a guest post with perhaps the single best explanation of the Catholic obsession with the Virgin Mary that I've ever read.

And finally, I've decided that I am going to start posting digests (like this one) with links to interesting articles and some remarks on them.

Approximating Pi Redux

20 December 2012

No, it isn’t Pi Day, or anything resembling it. I was editing some stuff, and I noticed the original “Approximating Pi” article. I’d been meaning to rewrite the code since I learned about continued fractions, and since I didn’t have anything better to do, I decided to do so. (This is a lie. I do, in fact, have a number of things to do that various people might define as “better”, but I didn’t feel like doing any of them. I felt like sitting around in my bathrobe and writing Python.) A detailed explanation follows the source code, since this version is less mathematically facile than the previous one.
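For flavor, here is a quick sketch of the continued-fraction approach (my own illustration, not the code from the post): the convergents of pi's continued fraction include the familiar approximations 22/7 and 355/113.

```python
from fractions import Fraction
import math

def convergents(x, n):
    # Standard continued-fraction expansion: peel off the integer part,
    # invert the fractional remainder, repeat for up to n terms.
    terms = []
    for _ in range(n):
        a = math.floor(x)
        terms.append(a)
        frac = x - a
        if frac == 0:
            break
        x = 1 / frac
    # Fold each prefix of terms back into a single fraction.
    results = []
    for i in range(1, len(terms) + 1):
        f = Fraction(terms[i - 1])
        for a in reversed(terms[:i - 1]):
            f = a + 1 / f
        results.append(f)
    return results

# First few convergents: 3, 22/7, 333/106, 355/113
print(convergents(math.pi, 4))
```

Each convergent is the best rational approximation of pi among all fractions with a denominator that small, which is why 355/113 is so absurdly good for its size.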

Read more…

A Simple(ish) Explanation of Haskell Function Signature Madness

22 August 2012

A clarification regarding my use of the word "simple(ish)": I am assuming, firstly, that you are already comfortable with programming in an imperative language such as C, Java, or Python, and secondly, that you already know at least a little Haskell.

Haskell type signatures, to the uninitiated, are a little odd. Take the following simple two-argument function:

plus :: Int -> Int -> Int
plus a b = a + b

If you're coming from an imperative language, you might be tempted to read that signature as "this function takes an Int and an Int and returns an Int". That'll do in most cases, but it isn't really true. And the fact that it isn't true is a really cool feature of Haskell.

The little -> arrow is a right-associative operator, so you should read Int -> Int -> Int as Int -> (Int -> Int), which doesn't seem to help at all, because now it looks even worse. And here is the kicker: all functions in Haskell take exactly one argument. One. Even our plus function there. Which is odd, because it sure looks like it takes two arguments. The thing is, Haskell does a little magic trick for us (which isn't really magic). This magic is called currying.

The trick is that when you do plus 2 3, two things happen. First, 2 is applied to plus. (Application is a fancy way of saying that an argument is given to a function.) This application results in a new function, which one might write as (plus 2), and which also takes one argument. Then 3 is applied to that new function, which returns an Int: 5.

In short, plus 2 3 is the same as (plus 2) 3. So the Int -> (Int -> Int) means that plus is a function which takes an Int, returning a function which takes an Int, returning an Int.
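If it helps, the same trick can be sketched in an imperative language. Here is a hand-curried plus in Python (an illustration only, obviously not how Haskell actually implements it):

```python
# Currying by hand: each function takes exactly one argument,
# the way every Haskell function does.
def plus(a):
    def plus_a(b):
        return a + b
    return plus_a

# plus(2) is the "half-called" function (plus 2); applying 3 finishes the job.
add_two = plus(2)
print(add_two(3))   # 5
print(plus(2)(3))   # 5, the same as (plus 2) 3 in Haskell
```

The point is that plus(2) is a perfectly ordinary value you can pass around and call later, exactly like (plus 2) in Haskell.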

So why is that useful? Well, consider the builtin function map, which has the signature:

map :: (a -> b) -> [a] -> [b]

Basically, it takes a function which takes a thing of type a and returns a thing of type b, plus a list of things of type a. It then spits out a list of things of type b, generated by applying the function to every element of the list of things of type a.

So map (\b -> plus 2 b) [3, 1, 4, 1, 6] returns [5, 3, 6, 3, 8].

Now, remember that (plus 2) returns a function, right? So you could also do:

map (plus 2) [3, 1, 4, 1, 6]

and get the same result.
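The same thing works with Python's built-in map if you curry by hand (again, just an illustration; Python's map returns a lazy iterator, so we wrap it in list to see the values):

```python
# A hand-curried plus, as before.
def plus(a):
    def plus_a(b):
        return a + b
    return plus_a

# plus(2) is already a one-argument function, so it slots straight into map.
print(list(map(plus(2), [3, 1, 4, 1, 6])))  # [5, 3, 6, 3, 8]
```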

tl;dr: Calling a Haskell function with not enough arguments basically returns what you might think of as a half-called function. It's got some arguments already, you just need to supply the remaining ones. And it's just another function.

Approximating Pi

23 July 2012

You didn't get a post for July 22, so here's some Python which finds approximations for pi.

Update 20 Dec 2012: See also the revised version utilizing continued fractions.

Bash code for finding active IP addresses

10 May 2011

I sometimes have to ssh into my brother's laptop. This can be a painful procedure for all involved, because I need to know the IP address, but he can never remember the command to get his computer to tell him. I normally solve this by just pinging everything from 192.168.1.100 to 192.168.1.120 or so, and then attempting to ssh into any that respond, but the other day I got slightly fed up with this.

So without further ado, I give you:

for i in {100..120}
do ping -c 1 192.168.1.$i | \
    grep -B 1 ' 0% packet loss' | \
    sed 's/^.*\(192\.168\.1\....\).*$/\1/g;/192/!d'
done

This "simple" bash code just loops through the address range given on the first line, pings each address once, uses grep to find the ones that responded, and then uses sed to format the output nicely. I could probably do it all in sed, but hey. If you think of any improvements, message me and I'll add them.

Password Storage

8 May 2011

EDIT 2012-01-29: As it happens, the information in this article relating to the use of SHA2 as an appropriate password hashing algorithm was incorrect. I've replaced it with accurate information. The original text can be found in the footnotes. See Password Storage 2: Electric Boogaloo for more information.

So now that school is mostly over, I'm going to use this blog for what I originally intended--namely, talking about programming. This was originally going to be a devlog entry, but it mysteriously turned into an explanation of password storage instead. It also gives me the opportunity to make rude comments about Sony like all the cool bloggers, because they were very bad at password storage.

I was working on authentication for our project with Prof. O today, and while I was waiting for the development environment to load, I typed this out. Last night I got the passhash and salt fields set up, and did some general research. The fruit of my research is a hopefully relatively simple explanation of how password storage is done.

The obvious approach is to just store the username and password. You're done! Of course, if an attacker gets your database (cough Sony cough), and your users used the same password for their email accounts (doubtless stored nearby), a dismally common practice, then they're in a bit of trouble, aren't they?

The accepted solution to this is hashing the password before storing it. Hashing is done by a hash function, which takes an input and produces a corresponding, often shorter, output, called a hash. However, it's one-way; ideally, you shouldn't be able to get the input back given the output. MD5 and SHA1 are common hash functions used for integrity checking[1]; for password storage, bcrypt is an appropriate hash function.[2]

The other thing about a hash is that many different pieces of data can have the same hash; SHA-512, one of the largest hashes, is only 512 bits (one sixteenth of a kilobyte). However, the probability of two sensical values having the same hash is vanishingly small.[3]

Of course, attackers have got a way to combat this: rainbow tables, which are simply a massive list of all possible passwords matched to their hashes. Do a lookup of the hash on the table, and bang, you have the password (after a few minutes; these tables are absolutely massive, and take a while to search through). This is quite clearly not a good thing at all, so we do one more thing to protect the passwords--salting them. Basically, this means attaching a piece of random or pseudorandom data called a salt to the password before hashing it, and storing the salt along with the password hash. This does two things: firstly, it means that even if two passwords are the same (assuming they have different salts), the hashes will be different (meaning that if an attacker breaks one, the other is still safe), and secondly, it can magically make many rainbow tables completely useless, by making sure the password+salt combination is not likely to be on the table.

To authenticate, simply hash the password and salt together as you did to store it, and compare against the hash in the database. If they're the same, the password was correct. If not, it was wrong.
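As a sketch of the whole store-and-verify flow, here is some illustrative Python. It uses PBKDF2 from the standard library's hashlib instead of bcrypt (bcrypt isn't in the standard library), but the shape is the same: generate a random salt, hash password plus salt, store both, and re-hash to check.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    # Generate a random 16-byte salt if one wasn't supplied.
    if salt is None:
        salt = os.urandom(16)
    # PBKDF2-HMAC-SHA256 with a high iteration count to slow brute force;
    # bcrypt would play the same role here.
    digest = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 100_000)
    return salt, digest

def verify_password(password, salt, stored_digest):
    # Re-hash with the stored salt and compare in constant time.
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)

# Store salt and digest in the database, never the password itself.
salt, digest = hash_password('@b3L1nc0lnR0%')
print(verify_password('@b3L1nc0lnR0%', salt, digest))  # True
print(verify_password('wrong guess', salt, digest))    # False
```

Note that two users with the same password get different salts, and therefore different digests, which is exactly the property described above.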

The bcrypt hashing algorithm actually has built-in support for salts -- you have to pass in both a plaintext and a salt.[4]

Finally, it might simply be best to avoid attackers getting hold of your database in the first place. However, in the interest of mitigating damage and multiple layers of protection, good password storage is a must.[5]

  1. "used for integrity checking" added on 2012-01-29 to clarify the appropriate purpose of the algorithms. 

  2. Edited on 2012-01-29. Originally read "for cryptography purposes, I prefer SHA-512 (a form of overkill SHA2)." MD5, SHA1, SHA2, etc. are not appropriate for password storage because they are designed to be computed quickly, facilitating brute-force attacks. This makes them suitable for certain tasks, but password storage is not one of them. 

  3. The following text was deleted on 2012-01-29:

    Say your password is "@b3L1nc0lnR0%". The SHA-512 hash of this is:

    2f66 9619 ffc8 49a3 5049 a0f4 b050 1fa0 880f 05b4 13cf e494 c2e1 c941 3c0f 5e47 0fb8 81be 9d51 6571 5e27 c525 1076 e906 72e2 dd59 d615 c0c5 d9fc 6d6c d098 8feb

    Now, the other pieces of data that match that will probably not (and by "probably not" I mean "practically never") be appropriate passwords. They will probably be 200 pages of garbled bytes.

    It was deleted due to redundancy and the fact that use of SHA512 was misleading. 

  4. Paragraph added on 2012-01-29. 

  5. Deleted following paragraph on 2012-01-29 due to excessive smugness. It originally read: "And really, it's not that hard to implement. Most standard libraries make this dead easy." 

Mercurial current branch in your Bash prompt

10 April 2011

Edited 2011-06-17: Added Mac OS X support

(Make sure to read the whole thing; my first solution is slow and will cause pain and anguish for the user.)

So I thought it'd be useful to have the current branch of a local Mercurial repository in my bash prompt. This is easy enough, because we have hg branch. The result:

PS1='\u@\h:\w$(hg branch 2> /dev/null | sed "s/.\{1,\}/ [hg:\&]/")\$ '

Basically, it tries to grab the current branch, discards any error, and then sends the results off to sed. The sed script basically says "if there's at least one character (i.e. a repo was found), put it into the format [hg:branchname], otherwise return nothing".

The result:

jonathan@kippersnacks:Code/active/awesomeproject [hg:default]$

It works perfectly, except for one thing.

It's a bit slow. As in perceptibly slow. As in irksomely slow.

Mercurial is not the speediest of version control systems, and this command is no exception. While the delay is acceptable for commits, pushes, and pulls, it is not when attempting to display a bash prompt. One may choose to blame this on Python, but that's sort of irrelevant.

Well, it turns out that the name of the current branch is stored in reporoot/.hg/branch. This is terribly convenient, as we can just cat it—oh wait. We also have to recurse up through the directories to find the repo root. I'm sure this is quite possible in bash; it's just that it would be rather unwieldy, and possibly evilslow. And I'm not a bash wizard.

Git is quite fast, and has a similar tool for this very purpose—written in C of course. So I took a few minutes to dash off a C program which does the job in about 50 lines. You can find it here: hg-ps1.c

Compile it with gcc -o fasthgbranch hg-ps1.c, put it in your $PATH, and use something like

PS1='\u@\h:\w$(fasthgbranch)\$ '

as your bash prompt in ~/.bashrc (or ~/.bash_profile if you're on OS X).

Lightning fast.

It's basically public domain, and you can do whatever you want with it, so enjoy. If you republish or redistribute it, it'd be nice if you credited me, but I don't mind too much if you don't for some reason. If you take credit for it yourself you are an evil scumbag plagiarist but there really isn't much I can do about that.