Archive for November 2009

The week that was

Related Posts:

url shorteners

Shouldn’t a URL ‘shortener’ service make the URL shorter? Perhaps even a Google search term would have been appropriate, but then again that might have used up enough energy to boil a kettle*.

* I’ll have to explain this one later.


ozchi 2009

I’m heading down to Melbourne tomorrow for the 2009 OZCHI conference with Frank Maguire, Bert Bongers and Dan Hill, to attend and present (with Frank) some research work we’ve done recently. I’m looking forward to it. One of the add-ons for this conference is a workshop on Street Computing organised by the ubiquitous Marcus Foth. Dan and Andrew will be presenting some of their research, as will Bert – so it’s shaping up to be a really great day tomorrow. To anyone who’s going to be at the workshop, I look forward to meeting you; the same goes for anyone else floating around the conference… Catch you later in the week, hope this bizarre Sydney weather doesn’t get the better of you!

J


The week that was

  • I have to say that $1.30 for 9 tissues is a bad deal. I don't care how many ply, it's just not on. lesson learnt!
  • [Via ZeFrank] yet again – a stop motion piece that re-invents the medium http://bit.ly/1nWyV8
  • "check yourself before you… Trek yourself" http://bit.ly/4bIUzS Dan Meth remakes the star trek prequel in the 90's
  • i just noticed that bit.ly/autumnflow by lior (2005) and bit.ly/maninashed by nick drake (1968) are almost identical guitar-wise
  • something for everyone.. go check out heyoscarwilde http://bit.ly/43GlLG, this one is my current fave – Dr Seuss
  • neighbourly renovation work is one loud pain in the A
  • Congrats to Anthony Burke, new UTS head of school! Well deserved, sir!
  • Christoph Niemann on Bio-Diversity http://bit.ly/236whH just lovely, always the right balance, whimsical reality [nytimes, via kottke]
  • This is just fascinating – the Gervais Principle, Or The Office According to “The Office” http://bit.ly/1qJWCB [via kottke]


you should use grep

You should use GREP.

If you haven’t heard of grep before, it’s a very fancy pattern search tool for text editors. Think of the ‘find’ function but with more advanced capabilities, such as searches for whitespace characters, end of line/file or particular character patterns.

I’ve become accustomed to some of the more trivial but useful tricks available using grep, some of which might be handy to you, dear reader.

Grep comes built in with a series of useful wildcards: character combinations which refer to specific text items. Examples of these include $ (end of line), [aeiou] (any vowel), \r (carriage return) and \s (whitespace character), which can be combined in clever ways to make text searching much easier.

To put all of this into context, I’ll give an example. I have recently been doing a bit of manual HTML scraping to isolate data contained in HTML tags from a number of pages. For this example, what I want to do is parse a series of HTML documents to extract the title text contained in each document.

Say your example HTML reads like this;
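As a stand-in, here is a minimal sketch of such a page with the wildcards tried against it. It uses Python's re module rather than TextWrangler, so treat the syntax as an approximation of what the editor accepts. The sample page, its title tag and the title text are all hypothetical, made up purely for illustration.

```python
import re

# Hypothetical page: the <title> tag and its text are assumptions for illustration.
SAMPLE_HTML = (
    "<html>\r\n"
    "<head><title>Example page title</title></head>\r\n"
    "<body><p>Other content</p></body>\r\n"
    "</html>"
)

# [aeiou] matches any single vowel.
vowels = re.findall(r"[aeiou]", "grep")

# \s matches one whitespace character; \s+ matches a run of them.
words = re.split(r"\s+", "one  two\tthree")

# $ matches at the end of each line when multi-line mode is switched on.
line_ends = re.findall(r"\w+$", "first line\nsecond row", flags=re.MULTILINE)

# \r matches a carriage return; the sample page has one at the end of three lines.
carriage_returns = len(re.findall(r"\r", SAMPLE_HTML))

# [\s\S] matches any character at all, including line breaks.
title_match = re.search(r"<title>([\s\S]*?)</title>", SAMPLE_HTML)
title_text = title_match.group(1) if title_match else ""

print(vowels, words, line_ends, carriage_returns, title_text)
```

The last pattern pulls the title out directly, which is the shortcut a scripting language allows. The steps that follow get to the same place using find and replace only.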


To do this you could manually open each and every file, select the text to delete, delete it and then save/close the file – a very time-consuming process. There must be a better way, surely! Thankfully there is – Grep. My grep tool of choice is the very handy freeware application TextWrangler, which combines very powerful grep pattern searching with an absolutely essential multi-file search option.

So how can we do this quickly and painlessly?

1. Open a few of the documents and check that the syntax and layout are all similar. In this case let’s assume that all files follow the aforementioned layout.
2. Select ‘find’ (command+F), making sure to choose the ‘use grep’ and ‘start from top’ options.
3. Type in the following search pattern;
Picture 2
(select from start of file, [any space or nonspace character], zero, one or more characters, until the opening tag is found).
4. This will select all text from the start of the file to the end of your opening tag. Test this as a multi-file search on all open documents to make sure your search was accurate.
5. Do a find + replace with the same search pattern, making sure to replace with nothing (blank replace field).
6. Next is a similar search pattern, but now we’re looking for everything from the closing tag to the end of file. The search pattern is;
Picture 3
Which will select all text that is not the actual title text.
7. Do the same find and replace function for all open HTML documents, however this time it will be useful to replace all text following the title with a single comma ‘,’. This will come in handy for automatically building lists.

So now we have a lot of text documents with only the exact text we’re looking for in them, but we’re still faced with the problem of scale. We may be able to open each file and copy/paste the contents into another, but what if we needed to do this 1,000 times over? Or 10,000 times? Surely there’s a better way to do this?

Happily there is.
Another built-in feature of TextWrangler is the Edit – Insert – File Contents menu item. Once your HTML text files are stripped of all extraneous content and ‘comma-delimited’, the next step is to combine them.

9. Open one document, select Edit – Insert – File Contents, then select all of the remaining documents you wish to combine into one.
10. Once the text files are combined, save the result as a new document.

Done!

Naturally this also applies to other text components in other types of HTML tags. If, for example, you had another set of tags which were nested – say if you wanted to select all items in one tag but not the items in another – you could follow the same process. I recently used this technique to strip out all text that wasn’t usable from the cityrail.info timetable pages – something which could have taken years had it not been for the abilities of TextWrangler + Grep. Highly recommended. Click here to read more on TextWrangler’s features.

J

(Update: I ended up pasting in images of the search patterns because the “<” character doesn’t seem to show up on the page – so copy+pasting the text to try yourself would fail, somewhat missing the point of this post. You will need to copy them out manually, which is also missing the point of this post, but it’s not a whole lot of text to copy so it isn’t too much of a problem. Shame I couldn’t resolve the HTML parsing, not sure why.)
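For anyone who would rather script it, the find-and-replace steps above can be sketched in Python. This is only a rough equivalent under stated assumptions: the two patterns are my guess at what the pattern images contained, the tag is assumed to be title, and the sample pages are invented. It also works on strings in memory rather than on files on disk.

```python
import re

# Assumed patterns: the originals were posted as images, so these are a guess.
# Everything from the start of the text up to and including the opening tag.
BEFORE_TITLE = re.compile(r"\A[\s\S]*?<title>")
# Everything from the closing tag through to the end of the text.
AFTER_TITLE = re.compile(r"</title>[\s\S]*\Z")


def strip_to_title(html: str) -> str:
    """Keep only the title text, followed by a single comma."""
    without_head = BEFORE_TITLE.sub("", html, count=1)  # replace with nothing
    return AFTER_TITLE.sub(",", without_head, count=1)  # replace with a comma


def combine(pages: list[str]) -> str:
    """Join the stripped pages into one comma-delimited string."""
    return "".join(strip_to_title(page) for page in pages)


# Hypothetical pages standing in for the real HTML files.
pages = [
    "<html><head><title>First page</title></head><body>one</body></html>",
    "<html><head><title>Second page</title></head><body>two</body></html>",
]
combined = combine(pages)
print(combined)  # First page,Second page,
```

A page with no title tag would pass through unchanged here, so the advice in step 1 still applies: check that the files share a layout first.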


analytical graphics

I’m really enjoying the work in Michæl.Paukner’s Flickr photostream, which includes some of the most fantastical analytical graphics, illustrating scientific/theoretical concepts. Most of his work is just gorgeous, bringing clarity and simplicity to what could otherwise be convoluted diagrams. Some of my favourites are the solar eclipse and the circular periodic table of elements;

Circular Periodic Table of Elements

He’s not shy of dealing with less-rigorous concepts either, such as the hollow world theory or the ancient Hebrew concept of cosmology. Definitely worth a look-see;

Hollow Earth
The Hundredth Monkey Effect

One of the reasons I’m enjoying this work is the dedication to taking complex ideas and presenting them in ways that capture an audience and convey an idea or message in a really simple, elegant way. It’s something I’ve been dealing with a lot in my work, somewhat as a side effect of observing people through the ‘eyes’ of buildings: the representation of data or information to the people who have a part in creating it. It’s a complex challenge and I often look to graphic designers for ideas on this very subject. Nice to see someone putting it all together in such a clean manner – not driven purely by data or image, rather a balance between the two. Message and medium, not an easy gap to bridge.

via kitsunenoir

J
