
Muffinlabs!

Code

I enjoy reading about data visualization and processing on blrpnt.com. Today there's a post about a big data release of global temperature readings, in response to the current (and totally absurd) hockey-stick debate. Anyway, I whipped up a Perl script to turn the very weird text files into somewhat more useful SQL data. It walks a directory of station files and prints INSERT statements for two tables, stations and readings. Here's the script, and the data is linked below.

#! /usr/bin/perl -w
    eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
        if 0; #$running_under_some_shell

use strict;
use File::Find ();
use File::Basename;

# Set the variable $File::Find::dont_use_nlink if you're using AFS,
# since AFS cheats.

# for the convenience of &wanted calls, including -eval statements:
use vars qw/*name *dir *prune/;
*name   = *File::Find::name;
*dir    = *File::Find::dir;
*prune  = *File::Find::prune;

my %columns = (
        "Number" => "id",
        "Name" => "name",
        "Country" => "country",
        "Lat" => "lat",
        "Long" => "lon",
        "Height" => "height",
        "Start year" => "start_year",
        "End year" => "end_year",
        "First Good year" => "first_good_year",
        "Source ID" => "source_id",
        "Source file" => "source_file",
        "Jones data to" => "jones_data_to",
        "Normals source" => "normals_source",
        "Normals source start year" =>"normals_start",
        "Normals source end year" => "normals_end",
        "Normals source variable code" => "normals_src_variable",
        "Normals source percent availability" => "normals_src_pct_available",
        "Normals" => "normals",
        "Standard deviations source" => "std_dev_source",
        "Standard deviations source start year" => "std_dev_start",
        "Standard deviations source end year" => "std_dev_end",
        "Standard deviations" => "std_dev"
);




sub wanted;

# Traverse desired filesystems
File::Find::find({wanted => \&wanted}, '/opt/home/colin/Downloads/yipes/');
exit;


sub wanted {
    my ($dev,$ino,$mode,$nlink,$uid,$gid);

    (($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
    -f _
        && /\d\d\d\d\d\d/
        && parse_file($name);
}

sub parse_file {
        my ( $name ) = @_;
        my $id = basename($name);

        my %attr;


        open(DATA, "< $name") || die("could not open $name -- $!");
        while(<DATA>) {
                chomp;

                $attr{'id'} = $id;

                if (/=/) {
                        # split on the first "=" only, so values containing "=" stay intact
                        my ($key, $value) = split(/=/, $_, 2);
                        $key = trim($key);

                        if ( exists $columns{$key} ) {
                                $attr{$columns{$key}} = trim($value);
                                #print $columns{$key} . " -> " . $attr{$columns{$key}} . "\n";
                        }
                        else {
                                #print "$key\n";
                        }
                }
                else {
                        # data row: a year followed by twelve monthly readings, e.g.
                        #1943  -2.9  -3.8  -0.1   1.8   3.7   8.4  10.5   6.7   6.9   2.7   0.8   1.6

                        my @temps = split;
                        my $year = shift @temps;

                        my $month = 1;
                        foreach my $tmp (@temps) {
                                #print "$year - $month - $tmp\n";
                                # first day of the month, with a zero-padded month number
                                my $tmpdate = sprintf("%s-%02d-01", $year, $month);
                                print "INSERT INTO readings SET station_id = '$id', theyear = $year, themonth = $month, thedate = '$tmpdate', val = $tmp;\n";
                                $month++;
                        }
                }
        }

        # Build the station row, escaping backslashes and single quotes for MySQL
        my @tmpsql;
        foreach my $key (keys %attr) {
                (my $val = $attr{$key}) =~ s/(['\\])/\\$1/g;
                push @tmpsql, "$key='$val'";
        }

        print "INSERT INTO stations SET " . join(", ", @tmpsql) . ";\n";

        close DATA;
}

sub trim($)
{
        my $string = shift;
        $string =~ s/^\s+//;
        $string =~ s/\s+$//;
        return $string;
}
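
To see what the data-row branch of parse_file produces without running the whole script, here's a rough shell/awk sketch of the same transformation. The station id 123456 and the function name row_to_sql are made up for illustration; the table and column names are the ones the Perl script prints.

```shell
#!/bin/bash
# Sketch only: mimics the data-row branch of parse_file.
# row_to_sql and the station id used below are hypothetical.

# $1 = station id; rows of "year t1 t2 ... t12" are read from stdin.
row_to_sql() {
  awk -v id="$1" '{
    # field 1 is the year, fields 2..NF are the monthly readings
    for (m = 1; m < NF; m++)
      printf "INSERT INTO readings SET station_id = \047%s\047, theyear = %d, themonth = %d, thedate = \047%04d-%02d-01\047, val = %s;\n", id, $1, m, $1, m, $(m + 1)
  }'
}

# One sample row, taken from the comment in the Perl script
printf '%s\n' '1943  -2.9  -3.8  -0.1   1.8   3.7   8.4  10.5   6.7   6.9   2.7   0.8   1.6' \
  | row_to_sql 123456
```

That prints one INSERT per month, twelve for the sample row, with thedate set to the first day of each month.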

Everyone loves Robocop right? He's a police officer and a robot, it's the greatest pairing since PB+J. Plus he has an affection for Unicorns.

Robocop, Unicorn

We can learn a lot from Robocop. I mean, just check out the Wikipedia entry:

Set in a crime-ridden Detroit, Michigan in the near future, RoboCop centers on a police officer who is murdered brutally and subsequently re-created as a super-human cyborg known as "RoboCop". RoboCop includes larger themes regarding the media, gentrification and human nature in addition to being an action film.

Having watched the movie 100 times during my teen years, I can assert that it definitely has all those larger themes covered. I eagerly await the Criterion Collection DVD. Robocop spawned sequels, a stage adaptation, and also a musical.

One of the best catchphrases from Robocop is "I'd buy that for a dollar". From Wikipedia:

A joke among people who know RoboCop is a popular, but inane TV show with the catchphrase "I'd buy that for a dollar!", which people in the film's future universe find humorous. The star is the goofy Bixby Snyder (S.D. Nemeth). Neither the name of the show nor the character are ever revealed in the movie, although girls are heard to greet him with "Bixby!" and "Happy birthday Dave!" On the DVD commentary, Edward Neumeier comments that somehow the explanation & history of this television show was never included in the script. A deleted scene from the DVD finally reveals the show's name to be It's Not My Problem!, which is also a reference to one of the film's major themes of greed and personal satisfaction.

Allow me to introduce Bixby Snyder to Twitter. Thanks to a nifty Ruby gem called Twibot, anyone who mentions Robocop in a tweet will get a taste of nostalgia in return. Here's the source code if someone wants to play:

#!/usr/bin/ruby
require 'rubygems'
require 'twibot'

# Search for English-language tweets mentioning Robocop and reply to each one
search "robocop", :lang => "en" do |message, params|
  tmp = client.status :get, message.id
  post_reply tmp, "I'd buy that for a dollar!"
end

You learn two things while writing a Robocop-based Twitter bot:

1 - People talk about Robocop a lot more than you think on Twitter. And I don't mean the Kanye West song.
2 - Writing bots is really fun.

I haven't had much time to really focus on anything, but I have spent more time playing with Processing. My latest sketch involves repeating patterns of Bezier curves in interesting fashions. I'm trying to make something along the lines of Truchet tiles, but I'm a long way from that.

Bezier Tiles

#!/usr/bin/env ruby

#
# I'm lazy and want to hold onto this code snippet but don't feel like putting it in subversion
#

require 'rubygems'
require 'rbus'

destination = "org.freedesktop.Notifications"
object_path = "/org/freedesktop/Notifications"

member = "Notify"

bus = RBus.session_bus
object = bus.get_object(destination, object_path)

object.Notify("string:R-Bus",
        "uint32:0",
        "string:info",
        "string:R-Bus Notification",
        "string:A test example to see that everything works.",
        "array:string:",
        "dict:string:variant:",
        "int32:-1")

Bash Scripting FTW

17 Apr 2009

I'm feeling nerdish today... Here's a script to run mysqldump for a few tables on a remote server and load them into your local MySQL db on the fly:

#!/bin/bash
DB=$1
shift
TABLES=$*

ssh -C user@remoteserver \
"mysqldump -u username --password='pw' --skip-lock-tables $DB $TABLES > /tmp/load.sql && gzip -c /tmp/load.sql" \
 > /tmp/load.sql.gz \
&& gunzip -c /tmp/load.sql.gz | mysql -u user $DB

Here's what it does:
ssh -C: use compression when transmitting data over ssh. I'm not sure whether this actually speeds things up, and since the dump is already gzipped before it goes down the pipe, it probably adds little.

mysqldump -u username --password='pw' --skip-lock-tables $DB $TABLES > /tmp/load.sql && gzip -c /tmp/load.sql

$DB is the first parameter specified on the command line. $TABLES is everything else: either a list of the tables you want, or nothing at all to dump the whole db. The output is redirected into /tmp/load.sql on the remote server. You could actually dump it to STDOUT and save a step, but putting it into a file first removes MySQL from the equation. If you have a lot of data, MySQL will otherwise block while it's being sent to your computer. A smarter person than I probably has another way of taking care of this problem.

gzip -c /tmp/load.sql compresses the file and dumps it to STDOUT, sending it down the pipe to your computer.

Meanwhile, back on your local computer...
The output is stored in /tmp/load.sql.gz. This is also technically not needed; you could be clever and pipe it straight into MySQL, but I find it handy to keep a copy of the SQL script around in case I want to run it more than once. gunzip -c /tmp/load.sql.gz decompresses the SQL file and prints it to STDOUT.

mysql -u user $DB loads it into the local MySQL database named '$DB'. And that's it!
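
For comparison, here's a minimal sketch of the "save a step" variant mentioned in the walk-through: the dump is piped through gzip on the remote side and straight into mysql locally, with no /tmp files on either end. The function names remote_dump_cmd and stream_load are made up for this sketch, and the host, usernames and password are the same placeholders as in the script above.

```shell
#!/bin/bash
# Sketch only: streaming version of the dump-and-load script.
# REMOTE, the usernames and 'pw' are placeholders; the function names
# are hypothetical.

REMOTE="user@remoteserver"

# Build the command line that runs on the remote host.
# $1 is the database name; any remaining arguments are table names.
remote_dump_cmd() {
  dump_db=$1
  shift
  printf '%s\n' "mysqldump -u username --password='pw' --skip-lock-tables $dump_db $* | gzip -c"
}

# Run the dump remotely and feed the decompressed SQL to the local database.
stream_load() {
  ssh -C "$REMOTE" "$(remote_dump_cmd "$@")" | gunzip -c | mysql -u user "$1"
}

# Only do anything when arguments are given, e.g.:
#   ./stream-load.sh mydb table1 table2
if [ "$#" -gt 0 ]; then
  stream_load "$@"
fi
```

The trade-off is the one described above: mysqldump now stays connected to MySQL for the whole transfer, and there's no local copy of the SQL to re-run. If the tables are InnoDB, mysqldump's --single-transaction option may be worth a look for the blocking problem, though I haven't tested it with this script.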

Code on its own page here.
